This repository has been archived by the owner on Sep 21, 2023. It is now read-only.

Suggestion: Post 2 versions of the OCR (with and without line breaks) #36

Open
DeflateAwning opened this issue Apr 26, 2021 · 3 comments

Comments

@DeflateAwning

I spend a fair bit of time removing line breaks from this bot's output when doing transcriptions. If the bot posted two versions of the transcription every time (one with line breaks and one without), the transcriber could pick whichever version works better for that post.

An alternative would be to add a command that makes the bot reply with a copy of the OCR without line breaks.

@Pf441
Contributor

Pf441 commented Apr 28, 2021

It could be interesting to try, but I'm a bit concerned about the doubled workload for the bot, especially for very big posts.

(From my personal point of view, I also fear that people would take the "no line breaks" transcribed text and copy-paste it directly, without proofreading it.)

@DeflateAwning
Author

DeflateAwning commented Apr 28, 2021

I'm personally not convinced that keeping the process less efficient will lead to higher quality, but I hear what you're saying.

As far as bot workload goes, this wouldn't double it. Making posts longer should not significantly increase the load on the bot. I assume most of the load is overhead like monitoring all the posts and performing lookups and logic, not transferring text from the bot to Reddit.

@Pf441
Contributor

Pf441 commented Apr 30, 2021

Reading the text given by the bot and removing line breaks would require regexes, and I'm not sure whether the bot can edit the transcribed text before sending it to Reddit.
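For illustration, here is a minimal sketch of what the regex step could look like in Python. It is not taken from the bot's code; the function name `remove_line_breaks` is hypothetical, and it assumes the OCR output is a plain string in which blank lines separate paragraphs.

```python
import re


def remove_line_breaks(text: str) -> str:
    """Join the lines inside each paragraph with a single space.

    Assumption: a blank line marks a paragraph boundary and should be kept.
    """
    # Split on blank lines (a newline, optional whitespace, another newline).
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside each paragraph, replace every line break with one space.
    joined = [re.sub(r"\s*\n\s*", " ", p).strip() for p in paragraphs]
    # Put the paragraphs back together, skipping any that ended up empty.
    return "\n\n".join(p for p in joined if p)
```

A call such as `remove_line_breaks(ocr_text)` would run on the OCR string before the reply is posted. Words hyphenated across a line break are not rejoined by this sketch.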

It also depends on whether the bot sends the "with line breaks" and "without line breaks" sections in the same message or as two separate replies (the latter being the major source of increased workload, in my eyes).
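The single-message versus two-replies question could be sketched roughly as below. This is only an illustration under stated assumptions, not the bot's actual logic: `build_replies` and `flatten` are hypothetical names, the section labels are made up, and the 10,000-character comment limit is an assumption that should be checked against Reddit's current rules.

```python
from typing import List

# Assumed maximum comment length; verify before relying on it.
MAX_COMMENT_LENGTH = 10_000


def flatten(text: str) -> str:
    """Collapse all whitespace, including line breaks, into single spaces."""
    return " ".join(text.split())


def build_replies(ocr_text: str, limit: int = MAX_COMMENT_LENGTH) -> List[str]:
    """Return one combined reply if it fits within the limit, else two replies."""
    with_breaks = "With line breaks:\n\n" + ocr_text.strip()
    without_breaks = "Without line breaks:\n\n" + flatten(ocr_text)
    combined = with_breaks + "\n\n---\n\n" + without_breaks
    if len(combined) <= limit:
        # Both versions fit in one comment, so only one reply is posted.
        return [combined]
    # Too long for a single comment: fall back to two separate replies.
    return [with_breaks, without_breaks]
```

With this approach the second reply is only needed for very long posts, so most transcriptions would still cost the bot a single comment. Note that `flatten` here also removes paragraph breaks.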
