A document with 50k words takes about a day to translate on a 16-year-old machine. For the past day, I’ve been working on a doc double that size. There is no progress indicator and the target output file is still zero bytes, so I have no idea how far along the processing is. If I lose power or need to shut down the machine, the past day of work is lost. Most people have newer hardware, but given how much computation is needed, anyone can be trapped in a task that takes days if there is enough text to translate.
Workaround: we can split the original doc into manageable pieces. But this is labor-intensive manual work because we cannot just chop the doc at arbitrary locations: no piece may break a paragraph. The pieces can also become so numerous that they are hard to manage.
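For illustration, the splitting step of this workaround could be scripted rather than done by hand. The sketch below is only a rough example: it assumes paragraphs are separated by blank lines, and the function name `split_on_paragraphs` and the `max_words` limit are made up here, not part of any library.

```python
import re


def split_on_paragraphs(text: str, max_words: int = 2000) -> list[str]:
    """Group whole paragraphs into chunks of roughly max_words words each.

    Assumes paragraphs are separated by one or more blank lines.
    A single paragraph longer than max_words becomes its own chunk.
    """
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current, count = [], [], 0
    for p in paragraphs:
        n = len(p.split())
        # Close the current chunk before it would exceed the limit.
        if current and count + n > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(p)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Each chunk ends on a paragraph boundary, so the pieces can be translated one at a time and concatenated afterwards.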
So I suggest these improvements:
A progress indicator
A way to send a signal to stop processing gracefully and preserve the work completed so far
A way to resume processing on a giant document later
A way to see and make use of partial translations. E.g. on day 5 of a translation process, a user should be able to access output generated 2 days into the translation.
You definitely shouldn't be translating whole documents at once; that would take a long time even on a beast of a machine.
This could be considered a "low-level" library, and the features you are asking for don't really make sense for what it is.
Everything you are asking for is a fairly easy task to accomplish, but I don't think it really suits the library. Maybe I'm wrong though, I just found it an hour ago.
If I were you, I'd translate it line by line, or sentence by sentence, like you mentioned. This is easily automated, even by a beginner coder. Or are you from a non-coding background?
Read the document stream until you reach a period.
Translate the sentence, save to your output stream.
Save the position in a secondary file, so if you need to start later, you can open the file and begin where you left off.
Update your progress after each sentence by comparing your current position to the length of the stream.
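The steps above could be sketched roughly like this in Python. This is not the library's API: `translate` stands for whatever function actually translates one sentence, and `translate_resumable`, the file names, and the JSON checkpoint layout are all made up for illustration. It also assumes the document fits in memory, uses a naive split on `.`, `!` or `?`, tracks progress by sentence count rather than stream position, and writes one translated sentence per line, so paragraph breaks are not preserved.

```python
import json
import os
import re
from typing import Callable

# Naive splitter: break on whitespace that follows '.', '!' or '?'.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def translate_resumable(src_path: str, out_path: str, state_path: str,
                        translate: Callable[[str], str]) -> None:
    """Translate src_path sentence by sentence, resuming from state_path."""
    with open(src_path, encoding="utf-8") as f:
        sentences = [s for s in SENTENCE_END.split(f.read()) if s.strip()]

    # Checkpoint: sentences finished and output bytes written so far.
    done, out_bytes = 0, 0
    if os.path.exists(state_path) and os.path.exists(out_path):
        with open(state_path, encoding="utf-8") as f:
            state = json.load(f)
        done, out_bytes = state["done"], state["out_bytes"]

    mode = "r+b" if os.path.exists(out_path) else "w+b"
    with open(out_path, mode) as out:
        # Drop anything written after the last checkpoint, then continue.
        out.truncate(out_bytes)
        out.seek(out_bytes)
        for i in range(done, len(sentences)):
            out.write((translate(sentences[i]) + "\n").encode("utf-8"))
            out.flush()
            os.fsync(out.fileno())
            # Write the checkpoint to a temp file, then swap it into place.
            tmp = state_path + ".tmp"
            with open(tmp, "w", encoding="utf-8") as f:
                json.dump({"done": i + 1, "out_bytes": out.tell()}, f)
            os.replace(tmp, state_path)
            print(f"\r{i + 1}/{len(sentences)} sentences", end="", flush=True)
    print()
```

If the run is interrupted, calling the function again with the same three paths picks up at the first unfinished sentence, and the output file can be opened at any time to read what has been translated so far.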
As for viewing past translations, if you are doing it in chunks you can just open the file to view it.
If you need help automating it, let me know, and we can talk about it.