When a long document is edited by a user in the frontend, all sentences of the document are re-translated after each edit.
For sentence-level models, we should re-translate only the sentences which have been changed.
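As a sketch of the frontend-side fix, changed sentences can be detected by diffing the old and new sentence lists, so that insertions do not force re-translation of everything after them (function name and use of `difflib` are my assumption, not part of this proposal):

```python
import difflib

def sentences_to_retranslate(old, new):
    """Indices of sentences in `new` that do not appear unchanged in
    `old`; only these need to be sent to the sentence-level model.
    Uses a sequence diff so an inserted sentence does not shift-invalidate
    everything that follows it."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    unchanged = set()
    for block in matcher.get_matching_blocks():
        # block.b..block.b+size are positions in `new` matched verbatim in `old`
        unchanged.update(range(block.b, block.b + block.size))
    return [i for i in range(len(new)) if i not in unchanged]
```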
Perhaps even a better solution would be to add a cache of recently translated sentences into the API server, so it can be reused by various frontends.
For a given translation direction and model, the cache should include:
- The last N sentences requested for translation (whether translated by the backend or retrieved from the cache).
- The M most frequently translated sentences. This list can be built from actual usage over a longer time period, or from a monolingual corpus, and does not need to be updated frequently.
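The two tiers above could be combined roughly like this (a minimal sketch; class and parameter names are hypothetical, and a production cache would also need thread safety and size limits in bytes):

```python
from collections import OrderedDict

class TranslationCache:
    """Two-tier cache: a static tier of M frequently translated
    sentences plus an LRU tier of the last N requested sentences.
    Keys are (direction, model, sentence) tuples."""

    def __init__(self, max_recent=10000, frequent=None):
        self.recent = OrderedDict()           # LRU tier (insertion order = recency)
        self.max_recent = max_recent
        self.frequent = dict(frequent or {})  # static tier, rarely updated

    def get(self, direction, model, sentence):
        key = (direction, model, sentence)
        if key in self.frequent:
            return self.frequent[key]
        if key in self.recent:
            self.recent.move_to_end(key)      # mark as most recently used
            return self.recent[key]
        return None                           # miss: caller queries the backend

    def put(self, direction, model, sentence, translation):
        key = (direction, model, sentence)
        self.recent[key] = translation
        self.recent.move_to_end(key)
        if len(self.recent) > self.max_recent:
            self.recent.popitem(last=False)   # evict least recently used
```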
For document-level models, we could have a similar cache of the (possibly multi-sentence) sequences which are sent for translation to the backend. In other words, the cache could be integrated into the load balancer, so it does not need to distinguish how many sentences are in a sequence. We just should not introduce the same bug as DeepL, which uses a doc-level model (for en-cs) but appears to cache at the sentence level.
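Integrated at the load-balancer level, the cache could key on the whole request payload, making it agnostic to sentence count (a sketch under my own assumptions; class name, round-robin dispatch, and hashing scheme are illustrative, not a spec):

```python
import hashlib
from collections import OrderedDict

def sequence_key(direction, model, text):
    """Key on the full request text, so the cache does not care how many
    sentences the sequence contains."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return (direction, model, digest)

class SequenceCachingBalancer:
    """Hypothetical load-balancer wrapper: checks the sequence cache
    before dispatching a request to one of the translation backends."""

    def __init__(self, backends, max_entries=10000):
        self.backends = backends      # callables: text -> translation
        self.cache = OrderedDict()    # LRU over whole sequences
        self.max_entries = max_entries
        self._next = 0

    def translate(self, direction, model, text):
        key = sequence_key(direction, model, text)
        if key in self.cache:
            self.cache.move_to_end(key)          # cache hit, no backend call
            return self.cache[key]
        backend = self.backends[self._next % len(self.backends)]
        self._next += 1                          # simple round-robin dispatch
        result = backend(text)
        self.cache[key] = result
        if len(self.cache) > self.max_entries:
            self.cache.popitem(last=False)       # evict least recently used
        return result
```

Because the key covers the whole sequence, a doc-level model never sees a stale sentence-level hit, which is exactly the DeepL-style bug to avoid.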