Create Swedish to all other languages translation process #79
Comments
Hi @andrewtavis, my name is Olushola Ogunkelu and I'm a new GSoC contributor. I went through this issue and would like to work on it. A bit about my background: I have experience contributing to open source and working with Python, but this is my first time working with machine translation, so I'd appreciate some help getting started. Also, do I need to understand Swedish to work on this?
Hey @Shorla 👋 We'll be merging in an issue in a few days that will help you work on this. You'll be able to follow the code for the English translations that we have a PR for, and I'm sure that @henrikth93 would be willing to help us check the quality of some of the translations afterwards! I'll be in touch when you can start working on this, but for now I'll assign you 😊
Thank you! I can't wait.
Hey @Shorla 👋 The process has been set up and we're ready to implement here :) It's actually quite streamlined now. If you make a version of scribe_data/extract_transform/languages/English/translations/translate_words.py that replaces
Thank you! I will get right to it.
Description
The goal of this issue is to create a process whereby a single file translates all words within Swedish/translations/words_to_translate.json into all other Scribe languages. To achieve this we'll be using m2m100_418M, with the output being a JSON file in which each source string is a key whose value holds its translations keyed by language. This can then be transferred to an SQLite database table in which each string is the index of a row and each language has its own column.
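As a rough illustration of the flow described above, here is a minimal sketch, not the project's actual implementation. It assumes words_to_translate.json holds a plain list of Swedish strings, that the Hugging Face `transformers` checkpoint `facebook/m2m100_418M` is the model meant by m2m100_418M, and that the target codes in `TARGET_LANGS`, the table name `translations`, and all function names are hypothetical placeholders.

```python
import json
import sqlite3
from typing import Callable, Dict, Iterable, List

# Hypothetical target list; the real set of Scribe languages may differ.
TARGET_LANGS = ["en", "de", "fr"]

Translations = Dict[str, Dict[str, str]]


def make_m2m100_translator(src_lang: str = "sv") -> Callable[[str, str], str]:
    """Return translate(word, target_lang) backed by facebook/m2m100_418M.

    Needs transformers, sentencepiece and torch; the model is downloaded
    the first time this function is called.
    """
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
    tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
    tokenizer.src_lang = src_lang

    def translate(word: str, target_lang: str) -> str:
        encoded = tokenizer(word, return_tensors="pt")
        # Forcing the first generated token selects the output language.
        generated = model.generate(
            **encoded, forced_bos_token_id=tokenizer.get_lang_id(target_lang)
        )
        return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

    return translate


def translate_words(
    words: Iterable[str], targets: List[str], translate: Callable[[str, str], str]
) -> Translations:
    """Build {source_word: {language_code: translation}}."""
    return {w: {lang: translate(w, lang) for lang in targets} for w in words}


def save_json(translations: Translations, path: str) -> None:
    """Write the keyed translations, keeping non-ASCII characters readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(translations, f, ensure_ascii=False, indent=2)


def write_to_sqlite(
    translations: Translations,
    targets: List[str],
    conn: sqlite3.Connection,
    table: str = "translations",
) -> None:
    """One row per source word, one TEXT column per target language."""
    columns = ", ".join(f'"{lang}" TEXT' for lang in targets)
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table}" (word TEXT PRIMARY KEY, {columns})'
    )
    placeholders = ", ".join("?" for _ in range(len(targets) + 1))
    rows = [
        (word, *(by_lang.get(lang) for lang in targets))
        for word, by_lang in translations.items()
    ]
    conn.executemany(f'INSERT OR REPLACE INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
```

Passing the translator in as a function keeps the JSON and SQLite steps usable without loading the model; in a real run one would call `translate_words(words, TARGET_LANGS, make_m2m100_translator())` and hand the result to `save_json` and `write_to_sqlite`.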
It's especially important to get a metric for the accuracy of each translation and to apply a cutoff so that low-quality translations are no longer included in Scribe applications :)
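The issue does not say which accuracy metric to use, so the following is only one possible proxy, sketched under stated assumptions: translate each word to the target language and back to Swedish, then keep the translation only if the round trip lands close to the original. The `translate(text, src_lang, tgt_lang)` callable, the function names, and the 0.8 threshold are all hypothetical, and string similarity on a round trip is a crude signal rather than a real quality score.

```python
from difflib import SequenceMatcher
from typing import Callable, Optional, Tuple

# Hypothetical translator signature: translate(text, src_lang, tgt_lang) -> str
Translate = Callable[[str, str, str], str]


def round_trip_score(
    word: str, target_lang: str, translate: Translate, source_lang: str = "sv"
) -> Tuple[str, float]:
    """Translate forward and back, then compare the result with the original.

    Returns the forward translation and a similarity in [0, 1].
    """
    forward = translate(word, source_lang, target_lang)
    back = translate(forward, target_lang, source_lang)
    score = SequenceMatcher(None, word.lower(), back.lower()).ratio()
    return forward, score


def translate_with_cutoff(
    word: str,
    target_lang: str,
    translate: Translate,
    threshold: float = 0.8,  # assumed value; would need tuning on real data
) -> Optional[str]:
    """Return the translation, or None when its score is below the cutoff."""
    forward, score = round_trip_score(word, target_lang, translate)
    return forward if score >= threshold else None
```

Returning `None` for rejected words lets the downstream JSON or SQLite step store an empty value instead of a low-confidence translation.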
Contribution
Happy to work on this or support someone with interest in working on it!