Add ESLint rule to cap the length of translation strings #3725

sarayourfriend · 2024-01-31T01:24:25Z

Fixes

Description

This PR adds a new ESLint rule to cap the length of translation strings to 40 words.
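
For reviewers who want a feel for what the cap means in practice, here is a minimal sketch of the check, written in Python purely for illustration. It is not the ESLint rule itself: counting words by splitting on whitespace is an assumption, and `count_words` and `too_long_strings` are hypothetical names.

```python
from __future__ import annotations

MAX_WORDS = 40  # the cap named in the PR description


def count_words(text: str) -> int:
    """Count whitespace-separated tokens (a simplifying assumption)."""
    return len(text.split())


def too_long_strings(messages: dict, prefix: str = "") -> list[str]:
    """Return the dotted key paths whose string value exceeds MAX_WORDS."""
    offenders = []
    for key, value in messages.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Locale files are nested, so recurse into child objects.
            offenders.extend(too_long_strings(value, path))
        elif isinstance(value, str) and count_words(value) > MAX_WORDS:
            offenders.append(path)
    return offenders
```

A string of exactly 40 words passes under this sketch; only strings with 41 or more words are reported.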

It is a draft because there are 20 strings that need to be broken up for this rule to pass, and I want to make sure this is an acceptable approach before spending the time to do that (as it is quite tedious and careful work). I also need to add the documentation page for the new rule.

I also added a new utility, utilities/generate_test_locales, which generates strings for the test locales. It keeps existing strings and does not overwrite them, which means we can make manual changes to the automatically generated strings as we see fit, and they will be preserved. I have documented that and more in the i18n documentation page.
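
The "keep existing strings" behaviour can be sketched roughly as follows. This is an illustration under assumptions, not the code in utilities/generate_test_locales: `merge_locale` is a hypothetical name, `translate` stands in for whatever translation backend is used, and dropping keys that no longer exist in the English source is an assumption of this sketch.

```python
from __future__ import annotations

from typing import Callable


def merge_locale(source: dict, existing: dict, translate: Callable[[str], str]) -> dict:
    """Build a test locale from the English source, keeping existing values."""
    merged = {}
    for key, value in source.items():
        if isinstance(value, dict):
            # Recurse into nested groups; treat a missing group as empty.
            sub = existing.get(key)
            merged[key] = merge_locale(value, sub if isinstance(sub, dict) else {}, translate)
        elif isinstance(existing.get(key), str):
            # An existing string wins, so manual edits are preserved.
            merged[key] = existing[key]
        else:
            # Only keys without an existing string are generated.
            merged[key] = translate(value)
    return merged
```

Because existing values always win, a hand-corrected string only needs to be fixed once and survives later regeneration.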

Testing Instructions

Check out the new ESLint rule. Run it locally with pnpm run eslint and confirm you see 20 strings failing to pass this in en.json5. Try to break the rule by adding new strings or testing any edge cases you can think of. If you come up with any edge cases, please note them in your review so that I can add them to the unit tests.

Please try the new utilities/generate_test_locales script locally, following the instructions for use in the i18n documentation page.

Checklist

My pull request has a descriptive title (not a vague title like Update index.md).
My pull request targets the default branch of the repository (main) or a parent feature branch.
My commit messages follow best practices.
My code follows the established code style of the repository.
I added or updated tests for the changes I made (if applicable).
I added or updated documentation (if applicable).
I tried running the project locally and verified that there are no visible errors.
[N/A] I ran the DAG documentation generator (if applicable).

Developer Certificate of Origin

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

sarayourfriend · 2024-02-01T02:49:48Z

Looking at the snapshot failures, it looks like I've messed up a few conversions on the broken-up paragraphs. Redrafting, and I'll address those failures tomorrow when I return to work.

sarayourfriend · 2024-02-05T04:10:08Z

@WordPress/openverse-frontend What is the process for updating the RTL testing locale? Do contributors need to manually enter strings into automated translation services or is there a process to generate them as needed (e.g., from a script)?

dhruvkb

Code-wise, LGTM! Personally I don't feel so good about the idea of breaking down translation strings into names like a, b, c because they do not convey any particular meaning. But I understand it may be the best approach we have.

Is it possible to wrap $t (something like a new util, say $ti) to automatically iterate over a, b, c etc? That would make this diff...

-   <p>{{ $t("sensitive.faq.two.answerPt2", { openverse: "Openverse" }) }}</p>
+    <p>
+      {{ $t("sensitive.faq.two.answerPt2.a") }}
+      {{ $t("sensitive.faq.two.answerPt2.b") }}
+      {{ $t("sensitive.faq.two.answerPt2.c", { openverse: "Openverse" }) }}
+    </p>

...into something much simpler.

- $t
- $ti

dhruvkb · 2024-02-06T17:20:41Z

frontend/package.json

@@ -124,7 +124,6 @@
    "chokidar": "^3.5.3",
    "comment-json": "^4.2.3",
    "css-loader": "^5.2.7",
-    "eslint-plugin-jsonc": "^2.8.0",


sarayourfriend · 2024-02-06T21:00:18Z

Is it possible to wrap $t (something like a new util, say $ti) to automatically iterate over a, b, c etc? That would make this diff...

That's an interesting idea. I think the complexity there is that we don't know in the code how many parts there are without reading the translation blob. Is it possible to do that? It sounds like a nice improvement but I'd prefer it in a separate PR, considering how significant this one already is.
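
To make the "reading the translation blob" point concrete, here is a rough sketch of what such a helper would have to do, written in Python for illustration only. Nothing like this exists in the PR: `lookup` and `translate_parts` are hypothetical names, and Python's `str.format` stands in for the i18n library's `{name}` interpolation.

```python
from __future__ import annotations

from string import ascii_lowercase


def lookup(messages: dict, dotted_key: str):
    """Walk a nested messages dict using a dotted key path."""
    node = messages
    for part in dotted_key.split("."):
        node = node[part]
    return node


def translate_parts(messages: dict, dotted_key: str, params: dict | None = None) -> str:
    """Join the a, b, c... children of a key, interpolating params into each."""
    params = params or {}
    node = lookup(messages, dotted_key)
    if isinstance(node, str):
        # A plain string has no parts, so format it directly.
        return node.format(**params)
    pieces = []
    for letter in ascii_lowercase:
        if letter not in node:
            break  # the part count is only known by inspecting the blob
        pieces.append(node[letter].format(**params))
    return " ".join(pieces)
```

The loop has to probe the loaded messages to discover where the parts stop, which is the complexity described above.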

obulat · 2024-02-08T08:57:46Z

@WordPress/openverse-frontend What is the process for updating the RTL testing locale? Do contributors need to manually enter strings into automated translation services or is there a process to generate them as needed (e.g., from a script)?

The process has usually been the former; I didn't spend time on automating it.

obulat

Works great, thank you!

obulat · 2024-02-26T15:12:46Z

@sarayourfriend, is there something blocking this PR from being merged, aside from the git conflicts?

sarayourfriend · 2024-02-26T20:24:21Z

The translations. It's honestly so tedious. I spent a couple hours last week trying to automate the translations locally using https://github.com/argosopentech/argos-translate, but of course the template strings make automated translations, even approximations, impossible.

I was going to ping you today to ask how you deal with variables in strings when you run them through a translator? I don't know Arabic or Russian, so there's no way I can reliably guess where the template words should go.

If you don't worry about it, then I can push the code I wrote to automate it using Argos. There are a lot of outdated/non-existent strings in the testing translations, and hand maintaining these test translations is so tedious, I'm surprised we actually have them, particularly as the process and approach are undocumented.

sarayourfriend · 2024-03-20T00:44:47Z

@obulat I've re-requested a review because I added in the test locale generation script I mentioned to you during our chat earlier this week.

There is new documentation for the utility, and I will update the PR description now to include it in the review instructions. Let me know what you think.

sarayourfriend · 2024-03-20T01:34:16Z

I'm lost on why the translation banner tests fail... when I run them on my computer in the container, they fail there too. But when I run them locally to try to debug why the page isn't loading, it works fine 🤔 It feels like a configuration issue, maybe? I need to look into how the other test locales are handled, if there is anything special that makes them work correctly inside the containers.

obulat · 2024-03-20T13:50:33Z

I'm lost on why the translation banner tests fail... when I run them on my computer in the container, they fail there too. But when I run them locally to try to debug why the page isn't loading, it works fine 🤔 It feels like a configuration issue, maybe? I need to look into how the other test locales are handled, if there is anything special that makes them work correctly inside the containers.

We only use the translated value in the valid-locales.json to determine whether we should show the translation banner or not, we don't look into the locale's json file at all. So, for testing purposes, we can set the translated value to a number below 90 to force the banner to appear, even if the locale is fully-translated.
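
As a sketch of the decision described above, written in Python for illustration rather than taken from the frontend code: the entry shape and the name `should_show_translation_banner` are assumptions, and only the `translated` field and the threshold of 90 come from this thread.

```python
BANNER_THRESHOLD = 90  # the threshold mentioned in this thread


def should_show_translation_banner(locale_entry: dict) -> bool:
    """Decide from the valid-locales entry alone, never from the locale file."""
    return locale_entry["translated"] < BANNER_THRESHOLD
```

Under this sketch, a test locale with `translated` set to 50 shows the banner even if its strings file is complete.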

obulat

I reversed the translation banner test changes because the banner will show up for any language that has translated set to below 90 in the valid-locales.json file.

There is a problem with the current translation generation code: it leaves PLACEHOLDER text instead of converting it back to the original key. This means that when a text has a link inside, this link won't be shown. So the translation banner in Russian in this PR didn't have a link, and we wouldn't be able to test that link. That could be a problem in many e2e and VR tests for RTL in the future, if the link is missing.

I tried replacing the PLACEHOLDER texts with the key values. I got stuck trying to figure out why some Arabic texts were not translated (I tried the GUI version of argos and it did translate the untranslated values); some had PLACEHOLDER in them, and some had the original keys (e.g., {openverse}). I guess you only added the translations for keys that didn't exist, and the existing values stayed the same, therefore keeping the original keys, @sarayourfriend ?

I don't want to block this PR because it has so many template changes and I don't want to force you to do any more merge-conflict resolutions.
I think we can remove all locales but ar, which is essential for RTL testing; what do you think?

github-actions · 2024-03-20T14:07:13Z

Full-stack documentation: https://docs.openverse.org/_preview/3725

Please note that GitHub pages takes a little time to deploy newly pushed code, if the links above don't work or you see old versions, wait 5 minutes and try again.

You can check the GitHub pages deployment action list to see the current status of the deployments.

Changed files 🔄:

https://docs.openverse.org/_preview/3725/frontend/reference/i18n.html

sarayourfriend · 2024-03-21T06:01:18Z

Ah! That's an interesting problem.

For those particular keys, we can put a manual translation there and the translation script will keep the previous value. We could, for example, keep the English language value for the translation banner on the Russian one. That should be an easy enough fix!

There is a problem with the current translation generation code: it leaves PLACEHOLDER text instead of converting it back to the original key

This is intentional, by the way, because for some languages or strings, Argos will decide to translate the string "PLACEHOLDER" or "{openverse}" (for example), or other placeholders like "###" that I tried. I originally had the script keeping track of each of these placeholders and then substituting them back into the original string in order. This actually did work for a great many strings. However, whenever Argos would decide it wanted to translate a particular placeholder (no matter what I tried), it would result in the script not finding enough placeholder values to pop the removed ones back in! There's also no guarantee that the order of the placeholders in the translated text matches that of the original English, for example adjective/noun ordering being opposite in Romance languages from Germanic ones.

So there were lots of issues with trying to maintain the placeholders while using an automated translation tool. I can put back the "kinda working" version, and then for keys that cause an issue in any language, we could output a log file of those and just require manual translations for that subset of keys (with the suggested mangled translation included). Because it always preserves the existing translation, we'd only have to fix it once for each problematic key, while still getting fully working automatic translations with "approximate" placeholders for the vast majority of instances.
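
A minimal sketch of that "kinda working" approach with a manual-translation fallback might look like the following. It is an illustration under assumptions, not the script in this PR: `mask`, `restore`, and `translate_with_placeholders` are hypothetical names, `translate` is any callable wrapping the translation tool, and the regular expression assumes placeholders look like `{openverse}`.

```python
from __future__ import annotations

import re
from typing import Callable

PLACEHOLDER_RE = re.compile(r"\{[^{}]+\}")  # assumes {name}-style placeholders
TOKEN = "PLACEHOLDER"


def mask(text: str) -> tuple[str, list[str]]:
    """Swap each {name} for TOKEN and remember the names in order."""
    names = PLACEHOLDER_RE.findall(text)
    return PLACEHOLDER_RE.sub(TOKEN, text), names


def restore(translated: str, names: list[str]) -> str | None:
    """Put the names back in order, or return None if a token was lost."""
    if translated.count(TOKEN) != len(names):
        return None  # the translator altered or dropped a token
    remaining = iter(names)
    return re.sub(TOKEN, lambda _match: next(remaining), translated)


def translate_with_placeholders(
    strings: dict[str, str], translate: Callable[[str], str]
) -> tuple[dict[str, str], dict[str, str]]:
    """Return (restored translations, mangled suggestions needing manual work)."""
    done, needs_manual = {}, {}
    for key, text in strings.items():
        masked, names = mask(text)
        raw = translate(masked)
        restored = restore(raw, names)
        if restored is None:
            needs_manual[key] = raw  # log the suggestion for a human to fix
        else:
            done[key] = restored
    return done, needs_manual
```

Restoring in source order is only approximate, since the translated sentence may reorder the placeholders. That is acceptable for layout testing but not for real translations.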

I'll work on that tomorrow so I can get this merged ASAP. Thanks very much for the review and the fix on the translation banner tests!

obulat · 2024-03-21T06:57:20Z

I understand the problem with the placeholders, they are really fiddly :)

For those particular keys, we can put a manual translation there and the translation script will keep the previous value. We could, for example, keep the English language value for the translation banner on the Russian one. That should be an easy enough fix!

It's great that we have an escape hatch, I think this is a good enough solution for testing layouts in VR tests.

What do you think about removing the ru and es locales, and only leaving ar for all locale testing, @sarayourfriend ?

sarayourfriend · 2024-03-21T20:31:50Z

I think having as many different scripts as possible is a good idea, if we are exercising them, particularly non-latin scripts, but also locales that have accents and such, so that we exercise that (English won't, most of the time)...

I don't see those test locales as being much of a maintenance overhead at this point, if we have automatic generation. What would be the benefit to removing them?

sarayourfriend · 2024-03-22T05:18:13Z

@obulat if you have time to re-review these changes and merge them, please do. I would like a second set of eyes on this at least, before merging. @dhruvkb if you're able to do so as well, I'd appreciate it.

dhruvkb

This looks good to me! The addition of automatically generated translations is very welcome...

...although I would've liked all the scripting pertaining to translations to be centralised (not necessarily inside frontend/ but together) and co-located.

I didn't go through each modified snapshot (as there are way too many of those) but spot-checked 5-6 and they seem okay 👍.

obulat

Everything looks good ✨

sarayourfriend · 2024-03-26T03:39:25Z

...although I would've liked all the scripting pertaining to translations to be centralised (not necessarily inside frontend/ but together) and co-located.

I was going to do that, but because it's a Python thing, I didn't want to muddy the waters too much.

Signed-off-by: Olga Bulat <[email protected]>

sarayourfriend · 2024-03-26T03:40:16Z

Rebasing just to make sure VR tests pass on main.

github-actions bot added the 🧱 stack: frontend Related to the Nuxt frontend label Jan 31, 2024

sarayourfriend marked this pull request as ready for review February 1, 2024 02:26

sarayourfriend requested a review from a team as a code owner February 1, 2024 02:26

sarayourfriend requested review from AetherUnbound and obulat February 1, 2024 02:26

sarayourfriend marked this pull request as draft February 1, 2024 02:49

obulat removed the 🚦 status: awaiting triage Has not been triaged & therefore, not ready for work label Feb 4, 2024

sarayourfriend force-pushed the add/no-long-translation-strings branch from e8e45c2 to 50577a0 Compare February 5, 2024 04:10

sarayourfriend marked this pull request as ready for review February 5, 2024 22:39

sarayourfriend mentioned this pull request Feb 5, 2024

Update @openverse/eslint-plugin #3737

Merged

1 task

dhruvkb approved these changes Feb 6, 2024

obulat approved these changes Feb 8, 2024

sarayourfriend requested a review from obulat March 20, 2024 00:43

sarayourfriend force-pushed the add/no-long-translation-strings branch from 50577a0 to a254c21 Compare March 20, 2024 00:44

sarayourfriend requested a review from a team as a code owner March 20, 2024 00:44

sarayourfriend force-pushed the add/no-long-translation-strings branch from a254c21 to 5be4324 Compare March 20, 2024 00:53

obulat force-pushed the add/no-long-translation-strings branch from 13326d6 to 2f01d0e Compare March 20, 2024 13:54

obulat approved these changes Mar 20, 2024

sarayourfriend force-pushed the add/no-long-translation-strings branch from 6a88d85 to 21410aa Compare March 22, 2024 05:20

dhruvkb approved these changes Mar 22, 2024

obulat approved these changes Mar 25, 2024

sarayourfriend and others added 8 commits March 26, 2024 14:39

Add ESLint rule to cap the length of translation strings

7e32483

Fix repeated space stripping from translation comments

382a2f7

Fix typo in rule message

3bbda94

Split too-long translation strings

acef993

Add utility for generating test locales

a17bff7

Update playwright tests for new strings

bd937fe

Replace pt with ru for translation banners

a70d5b2

Signed-off-by: Olga Bulat <[email protected]>

Put back placeholders in lots of strings

7a6fa56

sarayourfriend force-pushed the add/no-long-translation-strings branch from 21410aa to 7a6fa56 Compare March 26, 2024 03:40

sarayourfriend merged commit 6b49eed into main Mar 26, 2024
39 checks passed

sarayourfriend deleted the add/no-long-translation-strings branch March 26, 2024 04:07

obulat mentioned this pull request Apr 8, 2024

Add docs for including machine-generated Arabic translations in e2e tests #3877

Closed
