Localization
Table of Contents generated with DocToc
- Activity-oriented Instructions
- Implementation Specifics
We use html10n.js to implement our app's localization. The following sections are organized around specific app maintenance tasks that involve app literal strings, and describe how to use our localization provisions properly for each of those tasks.
In our app, localizing strings entails:
- Editing app literals
  - Any changes to app literals need to be applied to the translations
- Maintaining Translations
  - After editing, we coordinate with translators to get corresponding updates to our translations
  - When new locales are added or existing mappings change, we maintain the mappings from the wrapped text, as keys, to equivalent, translated strings in each of the locale files.
Adding or editing localized strings is slightly different in code versus in templates, but we use the same approach for both. These examples embody the essentials, for the impatient.
Localized string in code | Localized string in templates |
---|---|
`qq("Server changed to [[server]]", {"server": standardServer});` | `<div class="title"> {{= qq("About [[SpiderOak]]", {"SpiderOak": s("SpiderOak")}) }} </div>` |
Spanish rendering: Servidor cambiado a spideroak.com | Spanish rendering: Acerca de SpiderOak Blue |
The Spanish locale maps `"Server changed to [[server]]"` to `"Servidor cambiado a {{server}}"`, with `standardServer` having the run-time value `"spideroak.com"` when the expression is evaluated. | We use `qq()` calls within `doT.js`'s "evaluate this" `{{=` / `}}` brackets, rather than using the library package's tag-attributes method. The brand name, `s("SpiderOak")`, is included by being passed as the value of the `{{SpiderOak}}` substitution parameter. It resolves in the Blue build to `"SpiderOak Blue"`. |
If you're going to edit app literals, you need to be familiar with all this:
- `qq()` is our alias for the `html10n.get()` API call.
  - We use this explicit `qq()` API call even in templates
    - to avoid the mess of interleaving `doT.js` and `html10n.js` template syntax.
- `qq()`'s first argument is the string to be localized. It's a key which maps to the localized translation of the string.
  - About which, see JSON source files.
- `qq()`'s key string can include programmatically substituted terms, distinguished by surrounding `[[`...`]]` brackets.
  - Accepted characters: `A-Z`, `a-z`, `_` underscore, `-` dash, and `.` dot.
    - Terms surrounded by brackets but including other characters are not recognized for substitution.
    - In particular, special terms cannot contain any whitespace.
  - The bracketed terms are satisfied from either:
    - a mapping passed as the second argument to `qq`, or by
    - recursive translation of the special term, itself.
  - Translators must not translate the terms within brackets.
    - But the translated strings must use `{{...}}` curly braces instead of square brackets.
    - The translated-string curly braces are where `html10n.js` actually does the substitutions of the special terms.
  - Hence:
    - In code and in the localization files, all the translation key strings use `[[...]]` square brackets to distinguish special terms
    - While the localized target strings, and only those, use `{{...}}` curly brace pairs.
  - The bracketed term substitutions serve a few distinct purposes:
    - To access program state, like `server` and `statusCode`, pass the value as the localization parameter value.
      - E.g.: `qq("Service host changed to [[server]]", {"server": newServer})`
    - To access brand-specific customizations, like getting the app's brand-specific name in place of "SpiderOak": pass the brand-substitution value, `s(...)`, as the parameter value.
      - E.g.: `qq("Learn more about [[SpiderOak]]»", {SpiderOak: s("SpiderOak")})`
    - To clearly signal to translators the trade strings that should not be translated, like `ShareRoom` and `Zero-Knowledge`, bracket them.
      - English-to-English entries for these non-translated strings are collected in a special JSON localization file. This file is not passed to the translators, keeping these localizations English-only - see JSON source files.
      - (For very different languages, like ideograph-based ones, we could provide more natively recognizable, language-specific refinements.
        - E.g. so they translate to a phrase that includes both the English version and a language-specific equivalent - like, for Chinese, `Zero-Knowledge (零知识)`.)
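The bracket conventions above can be pictured with a small stand-in for the lookup that `qq()` delegates to. This is only a sketch under stated assumptions: `translations` and `lookup` are hypothetical names, the table holds a couple of sample entries, and the real `html10n.get()` may differ in details such as how missing keys are handled.

```javascript
// Illustrative stand-in only; not the html10n.js implementation.
// Keys use [[term]] brackets, translated targets use {{term}} braces.
const translations = {
  "Server changed to [[server]]": "Servidor cambiado a {{server}}",
  "About [[SpiderOak]]": "Acerca de {{SpiderOak}}",
  "SpiderOak": "SpiderOak"
};

function has(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function lookup(key, params) {
  // Assumption: an unregistered key is returned unchanged.
  if (!has(translations, key)) return key;
  // Only names made of A-Z, a-z, "_", "-" and "." count as terms.
  return translations[key].replace(/\{\{([A-Za-z_.-]+)\}\}/g, (match, name) => {
    // First choice: a value passed in the second argument.
    if (params && has(params, name)) return String(params[name]);
    // Second choice: translate the term itself (e.g. a trade term).
    if (has(translations, name)) return translations[name];
    return match; // leave unresolved terms visible
  });
}

console.log(lookup("Server changed to [[server]]", { server: "spideroak.com" }));
```

Calling `lookup("About [[SpiderOak]]")` with no parameters exercises the second route, where the term is resolved by translating `SpiderOak` itself.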
The html10n JavaScript API instructions have further details on our translation machinery.
We mostly only make changes to the en-GB (`www/locales/en-GB.json`) and - for just those strings that have a different US-English translation - the en-US (`www/locales/en-US.json`) JSON source files, and leave it to the translators to update the respective locale-specific mappings. We also maintain non-translated strings (typically, terms of trade) in `www/nontranslate.json`. We occasionally have reason to edit the non-English locale mappings.
In any of these cases, our translators will track the changes in the derived XLIFF externalizations that we send, and provide updates for all the affected strings.
When editing the JSON sources:
- The translators are tracking differences from prior versions, so we can change existing entries in-place, to communicate revisions.
- Add new items just before the last, "NO TRAILING COMMA" item
- Use the same structure as the prior entries (including trailing comma)
- To change existing items, edit them in place
- The translation library will fail in the face of invalid JSON
- When necessary, verify the syntax by pasting a copy of the file into an online JSON validator, like JSON Lint
- If a JSON syntax error does sneak through:
  - You'll see some or all of:
    - `undefined` showing up in the interface in lots of places, in the stead of proper app strings
    - Console log messages starting with `Uncaught SyntaxError: ...`
    - Numerous console log messages: `No translations available (yet)`
  - If you see any of these, use JSON Lint to find the error.
- Make sure you're using the right characters for bracketed terms in source (`[[...]]`) vs. target (`{{...}}`) strings, for which see Editing app literals - Details.
- Once you do get the updates arranged, check them in and go through the process to Convey current versions of strings for translation
See instead Establish a new translation locale, for initiating a new, separate translation locale.
Mostly, we just ask the translators to provide a new locale, based on our English master (`en-GB`). Case #2 describes an existing exception, which can serve as a model if we eventually implement others.
1. When adding a new, completely distinct locale, we:
   - Determine the proper code for the locale. (Use the unqualified code, if you can. Reasons to use a country-qualified code may be best handled by case #2.)
   - Ask the translator to send a new XLIFF file for the language code
   - Register the derived JSON file in the master index
   - (Ensure that local changes to the translation files are committed or otherwise preserved.)
   - Include the new XLIFF `.xml` file in the `xliff` subdir
   - Run `grunt xliff:to_json` to produce the derived JSON file.
     - Also run `grunt xliff:from_json` before committing changes, to normalize the XLIFF version.
   - Confirm then commit the changes.
2. Sometimes, the new locale is just a minor variant on an existing one, with only a few strings that need to be maintained separately. This is so for the `en-US` variant of the Commonwealth English base, `en-GB`.
   - Decide which variant is going to serve as the primary
     - For example, we might register `www/locales/es-ES` => `es` and `www/locales/es-MX` => `es-MX`.
   - Include in the secondary locale's JSON file only those strings which are different from those in the primary.
In either case, the new files will be situated in `www/locales` and:
- entries for the new file need to be included in the master index, `www/i18n.json`. Follow the conventions for the existing entries in the new one you create.
- How language-variant fall-throughs work:
  - Keys that are unqualified codes inherently serve as fall-throughs for any qualified codes that have the unqualified code as their base. E.g.:
    - `"en"` => `"locales/en-GB.json"`
    - `"en-US"` => `"locales/en-US.json"`
    - Thereby, the fall-through for `en-US` = `en` => `"locales/en-GB.json"`
  - At least initially, however, most keys are unqualified codes that go to the only variants we have for those languages.
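The fall-through rule above can be sketched as a lookup over the index. The shape of `index` below is an assumption that simply mirrors the two entries listed above, not the actual layout of `www/i18n.json`, and `filesFor` is a hypothetical helper.

```javascript
// Assumed index shape, for illustration only.
const index = {
  "en": "locales/en-GB.json",
  "en-US": "locales/en-US.json"
};

// Files consulted for a locale code, most specific first.
function filesFor(code, idx) {
  const has = (key) => Object.prototype.hasOwnProperty.call(idx, key);
  const files = [];
  if (has(code)) files.push(idx[code]);
  // A qualified code such as "en-US" falls through to its base, "en".
  const base = code.split("-")[0];
  if (base !== code && has(base)) files.push(idx[base]);
  return files;
}

console.log(filesFor("en-US", index)); // en-US file first, then the en-GB file
console.log(filesFor("en-AU", index)); // no en-AU entry, so only the en-GB file
```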
See the code described in Translation facility: html10n.js.
We share our translation data with the translators using XLIFF, and use grunt commands to produce XLIFF externalizations from our JSON sources, and vice versa, to incorporate XLIFF updates from the translators by converting them back to JSON. The instructions below detail procedures using those commands, and include guidance on using version control to best track our changes and avoid inadvertent losses. See Conveying translation updates to and from the Translators for details.
As with most application translations, some of our strings include special terms that have to be treated specially by the translators. The next section, Instructions for the translators, details what the translators need to do to preserve these special strings.
We also have a translation file for non-translated terms, but it is kept separate from the localized files. It is maintained internally, not passed to the translators at all, unless they ask for that list of terms. (They mostly shouldn't need it, since they only see those strings within source strings in brackets. All bracketed terms are to be conveyed unchanged within translation targets.)
- We will provide the translations we already have as XLIFF .xml files
- The parts of the strings that should not be changed in the translation:
- HTML syntax (tags, tag attributes, etc) should be preserved as-is
  - The terms within "[[...]]" square brackets should be preserved in the source strings, and the translated strings should include the same, untranslated terms, but the brackets surrounding the terms in the translated strings should instead be "{{...}}" curly braces.
    - Example, in Spanish:
      - Source: `Access to link [[url]] failed: [[statusText]] ([[statusCode]])`
      - Target: `El acceso al enlace {{url}} falló: {{statusText}} ({{statusCode}})`
- When in doubt, please see already done items for plenty of examples, or feel free to ask us for guidance.
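On our side, the bracket rule for translators lends itself to a mechanical check of returned translations. The following is a sketch, not part of our tooling: `termsIn` and `missingTerms` are hypothetical helpers that report source terms with no matching curly-brace term in the target.

```javascript
// Collect the distinct term names matched by `pattern` in `text`.
function termsIn(text, pattern) {
  const names = new Set();
  for (const match of text.matchAll(pattern)) names.add(match[1]);
  return [...names];
}

// Source terms ([[name]]) that have no {{name}} counterpart in the target.
function missingTerms(source, target) {
  const wanted = termsIn(source, /\[\[([A-Za-z_.-]+)\]\]/g);
  const found = termsIn(target, /\{\{([A-Za-z_.-]+)\}\}/g);
  return wanted.filter((name) => !found.includes(name));
}

const source = "Access to link [[url]] failed: [[statusText]] ([[statusCode]])";
const target = "El acceso al enlace {{url}} falló: {{statusText}} ({{statusCode}})";
console.log(missingTerms(source, target)); // []
```

A target that kept the square brackets, or translated a term name, would show up as a non-empty list.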
Both sending changes to the translators and receiving updates from them depend on grunt commands we've implemented for converting JSON to XLIFF and back. In both cases we do "round-trip" conversions, from JSON to XLIFF and back, or vice versa. We do this to "normalize" the layout of the data files for check-in to version control, to avoid having spurious differences that obscure salient ones.
Our suggested process actually includes two commits for each dispatch to translators, the first to prevent inadvertent loss of changes and the subsequent one to include the final, normalized versions of the changed files.
So, when you have a new version of the app strings ready for dispatch to the translators:
Be sure that any XLIFF changes, from the translators, are already processed and committed before doing this.
- Do a commit, confirming that it includes your intended JSON file revisions.
- See Editing app literals - Details for editing guidance.
- Run `grunt xliff:from_json` to convey the current set of JSON locales, `www/locales/*.json`, to the XLIFF equivalents, in `xliff/*.xml`.
  - You could check that the XLIFF files change as expected, though that can be obscured by numerous incidental changes, since the XLIFF process maintains sequence numbers that may change spuriously.
- Run `grunt xliff:to_json` to complete the "round-trip" back onto the original JSON files. This normalizes the JSON files' layout to that enforced by the xliff conversion library.
- Do a git commit to check in the XLIFF revisions and normalized JSON files.
- Send the entire set of XLIFF .xml files to the translators. They are responsible for tracking the changes in the en-GB.xml file and conveying those changes to the various translations.
We should receive revisions from the translators in the form of XLIFF `.xml` files, which we situate in the `xliff/` subdirectory and then convert, using a custom `grunt` command, to the corresponding JSON files for consumption by the `html10n.js` localization library.
Be sure that any locally originated string changes are preserved - eg, by git commit or stashing - before placing updated XLIFF files from translators.
- Situate the received XLIFF `.xml` files in the `xliff/` subdirectory.
  - You can examine the changes by doing a git diff, but may see numerous spurious diffs if the sequence numbering shifts.
- Run `grunt xliff:to_json` to derive the corresponding JSON files, in `www/locales/*.json`.
  - Now you should be able to examine exactly what has changed using git diff, since there's no sequence numbering and few other incidental artifacts in the files. Order changes or other spurious reorganizations should be rare.
  - If the xliff-to-json conversion process fails
    - Use a process of elimination to identify which XLIFF file(s) have faulty syntax
    - Run the faulty files through an XML lint processor to find the error.
    - Report the errors to the translators, so they can shake problems out of their process!
- Test the changes by running a new build on a device where you can set the language.
  - You can do so using a security-inhibited browser run of the app - see the debugging note in html10n.js hookup.
- Run `grunt xliff:from_json` to complete the "round-trip" back onto the XLIFF files received from the translators, to normalize them.
- Do a git commit to check in the revisions.
Whee!
- our own fork, for bug fixes and custom tailoring
- Our code for `html10n.js` in `src/helpers/localizer.js`
  - `window.localizer.prepareHtml10n()` does most of the work
    - It is invoked early in `spiderOakApp.ready()`
    - Registers the active locales, via `html10n.localize()`
    - Binds a "go" function to trigger on the html10n `localized` event
      - including setting the `moment.js` locale
      - and setting some critical document locale characteristics
- debugging: can use browser developer console to change locale on the fly:
  - E.g., while app is running: `html10n.localize(["ru", "en", "nontranslate"])`
    - The argument must be an array.
    - The first array element is for the locality, the second is the fall-through language, and the third is for the non-translated trade-terms, like `Zero-Knowledge`.
  - A few things will not be re-rendered until traversing to the login page:
    - the menu sheet
    - the preliminary page
  - some things, like login page, will re-render immediately
  - some will require leaving the page and returning
Airbnb's polyglot.js looks like a simpler alternative.
- It doesn't do the json file loading or include indirection that we get with `html10n.js`
- Initialization looks simpler, which may be a plus and a minus - no inherent event awareness.
- `html10n.js` seems to be working OK with our fixes, so this note is just in case of trouble.
The app's locale configuration has some important intricacies.
The `html10n.localize()` array argument provides, in effect, a "search path" for resolving localized strings, with the prospective sources earlier in the path taking precedence over the later ones. (In actuality, the library accumulates a mapping from translation keys to targets, going in reverse order over the array of codes so that translation entries from items earlier in the array take precedence.)
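The reverse-order accumulation just described can be sketched in a few lines. This is an illustration of the precedence rule only, not the library's code: `accumulate` is a hypothetical name and the `tables` entries are made-up sample strings.

```javascript
// Merge per-locale tables so that codes earlier in `path` win.
// Walking the path in reverse lets earlier entries overwrite later ones.
function accumulate(path, tables) {
  const merged = {};
  for (const code of [...path].reverse()) {
    Object.assign(merged, tables[code] || {});
  }
  return merged;
}

// Sample data (invented for the example).
const tables = {
  "en-US": { "Colour": "Color" },
  "en": { "Colour": "Colour", "Cancel": "Cancel" },
  "nontranslate": { "ShareRoom": "ShareRoom" }
};

const merged = accumulate(["en-US", "en", "nontranslate"], tables);
console.log(merged["Colour"]); // "Color": the en-US entry takes precedence
```

Strings missing from the first table, such as `Cancel` here, are still present in the merged result because a later table supplies them.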
We prime the html10n machinery (see html10n.js hookup) to use the locale (as reported by the browser), if available, and fall through to `en-GB`. Strings not provided by one translation, in a particular session, may be provided by one that has lower precedence in that session. Some strings are only provided by a special pseudo-translation, included in all sessions.
1. The first-priority locale is the one identified by either `navigator.language` or `navigator.userLanguage`, in that order.
2. Any prospective locales for which we have no translation are skipped. However, if the skipped code is a qualified one (e.g., `es-MX`), then, by #3, the unqualified code (e.g. `es`) is still tried.
3. Any prospective locale that has the form of a qualified code - for example, `en-GB` - is implicitly followed by the unqualified language code.
   - Thus, e.g., `es-ES` and `es-MX` will both also include `es`.
4. Finally, `www/nontranslate.json` is included at the end of all translation source load paths, so the (identity) resolutions of the strings are available in all cases.
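Taken together, these rules determine an effective search path. The sketch below spells that out under assumptions: `searchPath`, `nav`, and `available` are hypothetical names, the unqualified-code step that the library applies implicitly is made explicit here, and the fall-through is passed as `"en"` on the understanding that the index maps it to the `en-GB` file.

```javascript
// Sketch of the effective lookup order; not the code in localizer.js.
function searchPath(nav, available, fallThrough) {
  const path = [];
  const add = (code) => {
    // Skip codes we have no translation for, and avoid duplicates.
    if (code && available.includes(code) && !path.includes(code)) path.push(code);
  };
  const preferred = nav.language || nav.userLanguage;
  if (preferred) {
    add(preferred);               // e.g. "es-MX"
    add(preferred.split("-")[0]); // e.g. "es"
  }
  add(fallThrough);               // "en", registered for en-GB
  path.push("nontranslate");      // always consulted last
  return path;
}

const available = ["en", "en-US", "es"];
console.log(searchPath({ language: "es-MX" }, available, "en"));
```

With these inputs `es-MX` is skipped because it has no translation, while `es` is still tried, followed by the fall-through and the non-translated terms.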
Item 3, implicitly including the non-qualified version of a locale, in effect immediately after the qualified version, enables some important features:
- Because of this, those locales for which we have only one variant are registered under just the unqualified code. At least in this early stage, that applies to most of our locale entries: `de`, `es`, `fr`, `pt`, and `ru`.
- Only our English variants are currently qualified, because we have variants within a major (`en`) code.
- Since more English variants derive from the "Commonwealth English" `en-GB`, that is our fall-through locale.
  - Thus, the `en-GB` fall-through is used for any locales for which we lack any other translation.
  - Plus, our `en-US` translation actually only includes the strings that differ from the `en-GB` translation, since the `en-GB` translation, as fall-through, will be used for any strings not satisfied by `en-US`.
- Other, non-English translations with multiple variants can use a similar scheme, where the full translation occupies the unqualified slot, and the minor variant is registered with full qualification. For example, if `es-MX` has just a few differences from `es-ES`, then `es-ES` could be registered as just `es`, and the `es-MX` need include only the strings that are distinct from `es-ES`. Since `es` is implicitly included immediately after `es-MX`, the items missing from `es-MX` will be resolved via `es` ==> `es-ES`.
- We edit JSON source files
  - The JSON sources are what `html10n.js` uses to produce translations
  - Some (most) of the JSON sources are used to derive XLIFF externalizations which we use to coordinate with translators
- Master file: `www/i18n.json`
  - Index that identifies mappings from locale to actual files.
  - In addition to the basic mappings, we use those mappings to identify the primary translation for a locale that has varying country-qualified codes.
    - For example, `en-GB` is the preferred code for `en`.
    - This means that we don't have to provide individual aliases for, e.g., the many Commonwealth English variants, like `en-AU`, `en-CA`, ..., since `en` is implicitly included, after the country-qualified code, automatically.
    - And non-Commonwealth English variants - `en-US` - need only contain the strings that are different from those in the `en-GB` collection. Those that are the same will be gotten from the `en` entry that is implicitly included, automatically, after the `en-US` entry.
- The `.json` locale-specific files, used by the app to map app string keys to locality translations
  - One locale per file
  - Located in `www/locales/*.json`
  - Each file includes a JSON object which maps the English key strings to the respective target locale strings.
  - The source and target strings can include bracketed terms, which are to be preserved, unchanged, in the target strings, as described in the Editing app literals - Details section.
  - In general we edit only the `en-GB.json` and `en-US.json` files. The other locale files are derived from XLIFF files which we get from the translators.
    - (We do make changes to other JSON locale files when we are alerted to corrections. For them, we need to do a round-trip to XLIFF and back, to comprehensively check in the changes.)
JSON source string nuances:
- doT.js requires that "/" forward slash is escaped with "\" backslash in templates,
  - but those "\" backslashes aren't seen in the JSON externalizations
  - and so aren't needed at all in the source files.
  - good thing, because e.g. xliff conversions necessarily drop them,
  - which would lead to round-trip discrepancies if they're included in the JSON versions.
- Non-translated strings file: `www/nontranslate.json`
  - Any strings which are not supposed to have non-English translations
    - E.g. `Zero-Knowledge` and `ShareRoom`
  - Generally, terms of trade.
  - This file consists of entries for these strings, with the translation being identical to the string itself.
  - This file is not included among those from which the translators' XLIFF files are derived.
  - The file is included as the last item on the `html10n.localize()` array, so that bracketed references to the non-translated strings are resolved from it.
  - Hence we don't have to warn the translators to make exceptions for the non-translated strings: the only instances they see of the strings are bracketed, and they're generally not supposed to translate bracketed strings.
- Using grunt-xliff.
  - Grunt commands to convert JSON to and from XLIFF, with the XLIFF files going to `xliff/`
    - See `xliff/README.md` for basic usage instructions - including foibles:
      - The 'languages' target doesn't work, as far as I can tell.
      - But that's just as well, for our concerns.
- To produce XLIFF files for translators: `grunt xliff:from_json`
  - Derive `xliff/*.xml`, one for each .json in `www/locales/*.json`
  - BEWARE that this will overwrite the corresponding .xml files in `xliff` - be sure you've incorporated the translators' last batch of changes before doing this.
  - Any time you derive the XLIFF files from the JSON ones:
    - Before checking in the JSON changes, do the reverse conversion, from XLIFF to JSON (see the next top-level item), in order to normalize the JSON sources to the layout produced by the XLIFF conversion. This will reduce superfluous change records.
    - Check in the changes to the XLIFF files, preferably along with checking in the corresponding JSON changes, to keep their histories in sync.
- To incorporate changes in XLIFF files received from translators: `grunt xliff:to_json`
  - Derive `www/locales/*.json`, one for each `xliff/*.xml`
  - BEWARE that this will overwrite the corresponding .json files in `www/locales/*.json`
    - This means you really want to "round-trip" (see above) and check in any JSON source changes made locally before starting to incorporate changes from translators
      - ... so that we have a clear audit trail for internally originated and externally received changes
      - ... and can thereby clearly reconcile collisions, when necessary.
- The XLIFF `.xml` files, for coordinating translations with the translators.
  - Located in `xliff` subdir
  - Derived from JSON versions using `grunt xliff:from_json`
  - Produce `.json` files from XLIFF using `grunt xliff:to_json`
    - This will overwrite the JSON versions, so be sure the JSON content is already committed, or already represented in the XLIFF versions (e.g., for a round-trip normalization, see next).
  - "Round trip normalization": Before committing `.json` changes, e.g. to the English strings, do a round trip conversion from the JSON files to the XLIFF `.xml` files then back to JSON.
    - This way, you can check in the corresponding XLIFF changes simultaneously
    - In the process you also normalize the JSON files to the output produced by the XLIFF-to-JSON conversion, thereby avoiding subsequent spurious whitespace differences for the hand-edited lines in the JSON versions.