7. Multilang Support
- This guide explains how IMAGE supports multiple languages.
- Prerequisites: knowledge of the HuggingFace `transformers` API and PyTorch, HTTP requests/responses in JavaScript via the `fetch`-`then` API and in Python via Flask, and TypeScript syntax. Knowledge of regular expressions is optional but much appreciated.
- Server-side components: Translation Service, Handlers, French TTS
- Client-side components: UI localisation, Automatic mode
- TL;DR: The translation service uses pretrained translation models from the Helsinki-NLP collection, all based on the Marian training framework. We then use the HuggingFace `transformers` API with PyTorch to run the model for translation.
- Detailed documentation for the service is available in its README.md.
- The service takes a request with the following body:
  - `segments`: (required) list/array of strings to be translated
  - `src_lang`: (optional) the source language; defaults to English (`'en'`)
  - `tgt_lang`: (required) the target language
- The service sends a response with the following body:
  - `translations`: a list/array of strings translated to the target language
  - `src_lang`, `tgt_lang`: source and target languages, respectively. These are mostly for debugging purposes.
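The request and response bodies described above can be written down as TypeScript types. This is only a sketch: the interface names and the `isTranslationResponse` guard are hypothetical and do not come from the service's code; only the field names are taken from the description.

```typescript
// Hypothetical type names; field names follow the service description.
interface TranslationRequest {
  segments: string[]; // required: strings to translate
  src_lang?: string;  // optional: the service defaults to 'en'
  tgt_lang: string;   // required: target language
}

interface TranslationResponse {
  translations: string[]; // translated strings
  src_lang: string;       // echoed back, mostly for debugging
  tgt_lang: string;
}

// Runtime check that a parsed JSON value has the expected response shape.
function isTranslationResponse(value: unknown): value is TranslationResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const translations = v["translations"];
  return (
    Array.isArray(translations) &&
    translations.every((t) => typeof t === "string") &&
    typeof v["src_lang"] === "string" &&
    typeof v["tgt_lang"] === "string"
  );
}

// Example request body: src_lang is omitted, so the service default applies.
const exampleRequest: TranslationRequest = { segments: ["Hello"], tgt_lang: "fr" };
```

A guard like this could be applied to the parsed JSON before the handler trusts `translations`.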
- Most language handling is located within the condition:
if (targetLanguage !== 'en') {
  // language handling in here
}
- "Language handling" may refer to the following actions:
  - Translate a word, part of a sentence, or a full sentence generated by handlers. An individual string is often referred to as a "segment".
  - Translate the `description` (also known as the title) of an interpretation.
- Common workflow:
  - Use the `fetch` API to get the translation from `multilang-support`.
  - Then, send the translation to `espnet-tts-xx`, where `xx` is the target language.
- If `multilang-support` does not support the target language, or no TTS service is available for that language, we simply skip the step and log the error.
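The translate-then-speak workflow, including the "skip and log" fallback, can be sketched as follows. The function name `translateThenSpeak` and the two callback types are hypothetical stand-ins for the real calls to `multilang-support` and `espnet-tts-xx`; they are passed in as parameters so the sketch makes no assumption about how the handlers actually issue those requests.

```typescript
// Hypothetical signatures standing in for the two service calls.
type TranslateFn = (segments: string[], targetLang: string) => Promise<string[]>;
type SpeakFn<T> = (segments: string[], targetLang: string) => Promise<T>;

// Translate the segments, then synthesize speech from the translation.
// If either step fails (for example, the language is unsupported),
// log the error and skip the rendering by returning null.
async function translateThenSpeak<T>(
  segments: string[],
  targetLang: string,
  translate: TranslateFn,
  speak: SpeakFn<T>
): Promise<T | null> {
  try {
    const translated = await translate(segments, targetLang);
    return await speak(translated, targetLang);
  } catch (err) {
    console.error(`Skipping rendering for language "${targetLang}":`, err);
    return null;
  }
}
```

Returning `null` rather than rethrowing keeps one unsupported language from failing the whole request; whether the real handlers do the same is not stated here.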
- As of August 2023, multilang support is implemented in three handlers: `photo-audio-handler`, `autour-handler`, and `high-charts`.
- The `photo-audio-handler` and `high-charts` handlers each have a `utils.ts` file containing our intermediate function for fetching the translation. Their implementations are basically the same:
/**
 * Get translation from multilang-support service
 * @param inputSegment array of segments to be translated
 * @param targetLang target language, in ISO 639-1 format
 * @returns array of translated segments, corresponding to inputSegment
 */
export async function getTranslationSegments(inputSegment: string[], targetLang: string): Promise<string[]> {
  return fetch("http://multilang-support/service/translate", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      segments: inputSegment,
      src_lang: 'en',
      tgt_lang: targetLang
    }),
  }).then((resp) => resp.json())
    .then(json => json['translations']);
}
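The helper above resolves with whatever the service returns. A more defensive variant might check the HTTP status and the shape of the response before handing it back. The sketch below is an illustration, not the handlers' code: `getTranslationSegmentsChecked` and `FetchLike` are hypothetical names, and it assumes the service signals failure with a non-2xx status, which is not confirmed here.

```typescript
// Minimal description of the parts of fetch used here, so the sketch does not
// depend on a particular runtime's typings. The global fetch should be
// assignable to it.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Hypothetical variant of getTranslationSegments that fails loudly.
async function getTranslationSegmentsChecked(
  fetchImpl: FetchLike,
  inputSegment: string[],
  targetLang: string
): Promise<string[]> {
  const resp = await fetchImpl("http://multilang-support/service/translate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ segments: inputSegment, src_lang: "en", tgt_lang: targetLang }),
  });
  // Assumption: an unsupported language or server error yields a non-2xx status.
  if (!resp.ok) {
    throw new Error(`Translation request failed with HTTP ${resp.status}`);
  }
  const json = (await resp.json()) as { translations?: unknown };
  const translations = json.translations;
  // Each input segment is expected to have exactly one translation.
  if (!Array.isArray(translations) || translations.length !== inputSegment.length) {
    throw new Error("Unexpected translation response shape");
  }
  return translations as string[];
}
```

In a handler this would be called as `getTranslationSegmentsChecked(fetch, segments, targetLanguage)`; taking `fetchImpl` as a parameter also makes the function easy to exercise with a stub.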
- The translation is done by taking the `"value"` field of each `ttsData` entry and sending it to `multilang-support` along with `targetLanguage` (taken from the `language` field of the request). The results are then mapped back to `ttsData[i]["value"]` using a for loop. The same approach applies to pie charts in `high-charts` and map interpretations in `autour-handler`.
const translatedValues = await utils.getTranslationSegments(
  ttsData.map((x) => x["value"]),
  targetLanguage
);
for (let i = 0; i < ttsData.length; i++) {
  ttsData[i]["value"] = translatedValues[i];
}
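The loop above relies on the service returning exactly one translation per input segment. A small helper can make that assumption explicit. This is a sketch: `TtsEntry` and `applyTranslations` are hypothetical names, and the only field assumed on each `ttsData` entry is `value`.

```typescript
// Hypothetical shape: only the "value" field is known from the text;
// any other fields on an entry are left untouched.
interface TtsEntry {
  value: string;
  [key: string]: unknown;
}

// Write each translated string back into the matching ttsData entry,
// refusing to proceed if the two arrays are not the same length.
function applyTranslations(ttsData: TtsEntry[], translatedValues: string[]): void {
  if (translatedValues.length !== ttsData.length) {
    throw new Error(
      `Expected ${ttsData.length} translations, got ${translatedValues.length}`
    );
  }
  ttsData.forEach((entry, i) => {
    entry.value = translatedValues[i];
  });
}
```

Without such a check, a short response would silently write `undefined` into the trailing entries.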
- For TTS, the function `getTTS(..)` in `utils.ts` takes a list/array of strings with the `targetLanguage` to get a `ttsResponse` object via the `fetch` API. This function hides the language handling in `server.ts`. A small addition to the `getTTS` function redirects the handler to the TTS service based on the target language.
// inside getTTS(...) function
if (targetLanguage === "en") {
  ttsUrl = "http://espnet-tts/service/tts/segments";
} else if (targetLanguage === "fr") {
  ttsUrl = "http://espnet-tts-fr/service/tts/segments";
} else {
  // unable to generate speech due to unavailable TTS service
}
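As more TTS engines are added, the `if`/`else if` chain could be replaced by a lookup table. The sketch below is one possible shape, not what `getTTS` does today: `TTS_URLS` and `resolveTtsUrl` are hypothetical names, while the two URLs are the ones shown above.

```typescript
// Map from language code to TTS endpoint (URLs taken from the snippet above).
const TTS_URLS: Record<string, string> = {
  en: "http://espnet-tts/service/tts/segments",
  fr: "http://espnet-tts-fr/service/tts/segments",
};

// Return the TTS endpoint for a language, or undefined when no TTS
// service is available for it.
function resolveTtsUrl(targetLanguage: string): string | undefined {
  // hasOwnProperty guards against inherited keys such as "toString".
  return Object.prototype.hasOwnProperty.call(TTS_URLS, targetLanguage)
    ? TTS_URLS[targetLanguage]
    : undefined;
}
```

A caller would treat `undefined` as the "unable to generate speech" case, logging it and skipping the TTS step.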
- Depending on the type of HighCharts chart, we process it differently.
  - Line chart: the interpretation is a single sentence, so we send `graphInfo` directly to multilang, using what I like to call a one-line translation, for both the interpretation and the description (translation title):
graphInfo = (await utils.getTranslationSegments([graphInfo], targetLanguage))[0];
description = (await utils.getTranslationSegments([description], targetLanguage))[0];
  - Pie chart: the language handling steps are similar to those of `photo-audio-handler` and `autour-handler`, mentioned above.
- Unlike `photo-audio-handler`, `autour-handler` has no `utils.ts`, so all language handling is located within the handler itself.
- Based on the target language, we redirect the TTS stage to the correct HTTP address of the service.
- For the translation, we send `segments` (a list of strings describing locations on the map) with `targetLanguage` to `multilang-support`. Notice that here we combine the `description` with the `segments` into one array, so that a single request covers both and saves processing time.
translateSegments.push(description);
translateSegments.push(...segments);
/* *sending `translateSegments` to translate* */
// Mapping description & segments
description = translated[0];
// Replace `segments` with translated segments
for (let i = 1; i < translated.length; i++) {
  segments[i - 1] = translated[i];
}
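The combine-then-split pattern above can be wrapped in one function. This is a sketch under assumptions: `translateWithDescription` and `BatchTranslateFn` are hypothetical names, and the translation call is passed in as a parameter rather than written out, since the handler's own request code is elided above.

```typescript
// Hypothetical signature standing in for the request to multilang-support.
type BatchTranslateFn = (segments: string[], targetLang: string) => Promise<string[]>;

// Translate the description and all segments in a single request,
// then split the result back into its two parts.
async function translateWithDescription(
  description: string,
  segments: string[],
  targetLang: string,
  translate: BatchTranslateFn
): Promise<{ description: string; segments: string[] }> {
  // Index 0 holds the description; the segments follow in order.
  const translated = await translate([description, ...segments], targetLang);
  if (translated.length !== segments.length + 1) {
    throw new Error("Translation count does not match the number of inputs");
  }
  const [translatedDescription, ...translatedSegments] = translated;
  return { description: translatedDescription, segments: translatedSegments };
}
```

Returning new values instead of overwriting `description` and `segments` in place is a design choice of this sketch; the handler code above mutates them.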
- All client-visible elements, such as text, labels, or buttons, must be enclosed inside an HTML element with the class `localisation` (Canadian spelling!). The element's ID should be the message title/name in the locale JSON.
<!-- E.g.: Directly on the HTML -->
<span class="localisation" id="FullRendering"></span>
<option class="localisation" id="automaticLanguage" value="auto"></option>
// E.g.: JavaScript/TypeScript createElement
const aButton = document.createElement("button");
/* adding attributes, eventlistener, etc. */
aButton.classList.add("localisation");
- To automatically retrieve locale messages from `_locales/fr/messages.json`, import `queryLocalisation` from `utils.ts` and add a call to it at the end (or near the end) of the script. It finds all elements with the class `localisation` in the DOM, matches each message's name with an element's ID, and adds the message's content to the webpage.
- Messages are added to the `messages.json` file inside the language folder. For example, English messages are located in `_locales/en/messages.json`. Although it's not explicitly enforced, I suggest using CamelCase for the element's name. The `description` field is optional.
{
  "extensionName": {
    "message": "IMAGE Extension",
    "description": "Extension name"
  }
}
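The ID-to-message matching described above can be illustrated without a browser. The sketch below is not the actual `queryLocalisation` from `utils.ts`, whose implementation is not shown here and may well go through the `browser.i18n` API; `applyLocaleMessages`, `LocaleMessages`, and `LocalisableElement` are hypothetical names, and the messages are read from an already-parsed `messages.json` object.

```typescript
// Shape of a parsed messages.json file; "description" is optional.
type LocaleMessages = Record<string, { message: string; description?: string }>;

// The two element properties the matching step needs.
// A DOM HTMLElement satisfies this interface.
interface LocalisableElement {
  id: string;
  textContent: string | null;
}

// For each element, look up the message whose name equals the element's ID
// and write its text into the element. Returns the IDs with no message.
function applyLocaleMessages(
  elements: LocalisableElement[],
  messages: LocaleMessages
): string[] {
  const missing: string[] = [];
  for (const el of elements) {
    if (Object.prototype.hasOwnProperty.call(messages, el.id)) {
      el.textContent = messages[el.id].message;
    } else {
      missing.push(el.id);
    }
  }
  return missing;
}
```

In a page, the element list could come from `Array.from(document.querySelectorAll(".localisation"))`; the returned list of unmatched IDs is a convenience for spotting missing translations.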
- NOTE:
  - To get the browser's UI language directly, see `i18n.getUILanguage()`:
let UILang = browser.i18n.getUILanguage();
let UILangCode = UILang.slice(0, 2);
  - In the case of `high-charts`, where the JavaScript is injected into the DOM instead of running in the background via the `browser` API, we temporarily use `navigator.language` to detect the display language for buttons. Also, since we can't use the `getMessage()` function of the API, it's the only component whose text is hardcoded, as below:
if (navigator.language === 'fr')
  chartButtonText = "Interpréter ce graphique avec IMAGE";
else
  chartButtonText = "Interpret this chart with IMAGE";
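Note that `navigator.language` returns a language tag that can include a region, such as `fr-CA` or `fr-FR`, and a strict comparison with `'fr'` does not match those. A possible variant compares only the primary language subtag, in the same spirit as the `slice(0, 2)` used for the UI language above. The function name `chartButtonTextFor` is hypothetical; the two button strings are the ones from the snippet.

```typescript
// Pick the button label from a language tag such as "fr", "fr-CA" or "en-US".
function chartButtonTextFor(language: string): string {
  // Keep only the primary subtag ("fr-CA" -> "fr"), ignoring case.
  const primary = language.toLowerCase().split("-")[0];
  return primary === "fr"
    ? "Interpréter ce graphique avec IMAGE"
    : "Interpret this chart with IMAGE";
}
```

In the injected script this would be called as `chartButtonTextFor(navigator.language)`.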
- This functionality works differently depending on the user's operating system (OS) and browser. Note that the UI language and the rendering (IMAGE results) language are two separate settings.
- Windows:
  - UI and rendering languages are based on the browser settings.
  - Tested on: Vivaldi, Microsoft Edge, Opera
- Linux:
  - Debian-based (Ubuntu, ChromeOS, etc.)
  - Arch-based:
    - Vivaldi: UI language is based on the browser settings; rendering language is based on the system's `LANGUAGE` variable
    - Chromium, Google Chrome (AUR): both UI and rendering languages are based on the system's `LANGUAGE` variable
- MacOS/iOS: Work in progress, support coming soon!
- As of August 2023, the IMAGE project has English and French TTS engines available.
- TTS couldn't pronounce numbers
  - See "French TTS `espnet-tts-fr` struggles with numerical numbers".
  - Patched by PR "Helping French TTS to pronounce numbers using Regex filtering".
- Language model produces repetitive segments
  - See "Segment repetition in French renderings".
  - Note from @notkaramel: I've seen this bug before in some other languages back in the evaluation days, but it was very minor and an edge case. A probable solution could be to fine-tune the model or to change the existing model's parameters.
- The HighCharts button in the browser extension is hardcoded, and its text depends on the `navigator.language` result. This is an exception, since the script doesn't have access to the `browser` API to query localisation messages.