7. Multilang Support

Introduction

This guide describes how IMAGE supports multiple languages.
Prerequisites: familiarity with the HuggingFace transformers API and PyTorch, HTTP requests and responses using the JavaScript fetch API (promise-based, with .then) and Flask (Python), and TypeScript syntax. Knowledge of regular expressions is optional but helpful.
Server-side components: Translation Service, Handlers, French TTS
Client-side components: UI localisation, Automatic mode

1. Translation Service: `multilang-support`

TL;DR: The translation service uses pretrained translation models from the Helsinki-NLP collection, all based on the Marian training framework. We load and run these models through the HuggingFace transformers API with PyTorch.
Detailed documentation for the service is in its README.md.
The service takes a request with the following body:
- segments: (required) list/array of strings to be translated
- src_lang: (optional) the source language. By default it's English 'en'
- tgt_lang: (required) the target language
The service sends a response with the following body:
- translations: a list/array of strings translated to the target language
- src_lang, tgt_lang: source and target languages, respectively. These are mostly for debugging purposes.
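The request and response bodies listed above can be written down as TypeScript types. This is a minimal sketch: the field names come from the lists above, while the type and function names (`TranslationRequest`, `TranslationResponse`, `buildTranslationRequest`, `isTranslationResponse`) are hypothetical and not part of the service.

```typescript
// Hypothetical type names; the field names come from the request and
// response bodies listed above.
interface TranslationRequest {
  segments: string[]; // required: strings to translate
  src_lang?: string;  // optional: the service defaults to "en"
  tgt_lang: string;   // required: target language
}

interface TranslationResponse {
  translations: string[]; // translated strings
  src_lang: string;       // returned mostly for debugging
  tgt_lang: string;
}

// Build a request body, omitting src_lang when the caller does not set it.
function buildTranslationRequest(
  segments: string[],
  tgtLang: string,
  srcLang?: string
): TranslationRequest {
  const body: TranslationRequest = { segments, tgt_lang: tgtLang };
  if (srcLang !== undefined) {
    body.src_lang = srcLang;
  }
  return body;
}

// Minimal shape check for a parsed response (illustrative, not exhaustive).
function isTranslationResponse(value: unknown): value is TranslationResponse {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const v = value as Record<string, unknown>;
  const translations = v["translations"];
  return (
    Array.isArray(translations) &&
    translations.every((t) => typeof t === "string") &&
    typeof v["src_lang"] === "string" &&
    typeof v["tgt_lang"] === "string"
  );
}
```

A caller could pass the result of `buildTranslationRequest` to `JSON.stringify` for the request body and run `isTranslationResponse` on the parsed reply before using it.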

2. Handlers

Most language handling is located within the condition:

if (targetLanguage !== 'en') {
  // language handling in here
}

"Language handling" may refer to the following actions:
- Translate a word, part of a sentence, or a full sentence generated by handlers. An individual string is often referred to as a "segment".
- Translate the description (also known as the title) of an interpretation.
Common workflow:
1. Use the fetch API to get the translation from multilang-support.
2. Then send the translation to espnet-tts-xx, where xx is the target language.

If multilang-support does not support the target language, or no TTS service is available for that language, we simply skip the corresponding step and log the error.
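The skip-and-log behaviour can be sketched as a small wrapper. This is an illustration under assumptions, not the handlers' actual code: `translateOrSkip` and the `Translator` type are hypothetical names, and the translator is passed in as a parameter so the sketch does not depend on a running service.

```typescript
// A translator with the same shape as a function that sends segments and a
// target language to multilang-support and resolves to the translations.
type Translator = (segments: string[], targetLang: string) => Promise<string[]>;

// Hypothetical helper: resolves to the translated segments, or to null when
// the step has to be skipped (unsupported language, service error, or a
// response with the wrong number of segments).
async function translateOrSkip(
  translate: Translator,
  segments: string[],
  targetLang: string
): Promise<string[] | null> {
  if (targetLang === "en") {
    return segments; // handler output is already in English
  }
  try {
    const translated = await translate(segments, targetLang);
    if (!Array.isArray(translated) || translated.length !== segments.length) {
      throw new Error("unexpected translation response");
    }
    return translated;
  } catch (err) {
    // Skip the step and log the error.
    console.error(`Translation to "${targetLang}" skipped:`, err);
    return null;
  }
}
```

Returning null rather than throwing lets the caller decide what to do with the untranslated content; that choice is an assumption of this sketch.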

As of August 2023, multilang support is implemented in three handlers: photo-audio-handler, autour-handler, and the high-charts handler.

2.1. `photo-audio-handler` and `high-charts`

Both handlers have a utils.ts file containing an intermediate function that fetches the translation. Their implementations are essentially the same:

/**
 * Get translation from multilang-support service
 * @param inputSegment - array of segments to be translated
 * @param targetLang - target language, in ISO 639-1 format
 * @returns array of translated segments, corresponding to inputSegment
 */
export async function getTranslationSegments(inputSegment: string[], targetLang: string): Promise<string[]> {
    return fetch("http://multilang-support/service/translate", {
        method: "POST",
        headers: {
            "Content-Type": "application/json",
        },
        body: JSON.stringify({
            segments: inputSegment,
            src_lang: 'en',
            tgt_lang: targetLang
        }),
    }).then((resp) => resp.json())
    .then(json => json['translations']);
}

The translation takes the "value" field of each ttsData item and sends it to multilang-support along with targetLanguage (taken from the language field of the request). The results are then mapped back to ttsData[i]["value"] using a for loop. The same approach applies to pie charts in high-charts and to the map interpretation in autour-handler.

const translatedValues = await utils.getTranslationSegments(
    ttsData.map((x) => x["value"]),
    targetLanguage
);
for (let i = 0; i < ttsData.length; i++) {
    ttsData[i]["value"] = translatedValues[i];
}
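The loop above pairs items and translations by position, so it relies on the service returning exactly one translation per segment. A defensive variant might look like the sketch below; `TtsItem` and `applyTranslations` are hypothetical names, and only the "value" field is taken from the handler code.

```typescript
// Hypothetical item shape; only the "value" field is known from the text above.
interface TtsItem {
  value: string;
  [key: string]: unknown;
}

// Return a copy of ttsData with each "value" replaced by its translation.
// Throws if the counts differ, since a positional remap would then attach
// translations to the wrong items.
function applyTranslations(ttsData: TtsItem[], translatedValues: string[]): TtsItem[] {
  if (translatedValues.length !== ttsData.length) {
    throw new Error(
      `Expected ${ttsData.length} translations, got ${translatedValues.length}`
    );
  }
  return ttsData.map((item, i) => ({ ...item, value: translatedValues[i] as string }));
}
```

Unlike the in-place loop, this version leaves the input array untouched, which keeps the English text available if a later step fails.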

For TTS, the function getTTS(..) in utils.ts takes a list/array of strings and the targetLanguage, and gets a ttsResponse object via the fetch API. This function hides the language handling from server.ts. A small addition to getTTS redirects the handler to the appropriate TTS service based on the target language.

// inside getTTS(...) function
if (targetLanguage === "en") {
    ttsUrl = "http://espnet-tts/service/tts/segments";
} else if (targetLanguage === "fr") {
    ttsUrl = "http://espnet-tts-fr/service/tts/segments";
} else {
    // unable to generate speech due to unavailable TTS service
}
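As more TTS services are added, the if/else chain could be replaced by a lookup table. The sketch below reuses the two URLs from the snippet above; the names `TTS_URLS` and `getTtsUrl` are hypothetical and this is not how getTTS is currently written.

```typescript
// Language code -> TTS endpoint. The URLs are the ones used in getTTS above.
const TTS_URLS: Record<string, string> = {
  en: "http://espnet-tts/service/tts/segments",
  fr: "http://espnet-tts-fr/service/tts/segments",
};

// Returns the TTS endpoint for a language, or undefined when no TTS service
// is available for it.
function getTtsUrl(targetLanguage: string): string | undefined {
  // hasOwnProperty guards against inherited keys such as "constructor".
  return Object.prototype.hasOwnProperty.call(TTS_URLS, targetLanguage)
    ? TTS_URLS[targetLanguage]
    : undefined;
}
```

A caller can treat an undefined result the same way as the final else branch: skip speech generation and log the problem.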

Depending on the type of HighCharts chart, we process the data differently.
- Line chart: the interpretation is a single sentence, so we send graphInfo directly to multilang-support using what I like to call a one-line translation, for both the interpretation and the description (the title):

graphInfo = (await utils.getTranslationSegments([graphInfo], targetLanguage))[0];
description = (await utils.getTranslationSegments([description], targetLanguage))[0];

- Pie chart: similar language handling steps to photo-audio-handler and autour-handler, as described above.

2.2. `autour-handler`

Unlike photo-audio-handler, autour-handler has no utils.ts, so all language handling is located within the handler itself.
Based on the target language, we redirect the TTS stage to the correct HTTP address of the service.
For the translation, we send segments (a list of strings describing locations on the map) with targetLanguage to multilang-support. Note that here we combine the description and the segments into one array, so a single request translates both and saves processing time.

translateSegments.push(description);
translateSegments.push(...segments);
/* *sending `translateSegments` to translate* */

// Mapping description & segments
description = translated[0];
// Replace `segments` with translated segments
for(let i = 1; i < translated.length; i++) {
    segments[i - 1] = translated[i];
}
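The pack-and-unpack pattern above can be isolated into two pure functions, which makes the index bookkeeping (description at position 0, segments after it) easier to check. This is a sketch with hypothetical names (`packForTranslation`, `unpackTranslation`), not code from autour-handler.

```typescript
// Put the description first, followed by the segments, so that one request
// to the translation service covers both.
function packForTranslation(description: string, segments: string[]): string[] {
  return [description, ...segments];
}

// Split a translated batch back into its description and segments.
function unpackTranslation(
  translated: string[]
): { description: string; segments: string[] } {
  const description = translated[0];
  if (description === undefined) {
    throw new Error("Translated batch is empty; expected at least the description");
  }
  return { description, segments: translated.slice(1) };
}
```

For example, packing "Map" with ["Cafe", "Park"] yields a three-element array, and unpacking the translated array returns the first element as the description and the remaining two as segments.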

3. Browser Extension

3.1. UI localisation

All client-visible elements, such as text, labels, or buttons, must be enclosed in an HTML element with the class `localisation` (Canadian spelling!). The element's ID should match the message name in the locale JSON.

<!-- E.g.: Directly on the HTML -->
<span class="localisation" id="FullRendering"></span>
<option class="localisation" id="automaticLanguage" value="auto"></option>

// E.g.: JavaScript/TypeScript createElement
const aButton = document.createElement("button");
/* adding attributes, eventlistener, etc. */
aButton.classList.add("localisation");

To automatically retrieve locale messages (for example from _locales/fr/messages.json), import queryLocalisation from utils.ts and add a call to it at the end (or near the end) of the script. It finds all elements with the class localisation in the DOM, matches each element's ID to a message name, and adds the message's content to the webpage.
Messages are added to the messages.json file inside each language folder. For example, English messages are located in _locales/en/messages.json. Although it's not explicitly enforced, I suggest applying CamelCase syntax for message names. The description field is optional.

{
    "extensionName": {
        "message": "IMAGE Extension",
        "description": "Extension name"
    }
}
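The ID-to-message matching described above can be sketched as follows. This is not the real queryLocalisation from utils.ts, whose implementation is not shown here; `applyLocalisation` and `LocalisableElement` are hypothetical, and the DOM query and message lookup are passed in as parameters so the logic can be read and tested on its own.

```typescript
// Minimal view of a DOM element: only the fields this sketch touches.
interface LocalisableElement {
  id: string;
  textContent: string | null;
}

// Hypothetical stand-in for the matching step. In the extension, `elements`
// could come from Array.from(document.querySelectorAll(".localisation")) and
// `getMessage` could be browser.i18n.getMessage.
function applyLocalisation(
  elements: LocalisableElement[],
  getMessage: (name: string) => string
): void {
  for (const element of elements) {
    const message = getMessage(element.id);
    // i18n.getMessage returns an empty string for unknown message names;
    // leave such elements untouched.
    if (message !== "") {
      element.textContent = message;
    }
  }
}
```

With the messages.json example above, an element whose ID is extensionName would receive the text "IMAGE Extension".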

NOTE:
- To get the browser's UI language directly, see i18n.getUILanguage():

let UILang = browser.i18n.getUILanguage();
let UILangCode = UILang.slice(0, 2);

In the case of high-charts, the JavaScript is injected into the DOM instead of running in the background via the browser API, so it temporarily uses navigator.language to detect the display language for buttons. Since the getMessage() function of the API is also unavailable there, it is the only component whose text is hardcoded, as shown below:

if (navigator.language === 'fr')
   chartButtonText = "Interpréter ce graphique avec IMAGE";
else
   chartButtonText = "Interpret this chart with IMAGE";
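navigator.language returns a language tag that may include a region, such as "fr-CA" or "fr-FR", and the strict comparison above matches only the bare "fr". If regional variants should also get the French label, one option is to compare the primary subtag only. The sketch below uses the two button strings from the snippet above; the function name `pickChartButtonText` is hypothetical.

```typescript
// Choose the chart button label from a language tag such as "fr", "fr-CA"
// or "en-US".
function pickChartButtonText(language: string): string {
  // Keep only the primary subtag (the part before the first "-").
  const primary = language.toLowerCase().split("-")[0];
  return primary === "fr"
    ? "Interpréter ce graphique avec IMAGE"
    : "Interpret this chart with IMAGE";
}
```

In the injected script this would be called as pickChartButtonText(navigator.language).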

3.2. 'Automatic' mode

This functionality works differently depending on the user's operating system (OS) and browser. Note that there are two separate languages: the UI language and the rendering (IMAGE results) language.
- Windows:
  - Both the UI and rendering languages are based on the browser settings.
  - Tested on: Vivaldi, Microsoft Edge, Opera
- Linux:
  - Debian-based (Ubuntu, ChromeOS, etc.)
  - Arch-based:
    - Vivaldi: the UI language is based on browser settings; the rendering language is based on the system's LANGUAGE variable
    - Chromium, Google Chrome (AUR): both the UI and rendering languages are based on the system's LANGUAGE variable
- macOS/iOS: Work in progress, support coming soon!
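Whatever source the browser reports the language from, the extension still has to turn the user's setting into a concrete language code. The sketch below shows one way this could work; it is an assumption-heavy illustration, not the extension's implementation. The "auto" value comes from the option element shown in section 3.1, while `resolveLanguage`, the supported-language list, and the English fallback are hypothetical.

```typescript
// Resolve the user's language setting to a two-letter code.
// - setting: "auto" or an explicit code such as "fr"
// - detectedLocale: what the browser reports, e.g. "fr-CA" or "en_US"
// - supported: codes that have translation/TTS support (assumed list)
function resolveLanguage(
  setting: string,
  detectedLocale: string,
  supported: string[],
  fallback: string = "en"
): string {
  const requested =
    setting === "auto" ? detectedLocale.slice(0, 2).toLowerCase() : setting;
  return supported.indexOf(requested) !== -1 ? requested : fallback;
}
```

For example, with a supported list of ["en", "fr"], a setting of "auto" and a detected locale of "fr-CA" resolves to "fr", while a detected locale of "de-DE" falls back to "en".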

4. Text-To-Speech (TTS) Service

As of August 2023, the IMAGE project has English and French TTS engines available.

5. Known Issues

TTS couldn't pronounce numbers
- See "French TTS espnet-tts-fr struggles with numerical numbers".
- Patched by PR "Helping French TTS to pronounce numbers using Regex filtering".
Language model produces repetitive segments
- See "Segment repetition in French renderings".
- Note from @notkaramel: I've seen this bug before in some other languages back in the evaluation days, but it was very minor and an edge case. A probable solution could be to fine-tune the model or to change the existing model's parameters.
HighCharts button is hardcoded
- The HighCharts button in the browser extension is hardcoded to display based on the navigator.language result. This is an exception, since the injected script doesn't have access to the browser API to query localisation messages.
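For the number pronunciation issue, the general idea of regex filtering is to find digit sequences in the text and replace them with spelled-out words before the text reaches the TTS engine. The sketch below illustrates that idea only; it does not reproduce the PR's actual implementation, and `spellOutNumbers` as well as the caller-supplied `toWords` converter are hypothetical.

```typescript
// Replace every run of digits in `text` with the words returned by `toWords`.
// Only unsigned integer runs are matched; decimals, signs and thousands
// separators are not handled by this sketch.
function spellOutNumbers(text: string, toWords: (n: number) => string): string {
  return text.replace(/\d+/g, (digits) => toWords(Number(digits)));
}
```

The converter is left to the caller because spelling out French numbers correctly (for example 70 to 99) needs more logic than fits in a short example.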
