Skip to content

Commit

Permalink
Added from-language to API requests for better translation quality
Browse files Browse the repository at this point in the history
  • Loading branch information
antonxt committed May 18, 2021
1 parent 060eee2 commit 34c0ca4
Show file tree
Hide file tree
Showing 2 changed files with 99 additions and 61 deletions.
32 changes: 20 additions & 12 deletions endpoint/webchat/translate/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,35 +5,43 @@ With this Transformer function, Cognigy.AI is able to translate messages that we

## Configuration

- `TRANSLATOR` - which API to use, can be 'google' or 'microsoft'
- `TRANSLATOR_API_KEY` - the API Key of the selected API
- `FLOW_LANGUAGE` - the language of the Cognigy.AI Flow in a two-letter code format, e.g. 'en', 'de', 'fr'. Read more: [Google Translate](https://cloud.google.com/translate/docs/languages), [Microsoft Translate](https://docs.microsoft.com/en-us/azure/cognitive-services/translator/language-support)
- `AUTO_DETECT_LANGUAGE` - should the Transformer detect the user messages language automatically, which is returned by the translation API. Can be `true` or `false`
- `NO_TRANSLATE_PREFFIX` - a prefix text, which is added before the postback of the buttons. Later, if the user clicks on a button, the prefix is automatically removed and the postback text is handled directly to the flow without being translated.
- `TRANSLATOR` - which API to use, can be `google` or `microsoft`
- `TRANSLATOR_API_KEY` - API Key for the `TRANSLATOR`
- `FLOW_LANGUAGE` - language of the Cognigy.AI Flow. A two-letter code, e.g. 'en', 'de', 'fr'. Read about the supported languages in the API documentation: [Google Translate](https://cloud.google.com/translate/docs/languages), [Microsoft Translate](https://docs.microsoft.com/en-us/azure/cognitive-services/translator/language-support)
- `AUTO_DETECT_LANGUAGE` - should the transformer detect the user anguage automatically. Can be `true` or `false`. Read more below
- `NO_TRANSLATE_PREFFIX` - a prefix text, which is added before the postback of the buttons. Later, when the user clicks on a button, the prefix is automatically removed and the postback text is handled to the flow without being translated

## Option 1: Auto Detect Language
## Option 1: `AUTO_DETECT_LANGUAGE = true`

When this option is on (`true`), the selected Translation tool detects the language of the user's input message automatically and stores it in Cognigy.AI in order to answer in the same langauge as well.
The translator detects the user language automatically from their messages. The detected language is then used to translate user messages to the `FLOW_LANGUAGE` and bot messages to the detected user language. The language is stored in the session under `detectedLanguage`. The value is overwritten every time when the user eneters new message. As the result, the user can start talking in one langauge and then switch to another, after which the new language will be used.

**Example:**

<img src="./docs/autoDetectLanguageExample.PNG" width="400">

## Option 2: User Defines Language
## Option 2: `AUTO_DETECT_LANGUAGE = false`

If the user should be able to select a specific language, the `AUTO_DETECT_LANGUAGE` value should be off (`false`). In this case, the Cognigy.AI Flow needs to send the information about the preferred language by using a so-called DATA_ONLY message:
The Cognigy.AI Flow needs to send the preferred language by using a data-only message with the following payload:

```json
{
"language": "de"
}
```

<img src="./docs/userDefinesLanguageExample.PNG" width="400">

Since the message doesn't contain a text, it is not displayed in the chat window. Thereforem, the transformer can use the value and store it internally in order to translate further messages into, in this case, *German* (de) -- for example.
Since the message doesn't contain a text, it is not displayed in the chat. The transformer intercepts the `language` value and stores it in the session under `language`. This value is then used to translate all bot and user messages until the flow sends another data payload with the new `language` value.

**Important!**

Please make sure that the sent `language` value fits the schema of the selected Translation tool. In order to read more about this, take a look into the linked pages in the `FLOW_LANGUAGE` configuration at the top.
The `language` value must be one of the accepted by the selected `TRANSLATOR`. See `FLOW_LANGUAGE` above for the related documentation links.

## Supported Message Types

The transformer supports the following message types from the bot to the user:

- Text
- Text with Quick Replies
- Text with Buttons
- Gallery
- Gallery
128 changes: 79 additions & 49 deletions endpoint/webchat/translate/transformer.ts
Original file line number Diff line number Diff line change
@@ -1,70 +1,80 @@
const TRANSLATOR: string = 'google' // 'google' or 'microsoft'
const TRANSLATOR_API_KEY = 'YourApiKey' // TRANSLATOR API key
const TRANSLATOR: string = '' // 'google' or 'microsoft'
const TRANSLATOR_API_KEY = '' // TRANSLATOR API key

const FLOW_LANGUAGE = 'en' // Langauge of the flow, from which bot messages and to which user messages are translated
const AUTO_DETECT_LANGUAGE = true // Should the transformer detect the user language automatically from their last message
const NO_TRANSLATE_PREFFIX = 'NoTranslate:' // Added to postbacks. If the user message starts with it, it is removed from the message and the message is not translated

async function translateGoogle(textArray: string[], userLanguage: string): Promise<any> {
async function translateGoogle(textArray: string[], toLanguage: string, fromLanguage?: string): Promise<any> {
const body = { 'q': textArray, 'target': toLanguage }
if (fromLanguage) {
body['source'] = fromLanguage
}

return await httpRequest({
method: 'POST',
uri: `https://translation.googleapis.com/language/translate/v2?key=${TRANSLATOR_API_KEY}`,
headers: { 'Content-type': 'application/json' },
body: { 'q': textArray, 'target': userLanguage },
body: body,
json: true
})
}

async function translateMicrosoft(textsToTranslate: string[], userLanguage: string): Promise<any> {
let microsoftTextArray = []
textsToTranslate.forEach(textToTranslate => {
microsoftTextArray.push(textToTranslate)
async function translateMicrosoft(textsToTranslate: string[], toLanguage: string, fromLanguage?: string): Promise<any> {
const body = []
textsToTranslate.forEach(text => {
body.push({ 'Text': text })
})

let uri = `https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&to=${toLanguage}`
if (fromLanguage) {
uri += `&from=${fromLanguage}`
}

return await httpRequest({
method: 'POST',
uri: `https://api.cognitive.microsofttranslator.com/translate?api-version=3.0&to=${userLanguage}`,
uri: uri,
headers: {
'Ocp-Apim-Subscription-Key': TRANSLATOR_API_KEY,
'Content-type': 'application/json',
'Accept': 'application/json',
'X-ClientTraceId': uuid.v4().toString()
'Content-type': 'application/json'//,
// 'Accept': 'application/json',
// 'X-ClientTraceId': uuid.v4().toString()
},
body: microsoftTextArray,
body: body,
json: true
})
}

async function translate(textsToTranslate: string[], userLanguage: string): Promise<string[]> {
if (userLanguage === FLOW_LANGUAGE) { // Do not translate e.g. en to en
async function translate(textsToTranslate: string[], toLanguage: string, fromLanguage: string): Promise<string[]> {
if (toLanguage === fromLanguage) { // Do not translate e.g. en to en
return textsToTranslate
}

const translatedTexts = []
if (TRANSLATOR == 'google') {
const googleTranslation = await translateGoogle(textsToTranslate, userLanguage)
const textsTranslated = []
if (TRANSLATOR === 'google') {
const googleTranslation = await translateGoogle(textsToTranslate, toLanguage, fromLanguage)
googleTranslation.data.translations.forEach(element => {
translatedTexts.push(element.translatedText)
textsTranslated.push(element.translatedText)
})
} else if (TRANSLATOR == 'microsoft') {
const microsoftTranslation = await translateMicrosoft(textsToTranslate, userLanguage)
} else if (TRANSLATOR === 'microsoft') {
const microsoftTranslation = await translateMicrosoft(textsToTranslate, toLanguage, fromLanguage)
microsoftTranslation.forEach(element => {
translatedTexts.push(element.translations[0].text)
textsTranslated.push(element.translations[0].text)
})
}

return translatedTexts
return textsTranslated
}

async function translateBotMessage(data: any, userLanguage: string) {
if (data?.text) { // Translate a Text message
const textsTranslated = await translate([data.text], userLanguage)
const textsTranslated = await translate([data.text], userLanguage, FLOW_LANGUAGE)
data.text = textsTranslated[0]
}

const webchatMessage = data?.data?._cognigy?._webchat?.message // A data payload message, can have quick replies, buttons or gallery
if (webchatMessage?.quick_replies) { // Translate a Text with Quick Replies message
let textsToTranslate = []
const textsToTranslate = []
if (webchatMessage.text) { // Text above quick replies
textsToTranslate.push(webchatMessage.text)
}
Expand All @@ -74,7 +84,7 @@ async function translateBotMessage(data: any, userLanguage: string) {
}
})

const textsTranslated = await translate(textsToTranslate, userLanguage)
const textsTranslated = await translate(textsToTranslate, userLanguage, FLOW_LANGUAGE)
if (webchatMessage.text) { // Text message above Quick Replies
const index = textsToTranslate.indexOf(webchatMessage.text)
webchatMessage.text = textsTranslated[index]
Expand All @@ -91,7 +101,7 @@ async function translateBotMessage(data: any, userLanguage: string) {
}

if (webchatMessage?.attachment?.payload?.template_type === 'button') { // Translate a Text with Buttons message
let textsToTranslate = []
const textsToTranslate = []
if (webchatMessage.attachment.payload.text) { // The text above buttons
textsToTranslate.push(webchatMessage.attachment.payload.text)
}
Expand All @@ -101,7 +111,7 @@ async function translateBotMessage(data: any, userLanguage: string) {
}
})

const textsTranslated = await translate(textsToTranslate, userLanguage)
const textsTranslated = await translate(textsToTranslate, userLanguage, FLOW_LANGUAGE)
if (webchatMessage.attachment.payload.text) { // The text above buttons
const index = textsToTranslate.indexOf(webchatMessage.attachment.payload.text)
webchatMessage.attachment.payload.text = textsTranslated[index]
Expand All @@ -117,11 +127,9 @@ async function translateBotMessage(data: any, userLanguage: string) {
})
}

if (data.data?._cognigy?._webchat?.message?.attachment?.payload?.elements) { // Translate a Gallery message
const galleryElements = data.data._cognigy._webchat.message.attachment.payload.elements

let textsToTranslate = []
galleryElements.forEach((element) => {
if (webchatMessage?.attachment?.payload?.elements) { // Translate a Gallery message
const textsToTranslate = []
webchatMessage.attachment.payload.elements.forEach((element) => {
if (element.title) {
textsToTranslate.push(element.title)
}
Expand All @@ -137,9 +145,8 @@ async function translateBotMessage(data: any, userLanguage: string) {
}
})

const textsTranslated = await translate(textsToTranslate, userLanguage)

galleryElements.forEach((element) => {
const textsTranslated = await translate(textsToTranslate, userLanguage, FLOW_LANGUAGE)
webchatMessage.attachment.payload.elements.forEach((element) => {
if (element.title) {
const index = textsToTranslate.indexOf(element.title)
element.title = textsTranslated[index]
Expand All @@ -154,6 +161,9 @@ async function translateBotMessage(data: any, userLanguage: string) {
const index = textsToTranslate.indexOf(button.title)
button.title = textsTranslated[index]
}
if (button.payload) { // Add a prefix to the payload to avoid translation later when it is sent back in a user message
button.payload = `${NO_TRANSLATE_PREFFIX}${button.payload}`
}
})
}
})
Expand All @@ -168,25 +178,45 @@ createSocketTransformer({
const userText = payload.text

let translatedText
if (userText.startsWith(NO_TRANSLATE_PREFFIX)) {
translatedText = userText.substring(NO_TRANSLATE_PREFFIX.length)
if (!userText) {
// There is no text in this message, e.g. this is a data-only message from a webchat extension
translatedText = userText
} else if (userText.startsWith(NO_TRANSLATE_PREFFIX)) {
// Neither translate nor detect language when processing a postback
translatedText = userText.substring(NO_TRANSLATE_PREFFIX.length) // Remove the NO_TRANSLATE_PREFFIX from the user message
} else if (TRANSLATOR === 'google') {
const googleTranslation = await translateGoogle([payload.text], FLOW_LANGUAGE)
// Remember the detected language in the session storage
sessionStorage.detectedLanguage = googleTranslation.data.translations[0].detectedSourceLanguage
let googleTranslation
if (sessionStorage.language && !AUTO_DETECT_LANGUAGE) {
// User language detection is off AND the user language is known
googleTranslation = await translateGoogle([payload.text], FLOW_LANGUAGE, sessionStorage.language)
} else { // Either language detection is off OR the user language is unknown
googleTranslation = await translateGoogle([payload.text], FLOW_LANGUAGE) // Let the translate API detect the language
}

if (googleTranslation.data?.translations[0]?.detectedSourceLanguage) {
// Remember the detected language in the session storage
sessionStorage.detectedLanguage = googleTranslation.data.translations[0].detectedSourceLanguage
}
translatedText = googleTranslation.data.translations[0].translatedText
} else if (TRANSLATOR === 'microsoft') {
const msTranslation = await translateMicrosoft([payload.text], FLOW_LANGUAGE)
// Remember the detected language in the session storage
sessionStorage.detectedLanguage = msTranslation[0].detectedLanguage.language
translatedText = msTranslation[0].translations[0].text
} else {
throw new Error(`TRANSLATOR should be either 'google' or 'microsoft'`)
let microsoftTranslation
if (sessionStorage.language && !AUTO_DETECT_LANGUAGE) {
// User language detection is off AND the user language is known
microsoftTranslation = await translateMicrosoft([payload.text], FLOW_LANGUAGE, sessionStorage.language)
} else { // Either language detection is off OR the user language is unknown
microsoftTranslation = await translateMicrosoft([payload.text], FLOW_LANGUAGE)
}

if (microsoftTranslation[0]?.detectedLanguage?.language) {
// Remember the detected language in the session storage
sessionStorage.detectedLanguage = microsoftTranslation[0].detectedLanguage.language
}
translatedText = microsoftTranslation[0].translations[0].text
}

return {
userId: payload.userId,
sessionId: payload.sessionId, // @ts-ignore
sessionId: payload.sessionId,
text: translatedText,
data: payload.data
}
Expand All @@ -199,7 +229,7 @@ createSocketTransformer({
}

let userLanguage = sessionStorage.language
if (AUTO_DETECT_LANGUAGE && sessionStorage.detectedLanguage) {
if (AUTO_DETECT_LANGUAGE && sessionStorage.detectedLanguage) { // Use the language detected by the translate API in a user message
userLanguage = sessionStorage.detectedLanguage
}

Expand Down

0 comments on commit 34c0ca4

Please sign in to comment.