Index / Add danish language. (#7736)
Co-authored-by: Francois Prunayre <[email protected]>
josegar74 and fxprunayre authored Feb 12, 2024
1 parent f0472bd commit 3bd291d
Showing 2 changed files with 76 additions and 0 deletions.
@@ -249,3 +249,43 @@ By default, the search score is defined as (see `web-ui/src/main/resources/catal
## Language analyzer
By default, the `standard` analyzer is used. If the catalog content is in English, it may make sense to change the analyzer to `english`. To customize the analyzer, see `web/src/main/webResources/WEB-INF/data/config/index/records.json`.
To add a new language, first check whether Elasticsearch provides a specific analyzer for that language (see https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-lang-analyzer.html). Then configure the multilingual fields
in `records.json` (e.g. adding Danish):
* If the field is used for full text search, use the language analyzer:
```json
{
"textField": {
"match": "*Object",
"mapping": {
"type": "object",
"properties": {
"default": {},
...
"langdan": {
"type": "text",
"analyzer": "danish"
},
```
* If the field is a keyword, such as an organisation name or tag field, use type `keyword` (which is required for computing aggregations):
```json
{
"tag": {
"match": "th_*",
"mapping": {
"type": "object",
"copy_to": ["tag"],
"properties": {
"default": {},
...
"langdan": {
"type": "keyword",
"copy_to": [
"any.langdan"
]
},
```
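
The two patterns above differ only in the sub-field type and in whether an analyzer is set. As a minimal sketch, the following Python builds the per-language entry for either case so it can be pasted into `records.json`. The helper name `lang_subfield` and its parameters are hypothetical and are not part of GeoNetwork or Elasticsearch.

```python
import json


def lang_subfield(lang_code, analyzer=None, copy_to=None):
    """Build the mapping for one `lang<code>` sub-field.

    Hypothetical helper for illustration only. With an analyzer the
    sub-field is full text (`text`); without one it is a `keyword`,
    which is the type needed for aggregations.
    """
    if analyzer:
        mapping = {"type": "text", "analyzer": analyzer}
    else:
        mapping = {"type": "keyword"}
    if copy_to:
        mapping["copy_to"] = list(copy_to)
    return {f"lang{lang_code}": mapping}


# Full text field, as in the first pattern above.
print(json.dumps(lang_subfield("dan", analyzer="danish"), indent=2))

# Keyword field copied to the per-language catch-all, as in the second pattern.
print(json.dumps(lang_subfield("dan", copy_to=["any.langdan"]), indent=2))
```

Generating the fragment this way mainly helps keep the `text` and `keyword` variants consistent when the same language has to be added in several places.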
36 changes: 36 additions & 0 deletions web/src/main/webResources/WEB-INF/data/config/index/records.json
@@ -1075,6 +1075,10 @@
"type": "keyword",
"copy_to": ["any.langdut", "organisationName.langdut"]
},
"langdan": {
"type": "keyword",
"copy_to": ["any.langdan", "organisationName.langdan"]
},
"langspa": {
"type": "keyword",
"copy_to": ["any.langspa", "organisationName.langspa"]
@@ -1162,6 +1166,19 @@
}
}
},
"langdan": {
"type": "text",
"analyzer": "danish",
"copy_to": [
"any.langdan"
],
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": ${es.index.ignore_above}
}
}
},
"langita": {
"type": "text",
"analyzer": "italian",
@@ -1277,6 +1294,12 @@
"any.langdut"
]
},
"langdan": {
"type": "keyword",
"copy_to": [
"any.langdan"
]
},
"langita": {
"type": "keyword",
"copy_to": ["any.langita"]
@@ -1462,6 +1485,10 @@
"type": "text",
"analyzer": "dutch"
},
"langdan": {
"type": "text",
"analyzer": "danish"
},
"langita": {
"type": "text",
"analyzer": "italian"
@@ -1504,6 +1531,12 @@
"any.langdut"
]
},
"langdan": {
"type": "keyword",
"copy_to": [
"any.langdan"
]
},
"langita": {
"type": "keyword",
"copy_to": ["any.langita"]
@@ -2000,6 +2033,9 @@
"langdut": {
"type": "keyword"
},
"langdan": {
"type": "keyword"
},
"langspa": {
"type": "keyword"
}
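
The hunks above add `langdan` next to an existing language in six separate places, which is easy to get wrong by hand. As a rough sketch, a script along these lines could flag objects that contain a reference language key but lack the new one. The function names, the sample document, and the value `8191` substituted for `${es.index.ignore_above}` are assumptions for illustration; the sample is not the real `records.json`.

```python
import json
import re


def load_templated_json(text, values):
    """Replace ${name} placeholders with concrete values, then parse as JSON."""
    filled = re.sub(r"\$\{([^}]+)\}", lambda m: str(values[m.group(1)]), text)
    return json.loads(filled)


def missing_language(node, reference, new, path=""):
    """Yield paths of objects that have the `reference` key but lack `new`."""
    if isinstance(node, dict):
        if reference in node and new not in node:
            yield path or "/"
        for key, value in node.items():
            yield from missing_language(value, reference, new, f"{path}/{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from missing_language(item, reference, new, f"{path}/{index}")


# Made-up sample: "title" already has langdan, "tag" does not.
SAMPLE = """
{
  "title": {
    "properties": {
      "langdut": {
        "type": "text",
        "analyzer": "dutch",
        "fields": {"keyword": {"type": "keyword", "ignore_above": ${es.index.ignore_above}}}
      },
      "langdan": {"type": "text", "analyzer": "danish"}
    }
  },
  "tag": {
    "properties": {
      "langdut": {"type": "keyword"}
    }
  }
}
"""

mapping = load_templated_json(SAMPLE, {"es.index.ignore_above": 8191})
print(list(missing_language(mapping, "langdut", "langdan")))
# prints ['/tag/properties']
```

The placeholder substitution is needed because `${es.index.ignore_above}` appears unquoted in the file, so the raw text is not valid JSON until a value is filled in.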