I-Analyzer can display named entities.
To display a corpus enriched with named entities, install the Annotated Text plugin for Elasticsearch (mapper-annotated-text), following the installation instructions in the Elasticsearch plugin documentation.
To determine whether named entities are available for a given corpus, the application checks whether the corpus contains fields ending with :ner.
If the main content field is called speech, the field containing named entity annotations should be called speech:ner. This field should have the following Elasticsearch mapping:
{
'type': 'annotated_text'
}
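The availability check described above can be sketched in Python as follows. This is a minimal illustration, not I-Analyzer's actual implementation: the function name has_named_entities and the plain list of field names are assumptions made for the example.

```python
NER_SUFFIX = ':ner'


def has_named_entities(field_names):
    """Return True if any field name ends with the ':ner' suffix.

    Hypothetical helper: the real application may inspect corpus
    definitions or index mappings rather than a plain list of names.
    """
    return any(name.endswith(NER_SUFFIX) for name in field_names)


# Hypothetical field lists for an enriched and a plain corpus
enriched = ['date', 'speech', 'speech:ner']
plain = ['date', 'speech']

print(has_named_entities(enriched))
print(has_named_entities(plain))
```

Note that a suffix check of this kind matches speech:ner but not the keyword fields, whose names end with :ner-kw.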
Moreover, an enriched corpus should contain the following keyword fields:
person:ner-kw
location:ner-kw
organization:ner-kw
miscellaneous:ner-kw
These fields are intended for searching and filtering, which is still to be implemented.
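Taken together, the field requirements above could be expressed as a set of mappings along these lines. This is a sketch only: the helper name ner_mappings is hypothetical, and it assumes the keyword fields use the standard Elasticsearch keyword type.

```python
# Entity types taken from the keyword field names listed above
ENTITY_TYPES = ['person', 'location', 'organization', 'miscellaneous']


def ner_mappings(content_field):
    """Build field mappings for a corpus enriched with named entities.

    Hypothetical helper: returns one annotated_text field named after
    the content field, plus one keyword field per entity type.
    """
    mappings = {f'{content_field}:ner': {'type': 'annotated_text'}}
    for entity_type in ENTITY_TYPES:
        # Assumption: keyword fields use the standard 'keyword' type
        mappings[f'{entity_type}:ner-kw'] = {'type': 'keyword'}
    return mappings


for name, mapping in ner_mappings('speech').items():
    print(name, mapping)
```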
To enrich a corpus with named entities, we recommend using the TextMiNER library. The library reads the content of a specified field from an existing index and analyzes it with the BERT-based named entity recognition models provided by flair. It then adds the named entities to the annotated_text field and the keyword fields, as outlined above.
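To give an idea of what the enriched fields might contain, the sketch below converts entity spans into the [text](value) markup that the Annotated Text plugin expects, and collects the mentions per keyword field. It does not use TextMiNER or flair. The function annotate, the (start, end, label) span format, and the choice of the entity type as annotation value are all assumptions for illustration; TextMiNER's actual output may differ.

```python
from urllib.parse import quote_plus


def annotate(text, entities):
    """Build annotated text and keyword values from entity spans.

    Hypothetical helper. `entities` is a list of (start, end, label)
    tuples that do not overlap and are sorted by start offset.
    """
    parts = []
    keywords = {}
    cursor = 0
    for start, end, label in entities:
        mention = text[start:end]
        # Copy the plain text between the previous entity and this one
        parts.append(text[cursor:start])
        # Annotation values are URL-encoded in annotated_text markup
        parts.append(f'[{mention}]({quote_plus(label)})')
        keywords.setdefault(f'{label}:ner-kw', []).append(mention)
        cursor = end
    parts.append(text[cursor:])
    return ''.join(parts), keywords


text = 'Wilhelmina visited Utrecht.'
spans = [(0, 10, 'person'), (19, 26, 'location')]
annotated, keywords = annotate(text, spans)

print(annotated)  # [Wilhelmina](person) visited [Utrecht](location).
print(keywords)   # {'person:ner-kw': ['Wilhelmina'], 'location:ner-kw': ['Utrecht']}
```

The sketch does not escape square brackets or parentheses that occur in the source text itself, which a real implementation would need to handle.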