Antonio Rojas Castro edited this page Sep 3, 2019 · 8 revisions

Welcome to the Recogito-TEI wiki!

This page is meant to collect the main suggestions and decisions taken after discussing them on our Issues page.

Context

When exporting a .txt file as TEI, a select set of metadata fields is converted to header tags. This needs further documentation.

When exporting a .tei.xml (i.e. originally uploaded as TEI) as TEI, Recogito makes no change to whatever was in the original TEI header fields. Exporting the file will return the original header as it was.

When importing a .tei.xml file, Recogito will not parse the original header, nor pick that information up for the internal metadata (title, author, etc.). It potentially could do that (with a bit of extra development), and merge changes made in the Recogito UI back into the original TEI.

Importing TXT / Exporting TEI

Date

In Recogito, the date element is filled with the value of the (user-defined) "date" metadata field. One problem we found is that the date element was nested in a p element. Rainer raised the following questions in an issue: what kind(s) of date(s) do we want to include in the TEI (metadata file, time of upload to Recogito), and how do we best encode them? We agreed that the date should be part of the biblStruct when exporting a TEI file. Rainer implemented the change in the codebase, so it should go live with the next server update.
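To make the agreed placement concrete, here is a minimal Python sketch of a biblStruct that carries the date. The exact layout Recogito emits is not documented on this page, so the monogr/imprint nesting, the function name `build_bibl_struct`, and the sample values are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def build_bibl_struct(title, date):
    """Build a biblStruct whose date lives inside monogr/imprint.

    Hypothetical helper: the nesting is one common TEI pattern,
    not necessarily what the Recogito export produces.
    """
    bibl = ET.Element("biblStruct")
    monogr = ET.SubElement(bibl, "monogr")
    ET.SubElement(monogr, "title").text = title
    imprint = ET.SubElement(monogr, "imprint")
    # The user-defined "date" metadata value goes here, not inside a p element.
    ET.SubElement(imprint, "date").text = date
    return bibl


if __name__ == "__main__":
    print(ET.tostring(build_bibl_struct("My text", "1850"), encoding="unicode"))
```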

Paragraphs

The team suggested that it would be useful to divide the body into p elements automatically. The best way to achieve this would be to insert paragraph tags whenever the parser finds one or more newline characters after a full stop. Rainer opened a new ticket in the Pelagios repository.
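The splitting rule described above can be sketched in Python as follows. This is only an illustration of the rule, not the Recogito implementation; the function names `split_into_paragraphs` and `to_tei_paragraphs` are hypothetical, and collapsing the remaining line breaks into spaces is an added assumption.

```python
import re
from xml.sax.saxutils import escape


def split_into_paragraphs(text):
    """Split plain text wherever a full stop is followed by one or more newlines."""
    # The lookbehind keeps the full stop; the newline run itself is dropped.
    chunks = re.split(r"(?<=\.)[ \t]*(?:\r?\n)+", text)
    # Assumption: line breaks that do not follow a full stop are soft wraps,
    # so they are collapsed into single spaces.
    return [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]


def to_tei_paragraphs(text):
    """Wrap each paragraph in a p element, escaping XML special characters."""
    return "\n".join(f"<p>{escape(p)}</p>" for p in split_into_paragraphs(text))


if __name__ == "__main__":
    sample = "First sentence.\nSecond line\ncontinues here.\n\nThird."
    print(to_tei_paragraphs(sample))
```

A newline that does not follow a full stop (as in "Second line") does not start a new paragraph under this rule, which keeps hard-wrapped source text together.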

Notes

Recogito stores some information (e.g. responsibility) in note elements. These are placed outside the elements they refer to, as following siblings. However, it might be useful to add a @target attribute pointing to the corresponding @xml:id to make it easier to identify the reference. Rainer is considering this change.
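As a rough sketch of the proposed change, the Python snippet below adds a target attribute to every note that follows an element carrying an xml:id. The input markup is a simplified, namespace-free stand-in for a Recogito export, and the function name `add_note_targets` is hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def add_note_targets(root):
    """Point each note at the xml:id of the nearest preceding non-note sibling."""
    for parent in root.iter():
        last_id = None
        for child in parent:
            if child.tag == "note":
                # Several notes in a row all refer to the same annotated element.
                if last_id is not None and "target" not in child.attrib:
                    child.set("target", f"#{last_id}")
            else:
                last_id = child.get(XML_ID)
    return root


if __name__ == "__main__":
    # Hypothetical sample; real exports also carry the TEI namespace.
    sample = '<p>In <placeName xml:id="a1">Rome</placeName><note>checked</note>.</p>'
    print(ET.tostring(add_note_targets(ET.fromstring(sample)), encoding="unicode"))
```

Notes whose preceding sibling has no xml:id are left untouched, since there is nothing to point at.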

Places and people

When places and people's names are annotated, Recogito represents this information with placeName and persName elements and automatically generates an @xml:id for each. Place names are linked to an external reference with the @ref attribute. The only way to identify two places as the same is through this attribute.

We discussed whether we can safely assume in general that the links are meant to connect the entities (@ref) rather than the annotated instances (@xml:id). We concluded that it would be good to create listPlace and listPerson elements. This solution would avoid repeating the information in the attributes each time the same entity appears in the text. We could use just the @ref attribute to point to the @xml:id of the element in the list, and so identify properly that two or more occurrences in the text refer to the same entity. This change, however, needs to be discussed by Pelagios and may become a future deployment.
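The listPlace idea can be sketched as follows: occurrences that share an external @ref are grouped into one place entry, and each occurrence is rewritten to point at that entry's xml:id. Everything specific here is an assumption made for illustration: the function name `build_list_place`, the `place1`, `place2` id scheme, the use of idno to hold the external URI, and the namespace-free sample markup. The same approach would apply to persName and listPerson.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def build_list_place(root):
    """Collect one place per distinct external @ref and relink the occurrences."""
    list_place = ET.Element("listPlace")
    ids_by_uri = {}
    for occurrence in list(root.iter("placeName")):
        uri = occurrence.get("ref")
        if uri is None or uri.startswith("#"):
            continue  # unlinked, or already pointing at a local entry
        if uri not in ids_by_uri:
            ids_by_uri[uri] = f"place{len(ids_by_uri) + 1}"  # hypothetical id scheme
            place = ET.SubElement(list_place, "place", {XML_ID: ids_by_uri[uri]})
            # The first spelling found is used as the entry's name.
            ET.SubElement(place, "placeName").text = "".join(occurrence.itertext())
            ET.SubElement(place, "idno", {"type": "URI"}).text = uri
        # The occurrence now refers to the list entry instead of the external URI.
        occurrence.set("ref", "#" + ids_by_uri[uri])
    return list_place


if __name__ == "__main__":
    sample = (
        '<body><p><placeName xml:id="a1" ref="http://example.org/places/1">Rome</placeName>'
        ' and <placeName xml:id="a3" ref="http://example.org/places/1">Roma</placeName>.</p></body>'
    )
    root = ET.fromstring(sample)
    print(ET.tostring(build_list_place(root), encoding="unicode"))
    print(ET.tostring(root, encoding="unicode"))
```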

Something similar happens with the tags provided by editors when annotating people. At present Recogito stores the values in an @ana attribute (for example, ana="Duke,Count,Nobleman,Military"). According to the TEI Guidelines, the value of this attribute should be a pointer that refers to a taxonomy. Rainer will consider implementing a taxonomy in the long term; in the meantime, using @type could be a solution in order to generate a valid TEI document.
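For the long-term taxonomy option, a possible conversion could look like the Python sketch below. It turns the comma-separated tag list into space-separated pointers and builds matching category entries. No taxonomy design has been agreed, so the `tag-` id prefix, the `recogito-tags` id, and all function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def tag_to_id(tag):
    """Derive an id from a tag label (ASCII-only slug, for simplicity)."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", tag.strip()).strip("-").lower()
    return f"tag-{slug}"


def ana_to_pointers(ana_value):
    """Convert 'Duke,Count' into '#tag-duke #tag-count'."""
    tags = [part.strip() for part in ana_value.split(",") if part.strip()]
    return " ".join(f"#{tag_to_id(tag)}" for tag in tags)


def build_taxonomy(tags):
    """Build a taxonomy with one category per distinct tag, in first-seen order."""
    taxonomy = ET.Element("taxonomy", {XML_ID: "recogito-tags"})
    for tag in dict.fromkeys(tags):
        category = ET.SubElement(taxonomy, "category", {XML_ID: tag_to_id(tag)})
        ET.SubElement(category, "catDesc").text = tag
    return taxonomy


if __name__ == "__main__":
    print(ana_to_pointers("Duke,Count,Nobleman,Military"))
    print(ET.tostring(build_taxonomy(["Duke", "Count"]), encoding="unicode"))
```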

Importing TEI / Exporting TEI
