@xml:lang #151
Comments
The more I think about it, the more I think it's a mistake to follow EAD3 on this one. I don't think that we should ignore https://www.w3.org/TR/xml-i18n-bp/, specifically this recommendation:
I've got the alpha schema set up to use the new attribute names, but I would also like to eventually create a branch of the schema that removes all of those attributes (aside from languagecode and scriptcode) and instead uses the "xml" namespace as intended. Although we could let EAD/S keep doing its own thing and ignore best practices, it seems like a bad idea not to make the standard more interoperable with other XML standards like TEI, DITA, DocBook, MODS, etc., all of which use xml:lang, as well as RDF and other data serializations that also seem to have settled on doing the same. Why make it more difficult to move between all of those and require a local mapping to do so (and lose out on built-in features in XPath, etc.)? Just my two cents 😄
I have to admit that I am still not convinced about the argument's strength to merit newly introducing Assuming that we did, a few additional thoughts:
Btw - just found this in the MODS user guide (https://www.loc.gov/standards/mods/userguide/attributes.html#lang):
citation starts
lang Example
xml:lang Example
citation ends
Assuming that we do not want to use both attributes next to each other and given that we've decided to open up the options of how languages could be encoded (i.e. not only IANA, but also the three variations of ISO 639 plus other language encodings), I'd be back at using an attribute of our own rather than going back to
The TEI guidelines provide a great overview here about how they encode languages: https://www.tei-c.org/release/doc/tei-p5-doc/en/html/CH.html#CHSH (which stresses the following: "For maximal compatibility with existing processes, the identifier for the language must be constructed as in Best Current Practice 47")
As time goes on, I grow more convinced that it's better to keep the "xml" namespace in EAC for id, base, lang, and adding space, since I don't really see the need for EAC/D to ignore that convention (and to make it more difficult to share data). In the two examples from MODS, the first won't work, for instance, if I want to use something like the built-in "lang" function from XPath (https://www.w3.org/TR/xpath-functions-31/#func-lang) to determine the language, whereas the second one does.
All that said, we've got languageOfElement and scriptOfElement in the development branch of EAC, which aligns it with the path taken by EAD.
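To illustrate the point about the XPath "lang" function, here is a minimal Python sketch that approximates its lookup: it walks from an element up through its ancestors and reads only the namespaced xml:lang attribute. The sample fragment, the helper names nearest_xml_lang and lang_matches, and the use of languageOfElement on a <p> are assumptions made for illustration, not taken from either schema. The standard library's ElementTree does not implement lang() itself, so the matching rule (case-insensitive, with sublanguage prefixes) is re-created by hand.

```python
import xml.etree.ElementTree as ET

# The "xml" prefix is always bound to this namespace; ElementTree exposes
# xml:lang under this expanded attribute name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment: one paragraph inherits, one overrides with xml:lang,
# one uses a custom (non-namespaced) attribute instead.
SAMPLE = """
<biogHist xml:lang="en-GB">
  <p>Born in London.</p>
  <p xml:lang="de">Geboren in London.</p>
  <p languageOfElement="fr">Né à Londres.</p>
</biogHist>
"""


def nearest_xml_lang(elem, parents):
    """Return the xml:lang value of the nearest ancestor-or-self, or None."""
    node = elem
    while node is not None:
        value = node.get(XML_LANG)
        if value is not None:
            return value
        node = parents.get(node)
    return None


def lang_matches(elem, parents, wanted):
    """Approximate XPath lang(): exact or sublanguage match, ignoring case."""
    value = nearest_xml_lang(elem, parents)
    if value is None:
        return False
    value, wanted = value.lower(), wanted.lower()
    return value == wanted or value.startswith(wanted + "-")


root = ET.fromstring(SAMPLE)
# ElementTree has no parent pointers, so build a child -> parent map once.
parents = {child: parent for parent in root.iter() for child in parent}
paragraphs = root.findall("p")

for p in paragraphs:
    print(p.text, "| xml:lang in scope:", nearest_xml_lang(p, parents),
          "| languageOfElement:", p.get("languageOfElement"))
```

In this sketch the third paragraph still resolves to the inherited "en-GB", because a generic xml:lang-aware lookup has no way of knowing that languageOfElement carries language information; a consumer would need a schema-specific mapping to pick it up.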
Just as a note: "the path taken by EAD" only means not having introduced the XML namespace when defining EAD3. :-) As for potentially going back on the decision with regard to XML namespace, this would mean:
Tested as part of Schema Team's schema testing:
The above applies to both schemas, RNG and XSD. |
@fordmadox , @kerstarno : Please keep the lang attributes as they are: not available in List will be completed |
@SJagodzinski thanks for the confirmation. With this, the attribute is ready. @fordmadox please take note of |
Recommendation of IETF language tags needs to be discussed, also with respect to feedback from the CfC. |
Asked community about use of IETF language tags in EAC-CPF team meeting, 8 Aug 2021: Agreed to recommend the use of IETF language tags in |
Language of Element
Replace xml:lang with optional attribute @languageOfElement with data type NMTOKEN. Use @languageOfElement in all non-empty elements.
Creator of issue
Related issues / documents
Remove xml ns to align with EAD 3 #27
@xml:lang: adopt EAD 3 solution #28
Language codes: adopt EAD 3 solution #29
@scriptCode: remove and adjust tag library for @xml:lang/@lang Attribute #30
EAD3 Reconciliation
Summary: Indicates the language of the content of an element. Content of the attribute should be a code taken from ISO 639-1, ISO 639-2b, ISO 639-3, or another controlled list, as specified in the langencoding attribute in . May be used consistently in a multi-lingual finding aid to specify which elements are written in which language. Available on all non-empty elements.
Data Type: NMTOKEN
Context
@xml:lang XML Language
Summary: Two-letter language code from the IANA registry as dictated by the W3C specification.
Description and Usage: The xml:lang may occur on any element intended to contain natural language content whenever information about the language of the content of this element and its children are needed. xml:lang should be used when the language of the element differs from the Language Code declared in the languageCode attribute on the element within the element. The values in the list are taken from the IANA Registry (http://www.iana.org/assignments/language-subtag-registry). The use of the IANA Registry code for languages in this context is outlined in the W3C specification. The syntax is specified at: http://www.w3.org/International/articles/language-tags/.
Data Type: IANA Registry for language codes.
Solution documentation: agreed solution for TL and guidelines
Summary: Indicates the language of the content of an element. Content of the attribute should be a code taken from ISO 639-1, ISO 639-2b, ISO 639-3, or another controlled list, as specified in the langencoding attribute in <control>. May be used consistently in a multi-lingual entities description to specify which elements are written in which language. Available on all non-empty elements.
Data Type: NMTOKEN
May occur within: <abstract>, <address>, <addressLine>, <agencyCode>, <agencyName>, <agent>, <alternativeSet>, <biogHist>, <chronItem>, <chronItemSet>, <chronList>, <citedRange>, <componentEntry>, <contact>, <contactLine>, <conventionDeclaration>, <date>, <dateRange>, <dateSet>, <description>, <descriptiveNote>, <event>, <eventDateTime>, <eventDescription>, <existDates>, <fromDate>, <function>, <functions>, <generalContext>, <geographicCoordinates>, <head>, <identityId>, <item>, <language>, <languageDeclaration>, <languageUsed>, <languagesUsed>, <legalStatus>, <legalStatuses>, <list>, <localControl>, <localDescription>, <localDescriptions>, <localTypeDeclaration>, <maintenanceAgency>, <maintenanceEvent>, <maintenanceHistory>, <mandate>, <mandates>, <nameEntry>, <nameEntrySet>, <occupation>, <occupations>, <otherAgencyCode>, <otherEntityType>, <otherEntityTypes>, <otherRecordId>, <p>, <part>, <place>, <placeName>, <placeRole>, <places>, <recordId>, <reference>, <relation>, <relationType>, <representation>, <rightsDeclaration>, <setComponent>, <shortCode>, <source>, <sources>, <span>, <structureOrGenealogy>, <targetEntity>, <targetRole>, <term>, <toDate>, <useDates>, <writingSystem>
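As a rough illustration of what the NMTOKEN data type does and does not constrain for @languageOfElement, the following Python sketch checks a few candidate values against an ASCII-only approximation of the NMTOKEN production (one or more name characters, no whitespace). The pattern, the function name is_nmtoken, and the candidate values are assumptions for illustration; the real XML definition also admits many non-ASCII name characters, and NMTOKEN by itself does not verify that a value is a registered ISO 639 or IETF code.

```python
import re

# ASCII-only approximation of XML NMTOKEN: letters, digits, ".", "_", ":", "-".
# Assumption: the full production also allows many non-ASCII name characters.
NMTOKEN_ASCII = re.compile(r"[A-Za-z0-9._:\-]+")


def is_nmtoken(value: str) -> bool:
    """Return True if the whole value is a non-empty run of name characters."""
    return NMTOKEN_ASCII.fullmatch(value) is not None


# Hypothetical attribute values: ISO 639-1, ISO 639-2, IETF-style tags,
# and two values a validator should reject.
candidates = ["en", "eng", "en-GB", "zh-Hant", "en GB", ""]
results = {value: is_nmtoken(value) for value in candidates}

for value, ok in results.items():
    print(repr(value), "->", "valid NMTOKEN" if ok else "not an NMTOKEN")
```

Both two-letter ISO codes and hyphenated IETF-style tags pass this check, which is consistent with the data type staying agnostic about which controlled list is declared for the record.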
Example encoding