Add IDL for LanguageMap #88

Open
aphillips opened this issue Sep 30, 2024 · 6 comments

@aphillips
Contributor

Called out by Webauthn in w3c/webauthn#2151

@aphillips aphillips self-assigned this Sep 30, 2024
aphillips added a commit to aphillips/string-meta that referenced this issue Sep 30, 2024
Add definitions for all of the part of LanguageMap
in addition to clean-up of Localizable.
@emlun
Member

emlun commented Oct 1, 2024

Commenting on: aphillips@9624fbe

I'm quite confused, because the language map examples do not seem consistent with each other or with the LanguageMap definition, and the LanguageMap definition does not seem consistent with the prose descriptions:

  • §2.1.4 Language Maps lists Example 4:

    "field-name-goes-here": {
        "en":    {"value": "This is English"},
        "en-GB": {"value": "This is UK English", "dir": "ltr"},
        "fr":    {"value": "C'est français", "lang": "fr-CA", "dir": "ltr"},
        "ar":    {"value": "هذه عربية", "dir": "rtl"}
    }

    Here, it looks like a language map is a simple map with language tags as keys and Localizable-like (but where lang and dir are optional) values. In particular, this example is valid JSON, so presumably this is what a JSON representation of a language map should look like.

  • §6. Localization Considerations defines language indexing and lists Example 14:

    One approach a specification might provide for returning multiple languages of a given field is called language indexing. In language indexing, a given field's value is an array of key-value pairs. [...]

    Example 14

    "title": [ "en": { "value": "Learning Web Design", "lang": "en" },
             "ar": { "value": "التعلم على شبكة الإنترنت التصميم", "lang": "ar",  "dir": "rtl"}, 
             "ja": { "value": "Webデザインを学ぶ", "lang": "ja" },
             "zh-Hans": { "value": "学习网页设计", "lang": "zh-Hans", "dir": "ltr"} ],

    This example is not valid JSON, so one would assume this is rather an abstract example of a sequence of key-value pairs. This is arguably compatible with Example 4 on that abstract level, but it's unclear how any particular serialization of this should look. For JSON the example suggests an array, but not how pairs are represented (Pair arrays: ["en", {"value": ...}]? Two-attribute objects: {"key": "en", "value": { "value": ... }}?).

    Examples 15 and 16 are (almost) valid JSON, but clearly incompatible with Example 4:

    Example 15

    "title": [ {
     "de": {"value": "HTML und CSS verstehen", "language": "de-DE" },
     ...
    ],

    Still, it's ambiguous whether each object in the array should have exactly one key or may contain more than one key. Either way I don't understand what would be the benefit of wrapping these objects in an array rather than merging the objects, assuming each key is unique among all objects in the array (and if it's not, what would multiple occurrences of a language tag key mean? How should an application use them?). Is the definition order significant in some way?

  • Finally, A.2 LanguageMap dictionary (not yet published) defines LanguageMap as:

    dictionary LanguageMap {
            DOMString field;
            sequence<LanguageRecord> languageRecord;
    };

    This is unambiguous, but it doesn't agree with the structure in Example 4, and adds an additional object layer around the sequence described in Example 14.

    I also don't understand what is meant by the field member:

    field member
    Identifier for the field containing the Language Map

    Does this mean that field should be set to "languageRecord"? Or that the languageRecord member can be renamed, and field identifies its new name? Or is it the name of the field being localized, i.e., { "some-localizable-string": { "field": "some-localizable-string", "languageRecord": [...] } }? I don't understand the purpose of any of those options, so is it something else entirely?
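
To make the validity observations above concrete, here is a minimal Python sketch. It checks that the Example 4 shape parses as JSON while the Example 14 shape does not, and then shows that the two candidate pair representations mentioned above (pair arrays and two-attribute objects) would carry the same information as the map. The helper names and the "key"/"value" attribute names are hypothetical, chosen only for illustration; nothing here is defined by the document under review.

```python
import json

# Example 4 shape: a plain JSON object keyed by language tag.
example_4 = '''
{
    "en":    {"value": "This is English"},
    "en-GB": {"value": "This is UK English", "dir": "ltr"},
    "fr":    {"value": "C'est français", "lang": "fr-CA", "dir": "ltr"},
    "ar":    {"value": "هذه عربية", "dir": "rtl"}
}
'''

# Example 14 shape (abridged): "key": value pairs written directly inside an array.
example_14 = '''
[ "en": { "value": "Learning Web Design", "lang": "en" },
  "ja": { "value": "Webデザインを学ぶ", "lang": "ja" } ]
'''


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Two hypothetical JSON encodings of "a sequence of key-value pairs".
pair_arrays = [
    ["en", {"value": "Learning Web Design", "lang": "en"}],
    ["ja", {"value": "Webデザインを学ぶ", "lang": "ja"}],
]
two_attribute_objects = [
    {"key": "en", "value": {"value": "Learning Web Design", "lang": "en"}},
    {"key": "ja", "value": {"value": "Webデザインを学ぶ", "lang": "ja"}},
]


def from_pair_arrays(pairs):
    """Collapse [tag, entry] pairs into a map keyed by language tag."""
    return {tag: entry for tag, entry in pairs}


def from_two_attribute_objects(items):
    """Collapse {"key": tag, "value": entry} objects into the same map."""
    return {item["key"]: item["value"] for item in items}


print(is_valid_json(example_4), is_valid_json(example_14))
```

Both conversions end in the Example 4 shape, which suggests that, assuming keys are unique, the sequence encodings add a layer without adding information.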

Could you help me understand how these definitions and examples are meant to relate?

As I interpret the intent of the prose descriptions, Example 4 matches what I would expect. I think the map syntax best expresses the intent of a collection of key-value pairs, and is easiest to work with as a developer. It doesn't seem useful to use an explicit sequence structure for implementation efficiency, if that is the concern. Maps and sequences most likely take the same time to parse or search anyway: in JSON, CBOR and XML a plain linear search is needed since the items don't describe their serialization length, while in ASN.1 DER the parser can "skip ahead" in a map just as easily as in a sequence. So without knowledge of any other concerns that went into this design, my expectation for a LanguageMap IDL definition would simply be a record type with language tags as keys and LanguageEntry values:

typedef record<DOMString, LanguageEntry> LanguageMap;

// Or alternatively:
typedef DOMString LanguageTag;
typedef record<LanguageTag, LanguageEntry> LanguageMap;

Either of these would neatly match Example 4 and be easy to work with as a developer.
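
As a rough illustration of what the record shape looks like to a consumer, here is a Python sketch. The `LanguageEntry` member set is an assumption taken from the keys used in Example 4 (`value`, `lang`, `dir`), and `merge_single_key_objects` is a hypothetical helper showing the extra work, and the duplicate-key decision, that an Example 15 style array of wrapper objects would push onto the application.

```python
from typing import TypedDict


class LanguageEntry(TypedDict, total=False):
    # Assumed member set, based only on the keys that appear in Example 4.
    value: str
    lang: str
    dir: str


# Python counterpart of: typedef record<DOMString, LanguageEntry> LanguageMap;
LanguageMap = dict[str, LanguageEntry]


def merge_single_key_objects(items: list[dict[str, LanguageEntry]]) -> LanguageMap:
    """Flatten an array of wrapper objects into one map.

    Rejecting duplicate tags is one possible policy, not one the
    document under review specifies.
    """
    merged: LanguageMap = {}
    for item in items:
        for tag, entry in item.items():
            if tag in merged:
                raise ValueError(f"duplicate language tag: {tag}")
            merged[tag] = entry
    return merged


# With the record shape, lookup is a plain key access.
titles: LanguageMap = {
    "en": {"value": "This is English"},
    "en-GB": {"value": "This is UK English", "dir": "ltr"},
}
print(titles["en-GB"]["value"])
```

With the record shape, uniqueness of language tags comes for free from the container, so the question of what a repeated tag means never arises.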

@aphillips
Contributor Author

@emlun Thanks for the comments. The IDL for LanguageMap is incorrect. It should be a record as noted.

Example 14 (and nearby friends) is definitely broken, wrt being valid JSON. I'll fix that also while making the necessary changes.

@emlun
Member

emlun commented Oct 2, 2024

Thanks @aphillips! The current design looks good to me.

I spotted a few more minor issues:

@aphillips
Contributor Author

> This still uses DOMString as the key type in the record definition, but references LanguageTag in the prose description.

It's a bug in Respec. Respec reports an error because LanguageTag is not DOMString. I am considering ignoring the error.

> LanguageRecord no longer exists.
> Typo: instead of in LanguageMap

Fixed the first, replaced the kbd tags with Respec IDL markup {{LanguageMap}}

@emlun Do you not have review permission on the PR?

@emlun
Member

emlun commented Oct 2, 2024

Oh! I didn't realize there was a PR. All I'd seen was the link to this issue from w3c/webauthn#2151, and #89 hasn't shown up in the activity feed in this thread. I'll post in the PR if I find anything else, but I think I'm done with my review for now.

@aphillips
Contributor Author

P.S. I added you to the acknowledgements. Appreciate the help!
