Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KDL Schema spec clarifications/typos #315

Open
Gankra opened this issue Nov 6, 2022 · 6 comments
Open

KDL Schema spec clarifications/typos #315

Gankra opened this issue Nov 6, 2022 · 6 comments
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@Gankra
Copy link

Gankra commented Nov 6, 2022

I've been working through implementing a KDL Schema Validator, and I've run into a few questions. I will probably find more over time but this is what I have so far.

TYPOS

  1. ✅ "one info node for that describes the schema itself."
  2. ✅ there is a format "irl" (iri?)
  3. ✅ there is a format "irl-reference" (iri-reference?)

ERRORS / QUESTIONS

  1. ✅ does "zero or more..." imply optionality? Yes?
  2. ⁉️ node.children.tag: suggests it's mandatory and multiple, but schema-schema says 0 or 1?
  3. ❓ prop.children.required: fails to specify default value (true?)
  4. ❌ can enum validations be an empty list? (implying nothing matches?)
    1. ✅ I will make the docs clarify that they must have values
  5. ❓ what are the possible values for type validations? (are these core types like "string" or rich types like "(i32)"?
  6. ✅ the node node has "zero or more children" but the children node is itself a list of children..?
    1. ✅ should just be "optional"
    2. ❓ the "defintions" node should be able to have multiple children though?
  7. % is... problematic with floating point. There are ways to do it, the rust json-schema crate handles it, but want to confirm this pandora's box is intentional.
    1. ✅ zkat proposes dropping multiple to avoid the question
@zkat
Copy link
Member

zkat commented Nov 6, 2022

  • Yes you're correct about the typos
  • 0+ means optional, but potentially many, yes.
  • uhhhh... idk and I'm confused and something doesn't seem right here
  • the spec should probably be more specific about this, but schema-schema says "1+" and I think that's the right thing to do
  • I think this was meant to be the core types? Sorry, it's been a while. What should it be?
  • lol yes, that should probably be "optional"
  • you're right. idk what to do about it. We could just scrap it for now because of its implications with floats? Since KDL doesn't really distinguish between float/non-float as far as numbers go.

@Gankra
Copy link
Author

Gankra commented Nov 6, 2022

Dope, I'll file a PR for some of these when I have better clarity on them. Some more issues:

ISSUES

  1. ❓ schema-schema: creative commons link missing its "rel" (documentation?)
  2. ❓ node.tag: "Validations to apply to the tag of the node." -- shouldn't this be a tag node? (schema-schema says so)

I AM CONFUSED

  1. tag node: "The tag describes the tags allowed in a children block or toplevel document."
    1. children node doesn't use this
    2. ❓ why does this exist / is this different from the children node? or the node node?
    3. ok wait in the schema-schema children does have tag by virtue of inheriting all of document's nodes (with ref)
      1. but that also that grabs a pile of random crap it shouldn't (at very least info)
      2. ❓ was info and definitions supposed to be not-under-the-document-node so that this kind of recursive definition would work?
      3. ❓ or was document supposed to have children?
      4. ❓ actually maybe this inheritance works if it's inverted. document ref's in the contents of children?
  2. ❓ what is the meaning/purpose of rel in a url?
    1. ❓ self?
    2. ❓ documentation?

@Gankra
Copy link
Author

Gankra commented Nov 6, 2022

On ref and id:

  1. ❓ completely overlays the matched node onto this node
    1. ❓ ...including id? that seems... wrong.
  2. ❓ replaces conflicts
    1. ❓ does this only apply to positional values?
    2. ❓ kdl technically supports conflicting copies of key=val, but in a way that is basically an overlay so it's moot..?
    3. ❓ you can always have multiple copies of a node with a given name, so unclear how conflicts could be defined
      1. ❓ I guess meaningful for nodes that are "at most 1"?
  3. ❓ is the behaviour of ref consistent across all node types?
    1. ❓ wording varies, unclear if the net effect is always the same
  4. ❓ should it be required that the source and target are the same kind of node?
    1. ❓ i.e. should values only be allowed to import values?
    2. currently I guess not, schema-schema puns document and children
  5. ❓ what should happen if there are multiple matches?
    1. ❓ error? (yes please)
    2. ❓ merge? (god no)
  6. ❓ are ids really global, or per-node-kind?
    1. ❓ what happens if ids aren't globally unique? error?
    2. ❓ this can't be expressed in the schema-schema, should it?
  7. ❓ ref is defined to be an arbitrary KQL string but it's unclear if there's any practical application of anything but [id="blah"]
    1. not needing to support all of KQL as a subsystem would be... much easier

@Gankra
Copy link
Author

Gankra commented Nov 6, 2022

updated my formatting a bit so it's easier to refer to specific points and I can keep track of where my understanding is

@Gankra
Copy link
Author

Gankra commented Nov 6, 2022

On Schema Referencing

(This is future-facing stuff, not necessarily problems in the current spec)

  1. ❓ There is currently no way for a schema to specify what schema-schema it follows

    1. ❓ or equivalently, what version of the KDL Schema Spec it follows
    2. ❓i think there should be a top-level kdl-schema node to this effect (see point 2 below for why not in info)
    3. ❓ json-schema has a "$schema": "https://..." key for this purpose
      1. Using a URL to specify the version is interesting... mixed on it.
      2. In principle it allows a modicum of forward-compat... except a new schema-schema would presumably be self-referential
      3. If you did specify a schema-schema in the syntax of an older schema-schema, you...
        1. would be able to validate schemas using the new schema-schema for correctness (assuming all constraints of interested could be encoded in the schema-schema, which is not the case with things like id uniqueness)
        2. wouldn't be able to use those schemas because we still wouldn't understand the semantics
        3. could assume the schema is essentially the same but just includes modular things like new validations and proceed while ignoring things we don't understand (more on this in a subsequent comment)
  2. ❓Microsoft has a non-standard extension/convention that any JSON that has a top-level "$schema": "https://..." key will be validated against that schema

    1. love this, means weirdo formats can immediately get IDE support if the KDL Language Server supports this
    2. it also supports querying https://www.schemastore.org/json/ based on file name or something? works well for things like package.json or Cargo.toml

@Gankra
Copy link
Author

Gankra commented Nov 6, 2022

On forward-compat / extensibility

(This is future-facing stuff, not necessarily problems in the current spec)

Forward-compat and extensibility are two sides of the same coin: to what extent can things be designed to support New Or Non-Standard Features. However all the defaults of the KDL Schema seem tuned heavily against this. In particular (unlike json-schema) everything defaults to "only what I've mentioned is allowed". I get the inclination but it's pretty hazardous to the goals of this section.

In my prototype I've identified a few places where extensibility could be supported:

  1. If a node has "all validations" in its children set I parse all unknown node kinds as Validation::Unknown(KdlNode) and assume I will provide some kind of hook for custom validators
  2. similarly there is Validation::UnknownFormat(String) for format "???"
  3. I don't currently do this for link rel="???", maybe I should

Features that would be useful for forward-compat:

  1. default values for optional things
    1. makes the schema more like documentation
    2. potentially allows applications to apply the defaults
    3. potentially allows future schemas with more complex rules to "bootstrap" semantics with actual values
  2. ability to declare an enum as open/non-exhaustive
    1. arguably useless in a vacuum (would negate the entire function), but if enum values could include descriptions this would again gain more documentation (IDE tooltips!)
    2. hmm or at this point should you just be using the proper values node? wait no each values entry is just specifying min/max len and then deferring to validations...

@zkat zkat added bug Something isn't working help wanted Extra attention is needed labels Nov 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants