Consider more specialized support for tagged union types #104

Open
popematt opened this issue Dec 22, 2022 · 0 comments
Labels
enhancement New feature or enhancement for the Ion Schema _specification_ requires new version Something that should be considered for next version of the Ion Schema Specification


Ion Schema generally has support for union types by using the any_of and one_of constraints. Right now, a tagged union (where the tag is the distinguishing factor) can be modeled like this:

type::{
  name: a_or_b_or_c,
  one_of: [a, b, c],
}

type::{
  name: a,
  type: struct,
  annotations: closed::required::[a],
  // ...
}

type::{
  name: b,
  type: struct,
  annotations: closed::required::[b],
  // ...
}

type::{
  name: c,
  type: struct,
  annotations: closed::required::[c],
  // ...
}

When a value doesn't match any variant of the tagged union, the validation error reporting is excessive and hard to read. The violations will describe exactly how the value didn't match every single type in the union. In addition, the Ion Schema implementation must check the value against every possible type in the list, which can become expensive when there are a lot of types. The current syntax functions essentially like this (pseudocode):

let violations = [];
types.forEach(it -> {
    // Every type in the list is checked, whether or not its tag matches.
    let result = it.validate(value);
    if (!result.isValid()) {
        violations.add(result);
    }
});
return violations;
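To make the cost of this concrete, here is a minimal Python sketch of the exhaustive strategy. `Value`, `Variant`, `struct_tagged`, and `validate_one_of` are hypothetical names made up for illustration, not part of any Ion Schema implementation, and a value is modeled simply as a payload plus a list of annotations.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Value:
    """Hypothetical stand-in for an Ion value: a payload plus annotations."""
    payload: object
    annotations: List[str] = field(default_factory=list)


@dataclass
class Variant:
    """Hypothetical stand-in for a named schema type."""
    name: str
    check: Callable[[Value], List[str]]  # returns violation messages


def struct_tagged(tag: str) -> Variant:
    """Variant requiring a struct (dict) annotated with exactly [tag]."""
    def check(value: Value) -> List[str]:
        problems = []
        if value.annotations != [tag]:
            problems.append(f"{tag}: annotations must be exactly [{tag}]")
        if not isinstance(value.payload, dict):
            problems.append(f"{tag}: expected a struct")
        return problems
    return Variant(tag, check)


def validate_one_of(value: Value, variants: List[Variant]) -> List[str]:
    """Exhaustive one_of: every variant is checked and every failure reported."""
    violations: List[str] = []
    matches = 0
    for variant in variants:
        problems = variant.check(value)
        if problems:
            violations.extend(problems)
        else:
            matches += 1
    # one_of is satisfied only when exactly one variant matches.
    if matches == 1:
        return []
    if matches > 1:
        return [f"value matched {matches} types, expected exactly one"]
    return violations


variants = [struct_tagged(t) for t in ("a", "b", "c")]
bad = Value(payload="not a struct", annotations=["a"])
report = validate_one_of(bad, variants)
```

Even though the value is clearly meant to be an `a`, `report` also contains the failures for `b` and `c`, which is the noise described above.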

With a tagged union, we could be smarter about it. If we could identify which of the variants needs to be matched based on the tag, then we could test only that variant and report violations for only that variant. In addition, we would only have to check for n possible annotations and then validate against a single type, which is cheaper than validating against all of the types. Essentially it would be something like this (pseudocode):

return when {
    value.hasAnnotation("a") -> types["a"].validate(value),
    value.hasAnnotation("b") -> types["b"].validate(value),
    value.hasAnnotation("c") -> types["c"].validate(value),
};
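The same dispatch can be sketched in runnable Python. This is only an illustration under stated assumptions: `validate_tagged`, `expect_struct`, and the `(annotations, payload)` value model are hypothetical, and treating zero or multiple recognized tags as an error is an assumption, since the issue leaves that behavior open.

```python
from typing import Callable, Dict, List

# A validator takes (annotations, payload) and returns violation messages.
Validator = Callable[[List[str], object], List[str]]


def expect_struct(annotations: List[str], payload: object) -> List[str]:
    """Hypothetical variant body: the payload must be a struct (dict)."""
    return [] if isinstance(payload, dict) else ["expected a struct"]


def validate_tagged(
    annotations: List[str], payload: object, variants: Dict[str, Validator]
) -> List[str]:
    """Pick the variant named by the tag, then validate against it alone."""
    tags = [a for a in annotations if a in variants]
    if len(tags) != 1:
        # Assumption: zero or several recognized tags is reported as an error.
        return [f"expected exactly one tag from {sorted(variants)}, found {tags}"]
    tag = tags[0]
    return [f"{tag}: {msg}" for msg in variants[tag](annotations, payload)]


variants = {"a": expect_struct, "b": expect_struct, "c": expect_struct}
report = validate_tagged(["a"], "not a struct", variants)
```

Here `report` only mentions variant `a`, and the work done is one tag lookup plus one validation, regardless of how many variants the union has.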

Here's a straw-man syntax proposal:

We could introduce a tagged modifier to the logic constraints that take a list of types. When using tagged, instead of a list, the constraint accepts a struct.

type::{
  name: a_or_b_or_c,
  one_of: tagged::{
    a: {
      type: struct,
      // ...
    },
    b: {
      type: struct,
      // ...
    },
    c: {
      type: struct,
      // ...
    },
  }
}

Another alternative would be to keep the list and wrap each type in a sexp annotated with its tag. This has the benefit of making it easy to specify what should happen for an untagged value. For example:

// Assuming we already have definitions for markdown_document and pdf_document
type::{
  name: resume,
  one_of: tagged::[
    pdf::( $null_or::pdf_document ), // Because of some arcane business requirement, pdfs can be null
    txt::( { type: string, utf8_byte_length: range::[0, 10000] } ),
    md::( markdown_document ),
    ( $null )
  ]
}

In this example, if the value is annotated with pdf, it must be null or a value that matches the pdf_document type. If the value is annotated with txt, it must be a string that is no more than 10000 bytes when encoded as UTF-8. If the value is annotated with md, it must be a valid markdown_document. If the value is un-annotated, it must be null.
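The rules for the resume example can be written out as a small Python sketch. Everything here is hypothetical: `validate_resume` is an illustrative name, `None` stands in for Ion null, and `is_pdf_document` and `is_markdown_document` are placeholders because the issue does not define those types. The sketch also assumes that a value carrying none of the listed tags takes the untagged branch and that more than one listed tag is rejected, neither of which the issue settles.

```python
from typing import List


def is_pdf_document(payload: object) -> bool:
    # Placeholder for the pdf_document type, which the issue does not define.
    return isinstance(payload, bytes) and payload.startswith(b"%PDF")


def is_markdown_document(payload: object) -> bool:
    # Placeholder for the markdown_document type, which the issue does not define.
    return isinstance(payload, str)


def validate_resume(annotations: List[str], payload: object) -> List[str]:
    """Return violations for the resume example; None models Ion null."""
    known = [a for a in annotations if a in ("pdf", "txt", "md")]
    if len(known) > 1:
        # Assumption: more than one listed tag is rejected.
        return [f"more than one tag present: {known}"]
    if not known:
        # Untagged branch: ( $null )
        return [] if payload is None else ["untagged value must be null"]
    tag = known[0]
    if tag == "pdf":
        ok = payload is None or is_pdf_document(payload)
        return [] if ok else ["pdf: must be null or a pdf_document"]
    if tag == "txt":
        if not isinstance(payload, str):
            return ["txt: must be a string"]
        size = len(payload.encode("utf-8"))
        return [] if size <= 10000 else [f"txt: {size} UTF-8 bytes exceeds 10000"]
    ok = is_markdown_document(payload)
    return [] if ok else ["md: must be a markdown_document"]
```

Note that the txt limit is measured in encoded bytes rather than characters, so a string of 5001 two-byte characters is already too long.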

This GitHub issue is far from comprehensive. Open questions include things such as:

  • What if a value is tagged with more than one of the listed annotations? (Presumably one_of would reject it as invalid, but maybe that would depend on whether it also matches the tagged types? E.g. what if a value was annotated with a and b, but it was valid for a_type and invalid for b_type?)
  • Should this apply to all of the sum-type (one_of, any_of) and product-type (all_of) constraints, or should it only apply to the sum-type constraints? Or should it only apply to one_of?
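
To make the first question concrete, here is a hedged Python sketch of two candidate policies for a value carrying more than one listed tag. Neither is proposed by the issue; `strict_policy`, `lenient_policy`, and the toy variant checks are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List

Check = Callable[[object], bool]


def strict_policy(
    annotations: List[str], payload: object, variants: Dict[str, Check]
) -> bool:
    """Reject outright unless exactly one listed tag is present."""
    tags = [a for a in annotations if a in variants]
    return len(tags) == 1 and variants[tags[0]](payload)


def lenient_policy(
    annotations: List[str], payload: object, variants: Dict[str, Check]
) -> bool:
    """Accept when exactly one of the tagged variants validates."""
    tags = [a for a in annotations if a in variants]
    return sum(1 for t in tags if variants[t](payload)) == 1


# Toy variants: a accepts structs (dict), b accepts lists.
variants: Dict[str, Check] = {
    "a": lambda p: isinstance(p, dict),
    "b": lambda p: isinstance(p, list),
}

# A struct annotated with both a and b: valid for a, invalid for b.
strict_result = strict_policy(["a", "b"], {}, variants)
lenient_result = lenient_policy(["a", "b"], {}, variants)
```

The two policies disagree on exactly the case raised in the question: the strict one rejects the doubly tagged value, while the lenient one accepts it because only the `a` variant validates.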
@popematt popematt added the enhancement New feature or enhancement for the Ion Schema _specification_ label Dec 22, 2022
@popematt popematt added the requires new version Something that should be considered for next version of the Ion Schema Specification label May 9, 2023