More schema migration prose.
I think this is as much as I can write about this for now.

Regarding that last bit about not having total migration magic:
it'd certainly be neat to offer more auto-migration tools, based
perhaps on a "patch"ing approach as outlined in
ipld/js-ipld#66 (comment),
or on generalized recursion schemes, or a combination.
However... that's a tad downstream of the present ;)

Signed-off-by: Eric Myhre <[email protected]>
warpfork committed Feb 12, 2019
1 parent deeeacc commit 94cf517
Showing 1 changed file with 33 additions and 0 deletions.
33 changes: 33 additions & 0 deletions doc/schema.md
@@ -244,3 +244,36 @@ other forms of versioning; it's essentially the same as using explicit labels.

### Actually Migrating!

... Okay, this was a little bit of bait-and-switch.
IPLD Schemas aren't completely magic.

Some part of migration is inevitably left up to application logic.
Almost by definition, "a process to map data into the format of data we want"
is, at its most general, going to be a Turing-complete operation.

However, IPLD can still help: the relationship between the Data Model and
the Schema provides a foundation for writing maintainable migrations.

Any migration logic can be expressed as a function from `Node` to `Node`.
Each of these nodes may be checked for Schema validity -- against two
different schemas! -- but the code that transposes data from one node to
the other can operate entirely within the Data Model. The result is the
ability to write code that effectively handles multiple disjoint type
systems... without any real issues.
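As a rough illustration of the shape of such a function, here's a minimal
sketch. The `Node` interface, the `mapNode`/`strNode` toy implementations,
and the `migrateV1toV2` function are all hypothetical simplifications
invented for this example -- the real Node interface has many more methods:

```go
package main

import "fmt"

// Node is a minimal stand-in for the IPLD Data Model node interface
// (hypothetical simplification for illustration only).
type Node interface {
	Lookup(key string) (Node, bool)
}

// mapNode is a toy map-kind Node backed by a plain map.
type mapNode map[string]Node

func (m mapNode) Lookup(key string) (Node, bool) { v, ok := m[key]; return v, ok }

// strNode is a toy leaf Node holding a string.
type strNode string

func (s strNode) Lookup(string) (Node, bool) { return nil, false }

// migrateV1toV2 is a migration expressed purely in Data Model terms:
// it reads fields from the old layout and builds the new layout,
// renaming "name" to "title" along the way.
func migrateV1toV2(old Node) Node {
	out := mapNode{}
	if v, ok := old.Lookup("name"); ok {
		out["title"] = v
	}
	return out
}

func main() {
	v1 := mapNode{"name": strNode("example")}
	v2 := migrateV1toV2(v1)
	title, _ := v2.Lookup("title")
	fmt.Println(title) // prints "example"
}
```

Note that the migration body never mentions either schema; it only walks and
builds Data Model nodes, which is what keeps it easy to maintain.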

Thus, a valid strategy for long-lived application design is to handle each
major change to a schema by copying/forking the current one; keeping it
around for use as a recognizer for old versions of data; and writing a
quick function that can flip data from the old schema format to the new one.
When parsing data, try the newer schema first; if it's rejected, try the old
one, and apply the migration function as necessary.
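That try-new-first, fall-back-and-migrate flow can be sketched as follows.
Everything here is hypothetical scaffolding: the `Doc` type and the
`validateV1`/`validateV2` functions stand in for real schema validation:

```go
package main

import (
	"errors"
	"fmt"
)

// Doc is a hypothetical application document type for this sketch.
type Doc map[string]string

var errInvalid = errors.New("data does not match schema")

// validateV2 stands in for validation against the current schema,
// which expects a "title" field.
func validateV2(d Doc) error {
	if _, ok := d["title"]; !ok {
		return errInvalid
	}
	return nil
}

// validateV1 stands in for validation against the retired schema,
// which used a "name" field instead.
func validateV1(d Doc) error {
	if _, ok := d["name"]; !ok {
		return errInvalid
	}
	return nil
}

// migrateV1toV2 flips data from the old shape to the new one.
func migrateV1toV2(d Doc) Doc {
	return Doc{"title": d["name"]}
}

// load tries the newer schema first; if the data is rejected, it tries
// the old schema and applies the migration.
func load(d Doc) (Doc, error) {
	if err := validateV2(d); err == nil {
		return d, nil
	}
	if err := validateV1(d); err == nil {
		return migrateV1toV2(d), nil
	}
	return nil, errInvalid
}

func main() {
	doc, _ := load(Doc{"name": "old-style"})
	fmt.Println(doc["title"]) // prints "old-style"
}
```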

If you're using codegen based on the schema, note that you'll probably only
need to use codegen for the most recent / most preferred version of the schema.
(This is a good thing! We wouldn't want tons of generated code per version
to start stacking up in our repos.)
Parsing of data for other versions can be handled by ipldcbor.Node or other
such implementations optimized for handling serial data; the migration
function is a natural place to build the code-generated native typed Nodes,
and so each half of the process can easily use the Node implementation
best suited to it.
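In other words, the migration function is the seam where a generic,
serial-optimized node goes in and a typed native value comes out. A tiny
sketch of that seam (the `genericNode` type and the `FooV2` struct are
hypothetical stand-ins for a cbor-backed node and a code-generated type):

```go
package main

import "fmt"

// genericNode stands in for a serial-data-optimized Node implementation
// (hypothetical simplification for illustration only).
type genericNode map[string]interface{}

// FooV2 stands in for a code-generated native type for the newest schema.
type FooV2 struct {
	Title string
}

// migrate reads from the generic node and constructs the typed native form,
// so only the newest schema version needs generated code at all.
func migrate(old genericNode) FooV2 {
	title, _ := old["name"].(string)
	return FooV2{Title: title}
}

func main() {
	foo := migrate(genericNode{"name": "hello"})
	fmt.Println(foo.Title) // prints "hello"
}
```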
