I think the argument about information loss is quite strong. I'd like to preserve the revision and addition for as long as possible.
There's a function in the Scala SDK (not in Python yet) called transformWithInventory, which extracts the set of Iglu keys along with the transformed JSON result. Column names are still the same (version-lossy), but there's a good chance you can use the information about shred types in something like Spark to identify schema-compatibility issues. Do you think that could be a solution for the use cases you mentioned?
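To make the idea concrete, here is a minimal Python sketch of how an inventory of Iglu keys could be used to spot version mixing. It assumes the inventory is available as a plain list of Iglu URI strings, which is a simplification and not necessarily how the Scala SDK represents it; `mixed_version_groups` is a hypothetical helper, not part of any SDK.

```python
import re
from collections import defaultdict

# Iglu schema URI shape: iglu:vendor/name/format/MODEL-REVISION-ADDITION
SCHEMA_URI = re.compile(r"^iglu:([^/]+)/([^/]+)/([^/]+)/(\d+)-(\d+)-(\d+)$")


def mixed_version_groups(inventory):
    """Group Iglu URIs by (vendor, name, model) and return only the groups
    that contain more than one distinct (revision, addition) pair.

    Hypothetical helper: `inventory` is assumed to be an iterable of
    Iglu URI strings collected while transforming events.
    """
    groups = defaultdict(set)
    for uri in inventory:
        match = SCHEMA_URI.match(uri)
        if match is None:
            raise ValueError("Not a valid Iglu schema URI: {!r}".format(uri))
        vendor, name, _format, model, revision, addition = match.groups()
        groups[(vendor, name, int(model))].add((int(revision), int(addition)))
    # Keep only groups where a single lossy column would hide several versions.
    return {key: sorted(versions) for key, versions in groups.items() if len(versions) > 1}


inventory = [
    "iglu:com.acme/link_click/jsonschema/1-0-0",
    "iglu:com.acme/link_click/jsonschema/1-1-0",
    "iglu:com.acme/page/jsonschema/2-0-0",
]
mixed = mixed_version_groups(inventory)
```

Here `mixed` flags `com.acme/link_click` model 1, because two different revisions would end up under the same version-lossy column name. The same grouping could be expressed as a Spark aggregation, but that part is left out of the sketch.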
I think that makes sense, though this would be a subset of that: the transformation of the input line would yield the Iglu version in the output, something like:
The current JSON context shredding produces a simplified payload in which the context name includes the schema model (but not the revision or addition) along with the data, e.g.,
https://github.com/snowplow/snowplow-python-analytics-sdk/blob/master/snowplow_analytics_sdk/json_shredder.py#L102
This process is lossy, and there are circumstances where the revision and addition components of the SchemaVer are important, e.g., determining whether data is backwards compatible when running an aggregation, or filtering/dropping on specific schema versions. Should the payload be restructured to include the schema version information (or, more widely, the schema information available to Redshift)? Thoughts, @chuwy?
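To illustrate the loss, here is a rough, standalone Python sketch. It does not call the SDK; `lossy_column_name` only approximates the naming scheme described above, and `versioned_context` is a hypothetical payload shape that keeps the full SchemaVer, offered as one possible option rather than a proposal for the actual format.

```python
import re

# Iglu schema URI shape: iglu:vendor/name/format/MODEL-REVISION-ADDITION
IGLU_URI = re.compile(
    r"^iglu:(?P<vendor>[a-zA-Z0-9_.-]+)/(?P<name>[a-zA-Z0-9_-]+)/"
    r"(?P<format>[a-zA-Z0-9_-]+)/"
    r"(?P<model>\d+)-(?P<revision>\d+)-(?P<addition>\d+)$"
)


def parse_iglu_uri(uri):
    """Split an Iglu schema URI into its named components (all strings)."""
    match = IGLU_URI.match(uri)
    if match is None:
        raise ValueError("Not a valid Iglu schema URI: {!r}".format(uri))
    return match.groupdict()


def _snake(value):
    """Simplified normalisation: camelCase to snake_case, punctuation to '_'."""
    value = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", value)
    return re.sub(r"[^a-zA-Z0-9]+", "_", value).lower()


def lossy_column_name(uri, prefix="contexts"):
    """Approximate the shredded name: only the MODEL survives."""
    parts = parse_iglu_uri(uri)
    return "{}_{}_{}_{}".format(
        prefix, _snake(parts["vendor"]), _snake(parts["name"]), parts["model"]
    )


def versioned_context(uri, data):
    """Hypothetical alternative: carry the full SchemaVer next to the data."""
    parts = parse_iglu_uri(uri)
    version = "{model}-{revision}-{addition}".format(**parts)
    return {
        "schema": {
            "vendor": parts["vendor"],
            "name": parts["name"],
            "format": parts["format"],
            "version": version,
        },
        "data": data,
    }


old = lossy_column_name("iglu:com.acme/link_click/jsonschema/1-0-2")
new = lossy_column_name("iglu:com.acme/link_click/jsonschema/1-1-0")
kept = versioned_context("iglu:com.acme/link_click/jsonschema/1-0-2", {"targetUrl": "http://example.com"})
```

Both `old` and `new` come out as `contexts_com_acme_link_click_1`, so a downstream aggregation cannot tell 1-0-2 from 1-1-0, whereas `kept["schema"]["version"]` still reads `1-0-2`.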