Release R136 #1217

dilyand · 2022-08-04T09:46:04Z

No description provided.

istreeter

Hey @dilyand please can you explain more about what you are trying to achieve with this schema change, and therefore why you chose these values for maxLength?

We spoke earlier about this being "the schema change to end all schema changes" -- meaning we're all fed up of updating enrich each time the yauaa enrichment sees an unexpected user agent string. So we talked about lengthening each field to protect it from future changes.

But with that in mind, I can see you have chosen a mixture of 256, 1000, and 1024 as max length for different fields. How did you decide on these values?

istreeter · 2022-08-04T16:19:25Z

schemas/nl.basjes/yauaa_context/jsonschema/1-0-4

+        },
+        "deviceName": {
+            "description": "Example: Google Nexus 6",
+            "type": ["string", "null"],


Why the change from type string to type [string, null]? I'm not against it if you have a good reason. It struck me as odd because enrich doesn't output nulls.

They are optional fields, so presumably at some point they may be sent with a null value. We even have it as a check in igluctl.

I appreciate we currently do not do that in enrich, but if we change it in the future, we won't need to worry about the schema not allowing it. It's part of trying to make the schema more future-proof.

My gut feeling at the moment is like this:

If you're going to allow nulls for some fields, then you need to allow nulls for all fields, or else it's not consistent between fields. Which means allow nulls for the enum fields too, like operatingSystemClass. Which means add null to the enum array.

If you're adding null to the enum array then you need to be very very sure that RDB Loader will not choke on the schema migration. Remember that RDB Loader immediately (and probably wrongly) tries to migrate the table as soon as the schema is published; it does not wait until it sees the first 1-0-4 event. Remember also, we need to be sure that old versions of RDB Loader also don't choke on the schema migration, otherwise we might break oss pipelines just by publishing a schema.

Now RDB Loader probably handles this OK. I'm just saying we need to be 100% sure before merging this.

For that reason.... personally I would avoid adding the nulls, and put it back to be more like how it was in 1-0-3. There is no reason whatsoever to add nulls to this schema. I see no reason why enrich would ever change in future and start emitting nulls.

Yes we want to make the schema future-proof, but I see it as future-proofing against things we are not in control of, like crazy user agent strings. Whereas nulls are something we are completely in control of, so we don't need to future-proof against nulls.

I think that is a sensible idea. I was trying to make igluctl happy, but I agree that in this instance, where we're in control of this, there is no need to worry.

istreeter · 2022-08-04T16:19:31Z

schemas/nl.basjes/yauaa_context/jsonschema/1-0-4

+        },
+        "operatingSystemClass": {
+            "description": "See https://yauaa.basjes.nl/README-Output.html",
+            "type": ["string", "null"],


You added "null" as a type... but this does not make any difference, unless you also add null to the enum array. As currently written, null is still not an allowed value for operatingSystemClass.

Yes, I did not realise that. I think if we keep null as an allowed type, I'll add it to the enum.

dilyand · 2022-08-05T09:21:40Z

I put most of the reasoning in the issue #1216, but very briefly:

All fields that are extracted from the useragent string have a limit of 1000 because this is the max length for the useragent field in atomic.events. So, even if the whole string is interpreted as a single property, it still can't exceed that value.
All fields that are deduced (rather than extracted) have a limit of 128 / 256. This is somewhat arbitrary. I just wanted to increase the existing limits, which seemed a bit too strict. My thinking here is that these values are controlled by the YAUAA library, not by whoever puts together the useragent strings, and so are less likely to fluctuate widely.
The fields with length 1024 were already like that. I do not want to lower this limit because that is not a backward-compatible change. But in practice, we should never have anything higher than 1000 because of the limitation for the atomic useragent field.
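
The backward-compatibility point above (a limit may be raised but not lowered) can be sketched as a small check between two schema revisions. This is a minimal illustration, not part of igluctl or any Snowplow tool: the helper name `lowered_max_lengths`, the field `hypotheticalExtractedField`, and all the numeric limits in the sample data are assumptions made for the example.

```python
def lowered_max_lengths(old_props, new_props):
    """Return fields whose maxLength became stricter in the new revision.

    Both arguments are JSON Schema "properties" mappings. A field is
    reported if its limit shrank, or if a limit was added where the old
    revision had none.
    """
    problems = {}
    for name, old in old_props.items():
        new = new_props.get(name)
        if new is None:
            continue  # removed field: a different breaking change, not checked here
        old_len = old.get("maxLength")
        new_len = new.get("maxLength")
        if new_len is None:
            continue  # no limit in the new revision cannot be stricter
        if old_len is None or new_len < old_len:
            problems[name] = (old_len, new_len)
    return problems


# Illustrative values only; they are not copied from 1-0-3 or 1-0-4.
old = {
    "deviceName": {"type": "string", "maxLength": 128},
    "hypotheticalExtractedField": {"type": "string", "maxLength": 1024},
}
widened = {
    "deviceName": {"type": "string", "maxLength": 256},
    "hypotheticalExtractedField": {"type": "string", "maxLength": 1024},
}
narrowed = {
    "deviceName": {"type": "string", "maxLength": 256},
    "hypotheticalExtractedField": {"type": "string", "maxLength": 1000},
}

print(lowered_max_lengths(old, widened))   # nothing reported
print(lowered_max_lengths(old, narrowed))  # the 1024 -> 1000 field is reported
```

Under this check, keeping the existing 1024 limits is the safe choice even though the 1000-character useragent column means longer values should not occur in practice.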

istreeter

Looks good.

istreeter · 2022-08-08T17:51:50Z

schemas/nl.basjes/yauaa_context/jsonschema/1-0-4

@@ -0,0 +1,234 @@
+{
+    "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
+    "description": "Schema for an entity context generated by the YAUAA enrichment after parsing the user agent",


Entity context or context entity?

"context entity" feels more right to me. I'd probably just go for "entity" though.

I changed it back to just 'context'. I think the word entity confuses things in this case. This schema does not describe an entity, but rather just a grab bag of things that could be deduced from the useragent.

oguzhanunlu · 2022-08-16T13:04:20Z

I opened a PR in enrich to upgrade yauaa from 5.23 to 7.4.0

yauaa 7.4.0 has WebviewAppNameVersion as well (since v6.4), it is not documented here

is there a specific reason we didn't include webviewAppNameVersion field in this new version of the schema?

oguzhanunlu · 2022-08-17T11:51:33Z

I added a separate commit to add webviewAppNameVersion cc @istreeter @dilyand

adatzer · 2022-08-19T14:05:37Z

Could we also merge #1221 into r136? cc @oguzhanunlu @dilyand

istreeter · 2022-08-23T13:45:21Z

I'm happy with the amendment made to this PR since I last approved it.

paulboocock

LGTM. Couple of nit pick comments but happy for this to go out once the changelog is tidied.

paulboocock · 2022-08-24T14:51:15Z

CHANGELOG

+Add nl.basjes/yauaa_context/jsonschema/1-0-4 (#1216)
+Extend copyright to 2022 (close #1218)
+Update links in Readme (close #1219)
+Add CONTRIBUTING.md (close #1220)


nit: I don't think we usually have the word close in these lines

paulboocock · 2022-08-24T14:54:42Z

schemas/nl.basjes/yauaa_context/jsonschema/1-0-4

@@ -0,0 +1,234 @@
+{
+    "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
+    "description": "Schema for an entity context generated by the YAUAA enrichment after parsing the user agent",


"context entity" feels more right to me. I'd probably just go for "entity" though.

matus-tomlein · 2022-09-01T09:49:26Z

Added a new version of the remote_config schema reviewed in #1226, also updated the changelog to include that, cc @paulboocock

jbeemster · 2022-09-01T13:42:05Z

schemas/com.snowplowanalytics.mobile/remote_config/jsonschema/1-0-1

@@ -0,0 +1,225 @@
+{


@istreeter doing a diff with 1-0-0 a lot of these fields now have a maximum as well as a few now allowing object, null pairing over just object. Just want to confirm that the addition of new limits and type groupings is not going to cause issues in any of our loaders / data-warehouses?

I am not entirely certain if this schema gets tracked into pipelines to be fair but if it was would these changes get automatically represented in their respective columns?

The question is still valid, just wanted to say that the schema is not being used in pipelines, it is just to help developers better define their configurations (some IDEs may use it for validating the JSONs).

I tested this on a pipeline with a loader. Even if it's not expected to be used in tracking, it was worth testing.

It is completely safe, because the field configurationBundle is a complex object, so RDB transformer simply stringifies the whole field, and RDB Loader loads it as a string. Any change you make to a sub-field (e.g. configurationBundle.namespace) does not bother the transformer/loader at all.
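
As a rough illustration of why sub-field changes are invisible downstream, the stringification described above can be sketched as follows. This is not the RDB transformer's actual code; the function name `flatten_for_load` and the sample values are assumptions, and only `configurationBundle`, `namespace` and `method` are names taken from this thread.

```python
import json


def flatten_for_load(entity):
    """Keep scalar values as they are; serialise objects and arrays to a JSON string."""
    return {
        key: json.dumps(value, sort_keys=True) if isinstance(value, (dict, list)) else value
        for key, value in entity.items()
    }


# Sample entity with made-up values.
entity = {"configurationBundle": {"namespace": "ns1", "method": "post"}}
row = flatten_for_load(entity)

# The whole bundle ends up in a single string column, so tightening or
# loosening a sub-field in the schema does not change the column's type.
print(type(row["configurationBundle"]).__name__)
print(row["configurationBundle"])
```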

jbeemster · 2022-09-01T13:44:43Z

schemas/nl.basjes/yauaa_context/jsonschema/1-0-4

@@ -0,0 +1,238 @@
+{


Same question again actually @istreeter - do all of the columns get automatically widened when maxLength is changed or say a longer enum value is added?

WARNING -- this migration did not work when I tested it with Redshift RDB Loader. The new field got added to the table, but the existing columns were not updated for the new lengths.

I opened this issue in rdb loader to investigate it. I spent quite a while trying to pinpoint the error, but no luck yet.

I'm not comfortable with merging this release until we have found why RDB Loader did not alter the lengths as it should.
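
For reference, the kind of migration being discussed would look roughly like the statements built below. Redshift supports increasing the size of a VARCHAR column in place with `ALTER TABLE ... ALTER COLUMN ... TYPE`. This is only a sketch of the expected outcome, not RDB Loader's implementation; the helper `widen_statements`, the table name, the column names and the lengths are all assumptions for the example.

```python
def widen_statements(table, widened):
    """Build one ALTER statement per column whose VARCHAR length should grow.

    `widened` maps a column name to its new maximum length.
    """
    return [
        f"ALTER TABLE {table} ALTER COLUMN {column} TYPE VARCHAR({length});"
        for column, length in sorted(widened.items())
    ]


# Assumed table and column names, with illustrative lengths.
statements = widen_statements(
    "atomic.nl_basjes_yauaa_context_1",
    {"device_name": 256, "operating_system_class": 256},
)
for statement in statements:
    print(statement)
```

Comparing the table definition against statements like these after publishing the schema is one way to see whether the length changes were actually applied.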

istreeter · 2022-09-07T16:55:24Z

schemas/com.snowplowanalytics.mobile/remote_config/jsonschema/1-0-1

+              },
+              "method": {
+                "description": "The method used to send the requests (GET or POST).",
+                "type": ["string", "null"],


Hey @matus-tomlein just a warning, the "null" on this line does not do what you think it does! If you want this field to be nullable then you must also add null to the enum:

"enum": ["get", "post", null]

Otherwise, a null will get rejected during validation.

I know this is slightly surprising. You can test it out though using an online validator.

Thanks @istreeter, didn't know about this! Fixed.
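
The behaviour described in this thread can be reproduced with a minimal sketch. JSON Schema applies each keyword independently, so a value has to satisfy both `type` and `enum`. The tiny `is_valid` helper below is an assumption written for this example and handles only those two keywords for strings and null; it is not a full validator.

```python
def is_valid(value, schema):
    """Check `value` against only the `type` and `enum` keywords of `schema`."""
    python_types = {"string": str, "null": type(None)}
    allowed = schema.get("type")
    if allowed is not None:
        if isinstance(allowed, str):
            allowed = [allowed]
        if not any(isinstance(value, python_types[name]) for name in allowed):
            return False
    # `enum` is checked separately from `type`: passing one does not excuse the other.
    if "enum" in schema and value not in schema["enum"]:
        return False
    return True


null_in_type_only = {"type": ["string", "null"], "enum": ["get", "post"]}
null_in_enum_too = {"type": ["string", "null"], "enum": ["get", "post", None]}

print(is_valid(None, null_in_type_only))   # False: null passes `type` but fails `enum`
print(is_valid(None, null_in_enum_too))    # True
print(is_valid("get", null_in_type_only))  # True
```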

paulboocock

Technically I have no issues with the schemas, but there are a few nitpicks and a few questions on whether these are the schemas we want to be committing to for whatever use case they are being used to build (especially object 🤮)

paulboocock · 2022-09-13T12:34:44Z

schemas/io.snowplow.foundation/content/jsonschema/1-0-0

+    "name"
+  ],
+  "self": {
+    "vendor": "io.snowplow.foundation",


What does foundation represent/mean? Are these not the recipe schemas from Try Snowplow? Why not io.snowplow.recipe, or a final part that represents the "use case" we typically see this schema used in, e.g. io.snowplow.content?

I was OK with the name "foundation" for the vendor, meaning like a collection of schemas that can be used when first starting off with Snowplow, before graduating to more advanced use cases when the user would want to design their own bespoke schemas.

But... I find the names of the schemas a bit strange. The names "content" and "conversion" don't mean anything to me, but maybe that's because I don't come from the world of web analytics.

I don't have any problem with the schemas though, so I'd be OK to approve this release.

foundation was the group consensus for what Ian describes

we didn't want to go with use-case specific names in case we later want to add more advanced content / conversion / funnel schemas that would then clash with these

I'd be happy with recipe too, I don't know who that decision is ultimately up to

Sorry to be that guy...

Now that I understand what foundation means, as in foundational... I think I prefer it over recipes as it's more generic and allows these schemas to be used in a broader context than just the recipe context.

schemas/io.snowplow.foundation/object/jsonschema/1-0-0

jbeemster

LGTM - can I get a green light when you lot get online and I'll go ahead and merge it!

snowplowcla added the cla:no label Aug 4, 2022

dilyand marked this pull request as draft August 4, 2022 09:46

dilyand force-pushed the release/r136 branch from a1d6014 to 8561cce Compare August 4, 2022 10:42

dilyand marked this pull request as ready for review August 4, 2022 10:43

dilyand requested review from istreeter and a team August 4, 2022 10:43

istreeter reviewed Aug 4, 2022

dilyand requested a review from istreeter August 8, 2022 08:16

dilyand force-pushed the release/r136 branch from 8561cce to 996544f Compare August 8, 2022 15:46

istreeter approved these changes Aug 8, 2022

dilyand requested a review from a team August 9, 2022 13:15

adatzer mentioned this pull request Aug 16, 2022

Issues/updates #1221

Merged

dilyand force-pushed the release/r136 branch 4 times, most recently from b440d28 to dd6a49f Compare August 23, 2022 14:37

paulboocock approved these changes Aug 24, 2022

dilyand force-pushed the release/r136 branch from dd6a49f to db44e7a Compare August 24, 2022 15:08

dilyand requested a review from paulboocock August 24, 2022 15:10

paulboocock removed the cla:no label Sep 1, 2022

snowplow deleted a comment from snowplowcla Sep 1, 2022

matus-tomlein force-pushed the release/r136 branch from db44e7a to d553671 Compare September 1, 2022 09:48

matus-tomlein mentioned this pull request Sep 1, 2022

Add com.snowplowanalytics.mobile/remote_config/jsonschema/1-0-1 (close #1225) #1226

Closed

paulboocock approved these changes Sep 1, 2022

paulboocock requested a review from jbeemster September 1, 2022 10:20

jbeemster reviewed Sep 1, 2022

adatzer force-pushed the release/r136 branch from d553671 to e8cb2aa Compare September 2, 2022 09:44

istreeter mentioned this pull request Sep 7, 2022

Redshift loader: Pre-transaction migrations did not run snowplow/snowplow-rdb-loader#1051

Closed

istreeter reviewed Sep 7, 2022

matus-tomlein force-pushed the release/r136 branch from e8cb2aa to 2f88aeb Compare September 8, 2022 07:29

adatzer added 3 commits September 13, 2022 13:29

Add CONTRIBUTING.md (close #1220)

6b6fdec

Update links in Readme (close #1219)

796d033

jbeemster force-pushed the release/r136 branch from 2f88aeb to 56c4f3f Compare September 13, 2022 11:39

jbeemster requested review from istreeter, jbeemster and paulboocock September 13, 2022 11:40

paulboocock reviewed Sep 13, 2022

cksnp force-pushed the release/r136 branch from 56c4f3f to eacdd96 Compare September 15, 2022 12:41

jbeemster approved these changes Sep 19, 2022

matus-tomlein and others added 5 commits September 19, 2022 10:42

Add com.snowplowanalytics.mobile/remote_config/jsonschema/1-0-1 (close …

6bcd4fd

…#1225)

Add io.snowplow.foundation/content/jsonschema/1-0-0 (close #1228)

5674699

Add io.snowplow.foundation/conversion/jsonschema/1-0-0 (close #1229)

e44d1f4

Add io.snowplow.foundation/funnel_interaction/jsonschema/1-0-0 (close #…

ddc3e95

…1230)

Prepare for R136 release

2ec9d57

cksnp force-pushed the release/r136 branch from eacdd96 to 2ec9d57 Compare September 19, 2022 07:45

jbeemster approved these changes Sep 20, 2022

jbeemster merged commit cb00a49 into master Sep 20, 2022

istreeter mentioned this pull request Oct 24, 2022

Release R137 #1238

Merged

cksnp deleted the release/r136 branch February 21, 2024 14:26
