Report snowplow-specific metrics #178
Comments
You could try to infer the data-input type, perhaps with some pattern matching. I think there are generally three different Snowplow inputs:
One question from me, though: what would the cost implications be of parsing every inbound event to extract the timestamp?
I have the same concern. Hopefully it'd be minimal, since we constructed the analytics SDK in such a way that we can retrieve individual fields without processing the entire event. (Filters operate this way and are relatively efficient.) But yes, I'd want to keep an eye on it.
Decoding Thrift just to grab the collector tstamp seems like overkill, and we don't have a use case for stream replicator-ing bad data at the moment. So my suggestion here would be to worry about enriched data only, and wait for requirements for other formats to surface, if they exist.
Even though the app is data-agnostic, there's a good argument that we should still report Snowplow-specific metrics. Our usage of the app is Snowplow-specific, and reporting latency from collector to target is valuable.
I think we should consider how to fit this into the design and see if we can accommodate it. Perhaps some setting that specifies that it's Snowplow data and grabs collector tstamp for metrics reporting purposes.