Discard User Agent if request has SP-Anonymous header #111

Closed
mbehm opened this issue Jan 27, 2021 · 3 comments

Comments

@mbehm

mbehm commented Jan 27, 2021

In relation to #90 and #94: as far as I know, User Agent strings are considered PII under GDPR, similar to IP addresses, and should be discarded for anonymous tracking.

@paulboocock
Contributor

Hi @mbehm

User Agent strings alone are not generally considered PII. They typically might be only if they form part of a fingerprint, but your Snowplow pipeline doesn't fingerprint users based on a User Agent (or fingerprint them at all by default).

I'd suggest running the PII Pseudonymization enrichment to hash the User Agent if you don't want it stored in your DB in its raw form.
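To illustrate what hashing the User Agent amounts to conceptually, here is a minimal sketch. It is not the enrichment's actual code: the function name `pseudonymize_user_agent`, the way the salt is combined with the value, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def pseudonymize_user_agent(user_agent: str, salt: str) -> str:
    """Return a salted SHA-256 hex digest of a User Agent string.

    Hypothetical helper for illustration only; the real enrichment is
    configured in the pipeline, not called like this.
    """
    # Assumption: the salt is simply prepended before hashing.
    return hashlib.sha256((salt + user_agent).encode("utf-8")).hexdigest()


# The raw string is replaced by a fixed-length digest before storage.
hashed = pseudonymize_user_agent(
    "Mozilla/5.0 (X11; Linux x86_64)", salt="example-salt"
)
```

The same input and salt always give the same digest, so the stored value can still be grouped or joined on, which is also why hashing is pseudonymization rather than removal.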

We'll also start seeing the UA string become less useful over the coming months/years as the browser vendors look to freeze it, which takes it further away from being PII and a fingerprint vector.

You have sparked an idea though that I'll consider further. I think there is an opportunity to add another enrichment in the Snowplow pipeline that allows fields to be removed based on the SP-Anonymous header.
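A rough sketch of that idea, assuming the request headers and the event are available as plain dictionaries. The function `strip_fields_if_anonymous`, its default field list, and the dictionary shapes are hypothetical and only show the conditional logic, not how the pipeline would implement it.

```python
from typing import Dict, Iterable

ANONYMOUS_HEADER = "sp-anonymous"


def strip_fields_if_anonymous(
    headers: Dict[str, str],
    event: Dict[str, str],
    fields_to_drop: Iterable[str] = ("useragent", "user_ipaddress"),
) -> Dict[str, str]:
    """Return a copy of the event, minus selected fields when the
    SP-Anonymous header is present on the request."""
    # Header names are case-insensitive, so compare in lowercase.
    is_anonymous = any(name.lower() == ANONYMOUS_HEADER for name in headers)
    if not is_anonymous:
        return dict(event)
    dropped = set(fields_to_drop)
    return {key: value for key, value in event.items() if key not in dropped}


event = {
    "event": "page_view",
    "useragent": "Mozilla/5.0 (X11; Linux x86_64)",
    "user_ipaddress": "203.0.113.7",
}
anonymous_event = strip_fields_if_anonymous({"SP-Anonymous": "*"}, event)
regular_event = strip_fields_if_anonymous({"Accept": "*/*"}, event)
```

Taking the field list as a parameter keeps the set of dropped fields configurable, so it is not limited to the User Agent.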

@mbehm
Author

mbehm commented Jan 27, 2021

Thank you for the fast response.

Yes, we're already running the PII enrichment to hash the User Agent, and I'm aware that it's being migrated away from. Still, I'd prefer to just drop it altogether with anonymous tracking before it hits the event queues, the same way as IP addresses (which by themselves aren't PII either, as far as I know).

Regardless, an enrichment to conditionally drop fields based on SP-Anonymous would indeed be a very useful addition.

@paulboocock
Contributor

Yeah, I think there are some additional things that can be considered here - I also see how a User Agent string can be considered PII in some use cases.

This has opened up some additional thinking and I can see how making this configurable in the collector would be beneficial, even more beneficial than the enrichment idea. I've opened an issue to track that configurable concept rather than reopening this one (#112).

Thanks for your feedback and idea!
