Event-Level Click Conversion Measurement API #418
We started to look at this in our F2F in Tokyo but ran out of time, so we weren't able to start the review. We note that this is a competing proposal to the Private Click Measurement proposal. @hober and I will do a deeper dive as soon as we have the chance.
Thanks! Yeah, we tried to align our proposal as much as possible with the Private Click Measurement proposal. We're supportive of trying to land on a unified solution that can be used by all browsers to suit their (potentially differing) needs. This is related to privacycg/private-click-measurement#11.
This thread looks to be of interest, with input from Mozilla and WebKit. Input about possible risk of misuse would be welcome. Would you be so kind as to expand the answer to 2.2 (security and privacy health check)? Now it's:
Are you sure about 2.12, i.e. can't these identifiers serve as a temporary ID?
@lknik I updated question 2.2 of the questionnaire to go into a bit more depth on minimum information necessary, since there is some nuance there. I also just updated the Privacy Considerations of the explainer to go into more depth (link). Can you elaborate on what precisely you mean by "can't these identifiers serve as a temporary ID"?
Sure. Are you sure the information exchanged using this API does not constitute a temporary identifier?
@lknik I may have misunderstood that question in the questionnaire as the browser creating some new global identifier that is readable across sites and can be used for re-identifying users across the web, which this API does not do. I updated the section, since I suppose you could consider the join of <impression metadata, conversion metadata> as a temporary identifier that just isn't exposed to script and is only sent in non-credentialed requests. Can you take a look and see if it makes sense to you?
Hi, Thanks for additional clarifications! Will have a look. So
Is "we" here the royal "we" or any particular "we"? ;) More to the point, I still don't quite get why exactly 64b is needed. I don't want to be overly picky (though maybe I am a bit), but while I understand your reply to 2.2, I don't get where those 64b come from.
In the meantime, a more architectural question/comment/remark. What I wonder about is the overlap with another existing spec (also already implemented, I think) that intends to deal with similar functionality, specifically Ad Click Attribution, which defines the following:
<a adCampaignId="[6-bit ad campaign id]" adDestination="[ad click destination URL]">
The Conversion Measurement API defines:
<a addestination="[eTLD+1]" impressiondata="[string]" impressionexpiry=[unsigned long long] reportingdomain="[eTLD+1]">
Of course, we may end up with sites using the following in practice:
<a addestination="[eTLD+1]" adCampaignId="[6-bit ad campaign id]" impressiondata="[string]" impressionexpiry=[unsigned long long] reportingdomain="[eTLD+1]">
...with some implementations selectively ignoring some of the attributes. Though here I must thank you for clearly defining what adDestination is (which is apparently not the case with Ad Click Attribution; yes, I realise that, based on the spec text, adDestination in both specs translates to the same thing). But I wonder whether you could discuss or agree on some convergence. At the moment it seems we're exploring two different specs that deal with pretty similar tasks. I am not sure this kind of enrichment makes the web platform a better place.
Sorry, that's the royal we. Labeling in general is not possible unless we can pick the specific inference (i.e. ad selection) and say whether it was a success or failure (or maybe something a little finer-grained).
Exactly 64 bits aren't strictly needed, but you basically want a scheme that lets you avoid too many collisions (i.e. the scheme is "event-level", where we can pinpoint a specific event like an ad click and label it). We could probably reduce this, but as you probably know, anything >= 33 bits can identify an individual on earth, and you likely need a lot less to identify a user on a particular site in a given 2 day window, so we only really start getting meaningful privacy protections if the impression metadata is reduced below 32 bits. However, as soon as you get into a regime where your click IDs collide a lot (you are using the full bit space), utility decreases a lot. In the design we chose 64 bits because we felt that moving below 33 would prove tricky utility-wise due to collisions, and reducing the 64 bits to some number > 33 didn't really improve privacy at the margin.
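Editor's sketch of the arithmetic behind the two claims above: roughly 33 bits are enough to give every person on earth a distinct value, and identifiers start colliding long before a bit space is full. It assumes a world population of about 8 billion and impression IDs drawn uniformly at random (neither is stated in the thread), and it uses the standard birthday-bound approximation. Sequentially assigned IDs would behave differently.

```python
import math


def bits_to_distinguish(population: int) -> int:
    """Smallest b such that 2**b >= population (population >= 1)."""
    return (population - 1).bit_length()


def collision_probability(num_ids: int, bits: int) -> float:
    """Birthday-bound approximation of the chance that at least two of
    num_ids uniformly random identifiers collide in a 2**bits space."""
    space = 2.0 ** bits
    # 1 - exp(-n(n-1) / 2N), computed with expm1 for accuracy near zero.
    return -math.expm1(-num_ids * (num_ids - 1) / (2.0 * space))


if __name__ == "__main__":
    # Assumed population figure, used only for illustration.
    print(bits_to_distinguish(8_000_000_000))
    # Hypothetical volume of one million impressions.
    for bits in (16, 32, 64):
        print(bits, collision_probability(1_000_000, bits))
```

Under these assumptions, a million random IDs almost certainly collide in a 32-bit space and almost never do in a 64-bit one, which is consistent with the utility concern raised about going below 33 bits.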
Yes, I think we should try to converge. I think many of these API differences can be resolved by aligning on API surface, and having some sort of "configuration" advertising the valid inputs. For instance, a UA that wants the impression metadata to be 6 bits can use the same attribute but advertise that it only accepts 6-bit input. Similarly, the reportingdomain could only accept e.g. publisher or addestination domains for UAs that want those guarantees.
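Editor's sketch of what such a per-UA "configuration" might look like. Everything here is hypothetical: `UAConfig`, its fields, and both helper functions are illustrative names that appear in neither proposal, and impression data is modelled as a plain integer rather than the attribute's string form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UAConfig:
    """Hypothetical limits a browser could advertise for the shared attributes."""
    impression_data_bits: int        # how many bits of impression data are accepted
    restrict_reporting_domain: bool  # only publisher or ad destination may receive reports


def accepts_impression_data(config: UAConfig, value: int) -> bool:
    """True if value fits in the bit width this UA advertises."""
    return 0 <= value < 2 ** config.impression_data_bits


def accepts_reporting_domain(config: UAConfig, reporting_domain: str,
                             publisher_domain: str, ad_destination: str) -> bool:
    """True if this UA would send reports to reporting_domain."""
    if not config.restrict_reporting_domain:
        return True
    return reporting_domain in (publisher_domain, ad_destination)


# Two illustrative configurations: a 6-bit UA and a 64-bit UA.
SIX_BIT_UA = UAConfig(impression_data_bits=6, restrict_reporting_domain=True)
SIXTY_FOUR_BIT_UA = UAConfig(impression_data_bits=64, restrict_reporting_domain=False)
```

With a shape like this, the same markup could be served to every browser, and each UA would accept or ignore values according to the limits it advertises.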
So, by that logic, you don't believe your proposal gets meaningful privacy protection. Noted.
I don't think that's a fair characterization. My statement was about the size of the impression id, i.e. the publisher-side identifier. An identifier on its own, even a high-entropy one, isn't necessarily bad for privacy even if it can be used to identify a user. For instance, the fetch() API can be used to send arbitrarily large identifiers. The privacy-sensitive aspect of this API is what information is allowed to be joined with this publisher-side identifier that couldn't before (i.e. what this gives you over using the fetch API). In this case, we allow a noisy, very low entropy (e.g. 3 bit) cross-site identifier to be joined, gated on a click and some action on the advertiser site (a conversion). Abstractly, the API introduces something like a rate-limited, low-entropy, noisy message channel from advertisers --> publishers, where messages can only be sent on clicks. The privacy of the API would be improved if the impression side ID were < 32 bits, but I still think the API has meaningful privacy protections.
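Editor's sketch of the "low-entropy, noisy" part of that channel, assuming noise is added by occasionally replacing the true conversion value with a uniformly random one. The 3-bit width comes from the comment above; the 5% noise rate and the function name are assumptions made for illustration, and rate limiting is not modelled.

```python
import random

CONVERSION_BITS = 3        # "e.g. 3 bit", per the comment above
NOISE_PROBABILITY = 0.05   # hypothetical rate, not stated in this thread


def noisy_conversion_data(true_value: int, rng: random.Random) -> int:
    """Return the conversion value that would be reported for true_value.

    With probability NOISE_PROBABILITY the value is replaced by a uniformly
    random one, so any single report is deniable.
    """
    if not 0 <= true_value < 2 ** CONVERSION_BITS:
        raise ValueError("conversion data must fit in CONVERSION_BITS bits")
    if rng.random() < NOISE_PROBABILITY:
        return rng.randrange(2 ** CONVERSION_BITS)
    return true_value
```

Because the noise rate would be known, a reporting endpoint could still estimate aggregate conversion counts while being unable to trust any individual report completely.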
@hober It's helpful to be more precise than just saying "privacy", and indeed the list of high-level threats in the Target Privacy Threat Model that PING is working on should give us the language to communicate better here. A lot of this proposal is focused on the threat "Unexpected Recognition, cross-site", that is, on preventing anyone from recognizing the same user across two different sites. We talked about why that was our primary focus in our privacy model explainer. Fixing that problem definitely is "meaningful privacy protection". The impression ID here is deliberately large enough to uniquely identify which ad impression it was that converted, so it also allows a small amount of what the Privacy Threat Model calls "information disclosure". That's the "rate-limited, low-entropy, noisy message channel" that @csharrison described. Putting the browser in control of the rate, entropy, and noise is also a "meaningful privacy protection". And sure, blocking information flow altogether is of course "more private", but it also doesn't solve the problem at hand.
Thanks a lot for the updates. My input follows.
So it seems to me you acknowledge that size (erm...) matters, which is sensible indeed, but on the other hand we're all also aware that much less than 33b is sufficient for tracking. In this case, is 64b closer to "just a bit more than we really need" or to "just right", and if so, why? It still sounds somewhat arbitrary. If this is an arbitrary choice, why not, say, 60 bits, or 1024 bits, or no limit, leaving it to the browser? That would mean that, for example, Safari would have its own bit length, and other browsers maybe other numbers (since it seems both UAs go for different numbers anyway). There is, of course, also the report uri attribute.
Indeed would be great. I don't know, however, how such co-op should start. Is there any progress on starting the conversation somehow, somewhere?
I see your point. Though I actually wonder what the main point is here. To address the potential for cross-site tracking? To civilise tracking a bit, giving compelling arguments for softer anti-tracker blocking, since you mention blocking information flows above?
Hi @lknik: In consultation with @johnwilander, we're going to try to align our two proposals through discussion and issues on https://github.com/WICG/ad-click-attribution.
Hey @lknik, thanks for the response.
I agree this is an arbitrary number. We picked it for two reasons:
I believe 60 bits vs. 64 bits makes no practical difference to utility or privacy, so we went with the rounder number.
We discussed it during the telecon today. For the moment we propose to close the issue. Should you feel the need, please come back for additional feedback once you progress on the layering of the approach with the PCM one (it will be more meaningful to look at it at that point). Thank you, tuning out!
Hey TAG reviewers,
cc @johnwilander @michaelkleber @johnivdel. Happy to discuss some of these items more in the next privacy CG F2F if it's useful.
Hello, TAG!
I'm requesting a TAG review of:
Further details:
We recommend the explainer to be in Markdown. On top of the usual information expected in the explainer, it is strongly recommended to add:
You should also know that...
We’re still at a very early stage here, just looking to get TAG review earlier rather than later. We also have some nascent ideas in https://github.com/csharrison/conversion-measurement-api/blob/master/AGGREGATE.md but those should be reviewed separately since they're definitely not ready yet.
We'd prefer the TAG provide feedback as (please select one):
Please preview the issue and check that the links work before submitting. In particular, if anything links to a URL which requires authentication (e.g. Google document), please make sure anyone with the link can access the document.
¹ For background, see our explanation of how to write a good explainer.