
PARAKEET with Noisy Ranking: Advertiser isolation #47

Open
bm371613 opened this issue May 6, 2022 · 4 comments

Comments

@bm371613

bm371613 commented May 6, 2022

An advertiser might not want to provide signals that could be used to its competitors’ advantage. To satisfy this requirement, the buyer could try to isolate each advertiser’s user data and compute embeddings separately. During an auction, such isolated advertisers should then be considered independently, even when represented by the same buyer.

PARAKEET with Noisy Ranking seems to lack mechanisms to support that, with the following scoped per buyer:

  • User profile
    • Shared storage
    • Browser embeddings
  • User privacy protection mechanisms
    • Throttling
    • Random delays
    • Caching

What do you think about this use case?

@jpfeiffe

jpfeiffe commented May 6, 2022

The shared storage representation, while scoped to the buyer, allows the buyer to build representations of their choosing. Hence, the buyer (exactly as today) can build representations at the advertiser level and use those as part of the relevance evaluation within the worklet provided to the TM. For example, the buyer can store the list of advertiser domains, which might be used later for something like boosting bids for advertisers the user has seen previously. What isn't explicit here is that per-advertiser models could be run as part of that worklet at the TM. The work to do this is mostly implementation and API/type specification: all of the data is in place within the TM to make that evaluation possible.
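As a sketch of the advertiser-domain example above (all names, the storage layout, and the boost factor are hypothetical illustrations, not part of any PARAKEET API), a buyer-side relevance function might look like:

```python
# Hypothetical sketch: a buyer worklet keeping per-advertiser state in
# its buyer-scoped shared storage and boosting bids for advertisers the
# user has already seen. The dict layout and 1.5x boost are illustrative.

def score_ad(ad, shared_storage, base_bid):
    """Boost the bid when the user previously saw this advertiser's domain."""
    seen = shared_storage.get("seen_advertiser_domains", set())
    boost = 1.5 if ad["advertiser_domain"] in seen else 1.0
    return base_bid * boost

storage = {"seen_advertiser_domains": {"shoes.example"}}
print(score_ad({"advertiser_domain": "shoes.example"}, storage, 2.0))  # 3.0
print(score_ad({"advertiser_domain": "hats.example"}, storage, 2.0))   # 2.0
```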

In contrast, the noisy ranking vector sent from the TM to the buyer at request time likely shouldn't be advertiser-specific. In theory, a DSP can choose whatever representation they want, but either they have to choose only a single advertiser to pass through the embedding model, or they have to somehow concatenate a bunch together. In either case, the amount of noise needed for privacy will likely muddy any per-advertiser signal, with the possible exception of a couple of extremely large advertisers. A more workable approach is to send the more general noised representation; the DSP then stratifies a variety of advertisers in its response, and lets the TM handle the per-advertiser representation cases.
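A minimal sketch of the stratified response described here (names and data shapes are hypothetical): the DSP ranks its candidates with the general noised vector, then keeps a few top ads per advertiser so every advertiser stays represented, leaving per-advertiser evaluation to the TM.

```python
# Hypothetical sketch: stratify a ranked candidate list so that each
# advertiser keeps at most a few of its top-scoring ads in the slate
# the DSP sends back to the TM.
from collections import defaultdict

def stratified_slate(ads, per_advertiser=2):
    """Keep at most per_advertiser top-scoring ads for each advertiser."""
    kept = defaultdict(list)
    for ad in sorted(ads, key=lambda a: a["score"], reverse=True):
        bucket = kept[ad["advertiser"]]
        if len(bucket) < per_advertiser:
            bucket.append(ad)
    return [ad for bucket in kept.values() for ad in bucket]

candidates = [
    {"advertiser": "big.example", "score": 0.9},
    {"advertiser": "big.example", "score": 0.8},
    {"advertiser": "big.example", "score": 0.7},
    {"advertiser": "niche.example", "score": 0.2},
]
slate = stratified_slate(candidates, per_advertiser=2)
# The niche advertiser keeps a slot despite its lower score.
```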

@bm371613
Author

I see how the buyer could satisfy the isolation requirement by choosing a single advertiser to pass through the embedding model, stratifying ad selection, and finally filtering. This, however, seems inefficient: a niche advertiser in a crowded market would have very few ads in the mix, if any, and even then only when it is the advertiser selected to pass through the embedding model.

Another solution allowed by PARAKEET with Noisy Ranking would be for a buyer to deploy a private instance of its system for the advertiser, so that it would act as its own buyer. That would solve all the problems related to sharing any quotas with other advertisers. However, it has another flaw: it multiplies the number of embedding models trained on the TM. If the buying system provider deploys a single embedding model for different advertisers’ private instances, they will be trained separately and diverge.

Maybe there is a middle ground? What do you think about allowing buyers to put advertisers into groups, isolating them for all purposes, except for the embedding models? That could satisfy the advertiser isolation requirement without the inefficiency of a stratified ad selection. For privacy considerations, this should not be worse than a DSP deploying a private instance of their service for a subset of advertisers under a new domain.

@jpfeiffe

I thought we were going to cover this in the meeting, but apparently it was missed.

Ideally (and this is the plan), the training/evaluation module is robust enough to support this as is, if the buyer wants to. E.g., you have K different "submodels" that run for different subgroups of advertisers (similar to the example which has 2 models, but for different feature sets). When you train, the script identifies which submodel should be triggered for a sample based on, say, the ad domain, so the loss is only computed on that part of the sample. When you evaluate, the evaluator returns the vectors from all the submodels, say concatenated together into the browserEmbedding. The only real constraint here is that this final vector will have to have its norm clipped for privacy. The adtech is then free to select/rank however it wants.

A hacked-together class might look something like this:

import torch
import torch.nn as nn

class Network(nn.Module):
  def __init__(self, num_submodels, subargs):
    super().__init__()
    # nn.ModuleList so each submodel's parameters are registered
    self.submodels = nn.ModuleList(
        Submodel(i, subargs) for i in range(num_submodels))

  def forward(self, istrain, data):
    if istrain:
      # Train time: run only the submodel owning this sample's advertiser
      # group, so the loss is computed on that part alone.
      return self.submodels[data["submodel"]](data)
    # Eval time: concatenate every submodel's embedding.
    return torch.cat([sb(data) for sb in self.submodels])
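The norm-clipping constraint mentioned above can be illustrated in plain Python (the clip bound C is an assumed parameter, not a value from the proposal):

```python
import math

# Sketch: before noise is added for privacy, the concatenated
# browserEmbedding has its L2 norm clipped to some bound C, so the
# signal available to any one advertiser shrinks as more submodel
# outputs are concatenated.

def clip_norm(vec, C=1.0):
    """Scale vec down so that its L2 norm is at most C."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= C:
        return list(vec)
    return [x * C / norm for x in vec]

print(clip_norm([3.0, 4.0], C=1.0))  # scaled to unit norm: [0.6, 0.8]
print(clip_norm([0.1, 0.2], C=1.0))  # already within the bound, unchanged
```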

@bm371613
Author

bm371613 commented Jun 21, 2022

If I understand correctly, this is how the solutions mentioned so far compare:

| | A. Stratification & filtering | B. Private DSP deployment | C. Advertisers separated, but sharing a model | D. Model concatenation |
|---|---|---|---|---|
| Currently supported | yes | yes | no | yes |
| Ad selection embedding size | small | small | small | big; grows with # of advertisers, with only a small part relevant for a given advertiser, and the clipping penalty for a given advertiser grows with # of advertisers |
| Do advertisers compete for space in selection results based on noisy embeddings? | yes | no | no | yes |
| Model | single, small | multiple small models | single, small | multiple small models concatenated into a big one; additional challenge: managing the concatenated model as advertisers come and go |
| Do advertisers share throttling and caching? | yes | no | no | yes |

If this is accurate, D does not seem to be an improvement over B. Is this comparison fair, or have I missed something?
