[Draft] Confidentiality Analysis #10
Conversation
> - Type system based on the built-in tensor types and their sensitivity property.
> - An error is raised if a plaintext tensor is ever found on a player that is *not* in its sensitivity set; this can be checked at compile time and, optionally, at runtime.
> - Subtyping allows for implicitly restricting sensitivity by removing players from the set: `T(S) <: T'(S')` if `S'` is a subset of `S`.
> - `tfe.analysis.broaden` must be used to broaden sensitivity by adding players to the set: `broaden_S(x) : T(S union S')` when `x: T(S')`; this makes it syntactically clear to the user where extra attention must be paid; a no-op at runtime, used only by the type system, similar to type hints.
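To make the proposal concrete, here is a minimal sketch of how these checks might behave; `Player`, `PlaintextTensor`, `check_placement`, `restrict`, and `broaden` are illustrative stand-ins, not the actual `tfe.analysis` API:

```python
# Illustrative sketch only; these names are hypothetical, not the tfe API.

class Player:
    def __init__(self, name):
        self.name = name


class PlaintextTensor:
    def __init__(self, value, sensitivity):
        self.value = value
        # the set of players allowed to see this value
        self.sensitivity = frozenset(sensitivity)


def check_placement(tensor, player):
    # an error is raised if a plaintext tensor is ever found on a player
    # that is not in its sensitivity set
    if player not in tensor.sensitivity:
        raise RuntimeError(f"{player.name} is not in the sensitivity set")


def restrict(tensor, players):
    # the subtyping direction: removing players (S' a subset of S) is safe,
    # so it may happen implicitly
    assert frozenset(players) <= tensor.sensitivity
    return PlaintextTensor(tensor.value, players)


def broaden(tensor, extra_players):
    # adding players must be explicit: broaden_S(x) : T(S union S');
    # a no-op on the value itself, only the annotation changes
    return PlaintextTensor(tensor.value, tensor.sensitivity | frozenset(extra_players))
```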
I do see some benefits of this level of transparency, but I'm not sure this dynamic casting of the sensitivity fits my mental model. Here's how I think of it:
- A tensor (piece of data) comes with a sensitivity set, i.e. that's a static property of the object, & not the class <-- maybe this is the crux, and there are problems you've foreseen with the alternative
- When that tensor is consumed by some computation, the resulting output tensor has some new sensitivity set
- Computations can be categorized by how the output differs from the sensitivity set of a parent
- Arbitrary `broaden`ing is unnecessary because transformations of sensitivity sets are implicitly defined by each node (i.e. operation) in the computation graph. This is not to forbid the user from doing so, but just to discourage it.
Put another way, should the sensitivity set be a property of each tensor type, or each tensor?
> A tensor (piece of data) comes with a sensitivity set, i.e. that's a static property of the object, & not the class <-- maybe this is the crux, and there are problems you've foreseen with the alternative
Maybe we're saying the same thing: I imagine that each tensor instance (and not each tensor type) has its own sensitivity, i.e. `sensitivity` is an instance member, and depends e.g. on where the tensor was created.
> When that tensor is consumed by some computation, the resulting output tensor has some new sensitivity set
I was thinking that most operations do/should not change sensitivity. For high-level functionalities (such as secure aggregation), the broadening could be an internal step that doesn't require any additional broadening by the user of those functionalities.
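For instance, the internal step could look roughly like this; a hypothetical sketch building on the names from the thread above, with the policy baked into the functionality:

```python
# Hypothetical sketch: the policy "aggregation is enough to release" is
# baked into the functionality, so its callers never broaden themselves.
def secure_aggregation(gradients, model_owner):
    # conservative default: the raw sum is at least as sensitive as every
    # parent, i.e. visible only to players allowed to see all of them
    summed = PlaintextTensor(
        sum(g.value for g in gradients),
        frozenset.intersection(*(g.sensitivity for g in gradients)),
    )
    # the single, internal broadening step: release the aggregate
    # (and only the aggregate) to the model owner
    return broaden(summed, {model_owner})
```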
> Computations can be categorized by how the output differs from the sensitivity set of a parent
Any more thoughts on this?
> Arbitrary `broaden`ing is unnecessary because transformations of sensitivity sets are implicitly defined by each node (i.e. operation) in the computation graph. This is not to forbid the user from doing so, but just to discourage it.
Arbitrary broadening is an important part of the policy, e.g. expressing that aggregation is enough to release otherwise sensitive values. As mentioned above, this policy can be baked into high-level functionalities, but for general computations I don't see how we can know upfront what policy the user wants (besides the default of copying).
> Put another way, should the sensitivity set be a property of each tensor type, or each tensor?
My thoughts are each tensor. Are we saying the same thing here?
> Are we saying the same thing here?
I believe so -- I think the mention of "subtyping" was what confused me; can you clarify what you mean by that?
> Any more thoughts on this?
I've just noticed that in e.g. differential privacy (DP) there is an allowable query set that maintains the DP bound, and I suspect this might mean that operations can be similarly categorized for the kind of sensitivity we're considering here (although perhaps, as you say, that doesn't account for all the operations we'd want to be able to check).
> As mentioned above, this policy can be baked into high-level functionalities, but for general computations I don't see how we can know upfront what policy the user wants (besides the default of copying).
Where is this? I'm not sure I see it in the doc -- are you referring to the examples below?
> For secure aggregation for federated learning we obtain:
>
> 0) Model weights with type `PlaintextTensor({mo})`.
Here's an example where my understanding was different, related to the above. Any tensor `x` could equivalently be stated as having type `PlaintextTensor(U)`, where `U` is the union of `S_t` over every time `t` in the tensor's lifespan and `S_t` is the sensitivity set of `x` at time `t`. In this case, that means the model weights here would be instantiated with sensitivity `None` (which matches my intuition about the security of the weights in the secure aggregation use case).
I'm sure there are benefits to the approach you specify here over the one I'm describing -- what did you have in mind?
> Any tensor `x` could equivalently be stated as having type `PlaintextTensor(U)`, where `U` is the union of `S_t` over every time `t` in the tensor's lifespan and `S_t` is the sensitivity set of `x` at time `t`.
Something like this will happen at runtime when the policy is being checked, but my thoughts were that we need something to check against, i.e. a way to express expectations.
> In this case, that means the model weights here would be instantiated with sensitivity `None` (which matches my intuition about the security of the weights in the secure aggregation use case).
Your intuition is that it is safe to share the weights?
The idea is that when a player creates a tensor we start out by assuming that it is a very sensitive value; if that is not the case then it needs to be specified one way or another.
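For example, in terms of the earlier sketch, creation could default to the tightest possible set (again with hypothetical names):

```python
def constant(value, creator):
    # hypothetical default: a freshly created tensor is visible only to
    # its creator until explicitly restricted or broadened
    return PlaintextTensor(value, {creator})
```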
> Something like this will happen at runtime when the policy is being checked, but my thoughts were that we need something to check against, i.e. a way to express expectations.
Agreed. I was expecting that instantiating a tensor `x` with a sensitivity set `S` would enforce at runtime that the value `x` (or any of its children in the computation graph) would not be broadened beyond `S`. There might be an exception when a specific operation releases this requirement, e.g. when secure aggregation happens we can release the tensors that result, even though some of the parents in the graph might have a stricter sensitivity set. Specifically, the training data tensors & all children (including the local gradients) would have the stricter sensitivity set.
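One possible shape for that runtime check, continuing the hypothetical sketch from the earlier threads (the intersection rule here is just one plausible default for multi-input ops):

```python
# Hypothetical kernel wrapper enforcing the policy at runtime.
def apply_op(op, inputs, release_to=None):
    out_value = op(*(t.value for t in inputs))
    # ordinary ops never broaden: the output is visible only to players
    # allowed to see all of the parents
    out_sensitivity = frozenset.intersection(*(t.sensitivity for t in inputs))
    out = PlaintextTensor(out_value, out_sensitivity)
    if release_to is not None:
        # only ops explicitly marked as releasing (e.g. secure aggregation)
        # may broaden their output beyond the parents' sensitivity sets
        out = broaden(out, release_to)
    return out
```

Under this rule the training data and local gradients keep their strict sets automatically, and the aggregation op is the single place where `release_to` broadens the result.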
> Your intuition is that it is safe to share the weights?
In the secure aggregation example, it seems necessary for the policy to broaden the sensitivity of the weights to at least include the data owners/clients.
> The idea is that when a player creates a tensor we start out by assuming that it is a very sensitive value; if that is not the case then it needs to be specified one way or another.
This makes sense to me; it's a good default in the absence of what I'm describing here and in the other thread. I suppose I'd assumed the existence of that API in my comment above, so will focus attention on that thread.
> ## Detailed Design
>
> ## Questions and Discussion Topics
This seems to be defined over abstract notions of `Plaintext` & `Encrypted` -- does this mean that sensitivity would apply to e.g. an `AdditivelySharedTensor` as well as the component shares inside the `AdditivelySharedTensor`? Or would it just be at the higher level?
Concrete encrypted tensors would inherit sensitivity from the abstract `EncryptedTensor`. I did not imagine that component/backing tensors would have their own sensitivity, although they would probably have their own placement.
In this case, what qualifies these concrete encrypted tensors as being in violation of their sensitivity set? It seems like it would have to be semantically different from what it means for plaintext tensors. Encrypted might be something like "is never decrypted by a player outside of the sensitivity set" vs. plaintext might be something like "is never possessed by a player outside the sensitivity set" -- is this correct and intentional?
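If that reading is right, the two checks might fire at different events, e.g. (hypothetical, extending the earlier sketch):

```python
# Hypothetical: possession is checked for plaintext, decryption for encrypted.

class EncryptedTensor:
    def __init__(self, shares, sensitivity):
        self.shares = shares  # backing/component tensors, e.g. additive shares
        self.sensitivity = frozenset(sensitivity)


def move_to(tensor, player):
    # plaintext semantics: mere possession is the checked event
    if isinstance(tensor, PlaintextTensor) and player not in tensor.sensitivity:
        raise RuntimeError(f"{player.name} may not possess this plaintext tensor")
    return tensor  # shares of an EncryptedTensor may be placed freely


def decrypt(tensor, player):
    # encrypted semantics: decryption is the checked event
    if player not in tensor.sensitivity:
        raise RuntimeError(f"{player.name} may not decrypt this tensor")
    # additive reconstruction, just for illustration
    return PlaintextTensor(sum(tensor.shares), tensor.sensitivity)
```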
Also, I think this accounts for the mismatch I describe above -- I was only thinking about sensitivity in the context of the plaintext description "is never possessed by a player outside the sensitivity set" and was thinking that the backing/component tensors would have their own sensitivity, in which case passing them through specific kernels might have better-defined effects on sensitivity sets.