feat: make labels erasable #54

TWal · 2024-09-01T22:19:19Z

This PR adds the [@@erasable] to the label type…
… and updates the rest of the code to reflect this change :)

Because labels are erasable, they can't appear in the Rand constructor of bytes on which we want decidable equality. As a consequence, we now have a challenge to define the get_label function on the Rand constructor that previously contained the label, and that now only contain the time at which the randomness was generated (which correspond to a RandGen event in the trace, that do contain the label).

There are two approaches to this problem:

in feat: make labels erasable #52, we keep the arguments of get_label as-is, but labels become more complicated with indirections (for labels of Rand)
in feat: make labels erasable #54 (this PR), we add a trace to the arguments of get_label and keep labels as-is (without indirections)

Now that get_label depends on the trace, we need new properties such as "the output of get_label dosen't change when the trace grows".
Here we get to a technical difficulty: what does get_label answer on a Rand constructor that do not correspond to a RandGen event in the trace? For example, if the time of random generation is in the future (with respect to that trace)? Then surely the output of get_label may change when the random bytestring is actually generated.

To deal with this scenario, we define a new well-formed check (bytes_well_formed) that checks whether all Rands inside a bytes correspond to a RandGen event in the trace. This allows us to define the theorem get_label_later stating

val get_label_later:
  {|crypto_usages|} ->
  tr1:trace -> tr2:trace ->
  b:bytes ->
  Lemma
  (requires
    bytes_well_formed tr1 b /\
    tr1 <$ tr2
  )
  (ensures get_label tr1 b == get_label tr2 b)

Now the question: how do we prove the precondition bytes_well_formed tr1 b? In most cases, we can rely on the theorem

val bytes_invariant_implies_well_formed:
  {|crypto_invariants|} ->
  tr:trace -> msg:bytes ->
  Lemma
  (requires bytes_invariant tr msg)
  (ensures bytes_well_formed tr msg)

The only place where we can't rely on bytes_invariant is when we define the bytes_invariant, namely in the crypto_predicates, more precisely in the pred_later lemmas. In that case, all cryptographic functions come with lemmas stating how they preserved the well-formedness, although I think in most scenarios we will just need the ones on split and concat that will be handled by Comparse (as it is done in the NSL example).

Although it looks like a big change, the examples were easy to update and proofs mostly didn't break.

Digression on `has_label`

I spent quite some time thinking about a third option, which is using a predicate val has_label: trace -> bytes -> label -> prop (which would be monotonic in the trace)
It looks convenient because we could say that a malformed bytes has no labels.

However, the difficulty is the following: how do we express that the plaintext must flow to the key (e.g. in AEAD)?
Before, we wrote

(get_label msg) `can_flow tr` (get_label key)

Now we could write

forall msg_label key_label.
  has_label tr msg msg_label /\
  has_label tr key key_label ==>
  msg_label `can_flow tr` key_label

However if key is malformed, the precondition is never realized (because key has no label) so this would be true,
but that might become false when the trace grows (and key becomes well-formed).

So instead we could write

exists msg_label key_label.
  has_label tr msg msg_label /\
  has_label tr key key_label /\
  msg_label `can_flow tr` key_label

But we need to prove that key has a label, which brings back the well-formedness problem.

Hence it seems that we have to deal with well-formedness when using has_label too, so I don't think it's worth doing (as it would be strictly more annoying to use than what is in this PR).

TWal · 2024-09-02T07:01:26Z

(asking @qaphla for review since we discussed that in Stuttgart)

qaphla · 2024-09-02T18:14:46Z

Some preliminary thoughts, before I go through it more deeply:

I like the general idea (as we discussed), and this seems like an elegant way to do it.
get_label returning public for ill-formed bytes seems a bit suspicious (but I can't see any concrete example where it causes problems right now) --- my feeling is that we should have something like weak_get_label : trace -> bytes -> option label and then get_label : (tr:trace) -> (b:bytes{bytes_well_formed tr b}) -> label
Relatedly, it seems like bytes_well_formed (and some other related predicates that depend on the trace) might be better presented as a kind of three-valued predicate, being true, false, or indeterminate. In particular, for bytes_well_formed, this lets us distinguish between bytes that are definitely not well formed with a given trace (i.e., they are incompatible with that trace), and ones that may become well-formed with an extension of the trace (compatible but not necessarily so). I'm again not sure right now of an immediate application of this, but with the (partial) ordering where indeterminate is less than both true and false, this is still monotonic with respect to trace growth. The has_label predicate also has this property, if I'm not mistaken. In practice, I think for bytes, we only really want to work with ones that are well-formed with respect to the trace at the moment, but I could envision some kinds of predicates with this property being useful if we want to talk about future events. Of course, the problem is that F* probably doesn't have very nice default handling of this kind of three-valued logic... Then again, it's essentially the same as option bool, so maybe it would work with the monad handling.
Passing thought: It seems like most of this could also be applied to usages in a similar manner. Do you see any obvious obstacle to that?

I'll go over it more carefully soon, and also see if I come up with any better comments on these points, but I think only the second one (get_label returning public for ill-formed cases) needs to be addressed in some form for me to approve this.

TWal · 2024-09-03T09:54:13Z

Thank you for your comments!

About get_label returning public when we don't know the label.
I agree it may seem fishy! We can actually change this to any label, I chose public with the idea that we can't do anything useful with public bytes (e.g. if it is used as an AEAD key, encrypting secrets to it). But on the other hand it means we can send it on the network, so it might also make sense to choose the secret label. But in that case we can encrypt secrets to it.
I don't know what is the choice that looks the least fishy. Anyway, in practice we cannot use it to encryption nor send it on the network, because to do that it needs to satisfy the bytes_invariant which prevents these cases.

About get_label coming with two flavors -- one with pre-condition, the other returning option label.
I agree this would avoid needing to choose a default label (which is currently public), but this would also make the user code more aware of this technical subtlety, for example cryptographic predicates would have to deal with the None case (because they can't prove the precondition). Having a default case do seem more hacky but I think it's more nice to use when doing actual security proofs.

About bytes_invariant being three-valued.
Indeed the combo bytes_invariant + bytes_well_formed could be compressed in a three-valued predicate (that is monotonic in the trace), however there is the subtlety that bytes_invariant needs an instance of {|crypto_invariants|}, whereas bytes_well_formed do not need such things. It's important because bytes_well_formed is used when defining crypto_invariants (e.g. in the pred_later lemmas of crypto_predicates).

About doing the same thing for get_usage.
I guess we could do the same thing at it would work. The benefits would be that usage and bytes wouldn't need to be mutually recursive anymore, and we wouldn't need decidable equality on usage, which could be use for further improvements of usage (although I don't have ideas for that yet).
However I have other ideas for usage, because the design constraints are not the same as with labels, and a has_usage predicate would make sense as we don't need to lift a can_flow relation as we do with labels. Although we could apply the same recipe as the one for labels, there is no technical reason we would need bytes_well_formed for usages.
Giving a few spoilers, I think this has_usage tr msg usg would replace get_usage msg == usg \/ is_publishable tr msg, which I think is the good pre-condition we need in many theorems. But I have not yet tried to see whether this actually works well.

qaphla · 2024-09-03T14:39:51Z

On get_label returning public for ill-formed bytes:
I see a bit better now where the problem is, I think. This feels a little bit like where get_usage returns NoUsage, and maybe the thing to do is to extend labels with something similar (which can neither flow to nor from anything else?). However, that doesn't seem nicely compatible with #55 , unless I'm missing some clever way to design such a predicate.
Maybe the three-valued approach can be used to resolve this in some sufficiently hidden way (e.g. using the typeclass approach of #51 so that instead of explicitly getting labels, we always use less_secret tr, and can internally handle the ill-formed cases), but this might be overengineered.

I think public is in some sense the "least fishy" option we have of the existing labels, and can see now the argument for why it won't cause problems because of bytes_invariant, but would still like to ensure that we keep track of it as a suspicious point.

On three-valued logic:
I hadn't thought of combining both bytes_invariant and bytes_well_formed, but rather using both separately in this three-valued form (and similarly for other trace-indexed predicates like can_flow tr). A subtlety I noticed, though, is that we seem to want different orderings for different predicates --- bytes_well_formed works with indeterminate less than both true and false, and true and false incomparable to each other, but for bytes_invariant or can_flow, or most of the other predicates we work with, we seem to want something closer to the existing stability --- indeterminate < false < true.
I think because of this, it's probably not worth digging too much into right now, but I do think there's interesting potential there if we start to work with more predicates that are similar to bytes_well_formed, or, for that matter, with monotonic functions of the same form (I guess weak_get_label like I proposed earlier would be one such function --- it can go from None to Some x with an increase in trace argument, but can never go the other direction, or from Some x to Some y. My feeling is that this might come up again when we start to look at dynamic labels, but we can see in the future.

On usages:
That sounds good to me --- I do think overall that we should make usages eraseable, much like labels, but I can see how it could be done more simply than labels, simply because it's not tangled in as much of the rest of the core.

Overall:

I approve of this PR --- I'll submit that
If there's no further discussion on get_label, I'll make an issue about it to hold future discussion (so it's not holding up this PR)
I'll probably also make another issue (or something else --- I need to look more at what github offers for discussions) about the three-valued logic or, more generally, monotone functions on the trace, and if we can reuse some of the Recalling a Witness-style reasoning in the same way as they do with stable predicates, to get any useful results.

qaphla

See discussion for more details.

Minor complaint about get_label returning public for ill-formed bytes (to become an issue)
Discussion sparked about three-valued logic/monotone functions for a "weak" version of get_label that avoids the need for a special case for ill-formed bytes.
I look forward to seeing usages get similar treatment.

TWal · 2024-09-04T13:53:03Z

Thanks! We could make an interface file like DY.Core.Bytes.AnyLabel.fsti that would contain

val any_label: label

and use that in get_label for the ill-formed cases. It means that unless we write friend DY.Core.Bytes.AnyLabel, we can't reason with the concrete value of any_label, which makes its actual value less suspicious?
Would that look better to you?

TWal · 2024-09-04T14:02:18Z

Now asking the thoughts of @cwaldm and @fabian-hk: do you think the impact on the proofs is fine? Especially considering this brings other benefits (such as a nice equational theory on labels done in #56 which avoid having to deal with equivalent).

qaphla · 2024-09-04T15:13:41Z

Thanks! We could make an interface file like DY.Core.Bytes.AnyLabel.fsti that would contain
val any_label: label
and use that in get_label for the ill-formed cases. It means that unless we write friend DY.Core.Bytes.AnyLabel, we can't reason with the concrete value of any_label, which makes its actual value less suspicious? Would that look better to you?

I think that makes it sufficiently less suspicious for now (though I do still want to think more about if there's a way to cleanly work with but hide options) --- in particular, because it prevents anyone from unintentionally using that label, regardless of what it is. I'm a bit uncertain if any_label or no_label, or maybe unknown_label is most appropriate, but i lean towards unknown_label at the moment, even though it is viable to put an arbitrary label there (the justification for any_label, presumably?

TWal · 2024-09-05T09:57:10Z

I have done that in the DY.Core.Label.Unknown module.

In #55 I did something a bit fancier, so that reasoning on unknown_label would require reasoning Collatz conjecture, so that there is no temptation to friend DY.Core.Label.Unknown and use its actual value :)

qaphla

Looks good to me.

I think unknown_label should be sufficiently self-documenting in get_label, especially since one can easily follow back to the fsti file for more information.

src/core/DY.Core.Bytes.fst

qaphla

I've looked over the updates, and this (still) looks fine to me, though of course feedback from others would also be welcome.

cwaldm

Looks fine to me as well!

feat: make labels erasable

206598b

TWal requested a review from a team as a code owner September 1, 2024 22:19

TWal mentioned this pull request Sep 1, 2024

feat: generalize labels to be predicates #55

Merged

TWal requested a review from qaphla September 2, 2024 07:01

qaphla previously approved these changes Sep 3, 2024

View reviewed changes

TWal requested review from fabian-hk and cwaldm September 4, 2024 14:00

add unknown label

327ee8f

TWal dismissed qaphla’s stale review via 327ee8f September 5, 2024 09:20

qaphla previously approved these changes Sep 5, 2024

View reviewed changes

Merge branch 'main' into twal/erased_labels_3

63b55e1

TWal dismissed qaphla’s stale review via 63b55e1 September 9, 2024 16:28

qaphla reviewed Sep 16, 2024

View reviewed changes

src/core/DY.Core.Bytes.fst Outdated Show resolved Hide resolved

qaphla previously approved these changes Sep 16, 2024

View reviewed changes

cwaldm previously approved these changes Sep 18, 2024

View reviewed changes

fabian-hk previously approved these changes Sep 18, 2024

View reviewed changes

TWal mentioned this pull request Sep 23, 2024

Document and regroup flags that enable SMT patterns #64

Open

improve user interface of enable_bytes_well_formed_smtpats

52c84d4

TWal dismissed stale reviews from fabian-hk, cwaldm, and qaphla via 52c84d4 September 23, 2024 13:46

TWal merged commit e11a607 into main Sep 23, 2024
1 check passed

TWal deleted the twal/erased_labels_3 branch September 23, 2024 13:56

TWal mentioned this pull request Sep 25, 2024

feat: introduce has_usage to improve bytes invariant #67

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: make labels erasable #54

feat: make labels erasable #54

TWal commented Sep 1, 2024

TWal commented Sep 2, 2024

qaphla commented Sep 2, 2024

TWal commented Sep 3, 2024

qaphla commented Sep 3, 2024

qaphla left a comment

TWal commented Sep 4, 2024

TWal commented Sep 4, 2024

qaphla commented Sep 4, 2024

TWal commented Sep 5, 2024

qaphla left a comment

qaphla left a comment

cwaldm left a comment

feat: make labels erasable #54

feat: make labels erasable #54

Conversation

TWal commented Sep 1, 2024

Digression on has_label

TWal commented Sep 2, 2024

qaphla commented Sep 2, 2024

TWal commented Sep 3, 2024

qaphla commented Sep 3, 2024

qaphla left a comment

Choose a reason for hiding this comment

TWal commented Sep 4, 2024

TWal commented Sep 4, 2024

qaphla commented Sep 4, 2024

TWal commented Sep 5, 2024

qaphla left a comment

Choose a reason for hiding this comment

qaphla left a comment

Choose a reason for hiding this comment

cwaldm left a comment

Choose a reason for hiding this comment

Digression on `has_label`