Large scale separate verification & compilation

This is a summary of a lengthly discussion between Cédric Fournet, (jet-lagged) Tahina Ramananandro, and Jonathan Protzenko (current author).

New: the new behavior has been implemented! See What is verified, what is extracted?

Problem statement

With Project Everest making steady progress, we are now facing the problem of large-scale F* developments, and how to achieve some degree of modularity. This is important for:

performance (parsing and lax-checking a huge dependency graph can make verifying even a standalone file difficult)
collaboration (we want people working on, say, HACL/lib, to be independent from people working on, say, miTLS)
proofs (we want to say that we can verify miTLS without having access to any of the low-level modules from HACL/lib).

The running example is miTLS, which wants to verify against HACL/lib (née low-level/crypto), but really only uses a handful of functions, not the whole pyramid of modules.

A look at the dependency graph

Consider the following dependency graph. Arrows are found by the dependency discovery mechanism, starting from miTLS as the root of the traversal.


`-,
   \                      Crypto.fsti  <---   miTLS (contains "Crypto.foo") 
C1  `-,  <--,                 ^                      |
       \     \                |                      | !!dubious arrow!!
  C2    | <--'----,-----  Crypto.fst   <-------------'        
        |    ____,|
  C3   /  <-'    /
C4  ,-'  <------'
   /
  |                              (a dependency graph -- artist's view)
,-'

The gist of the proposal is to add an option to control whether the "dubious arrow" exists or not in the dependency graph. In other words, what it means to "have a dependency on Crypto".

The situation is as follows:

there are series of cryptographic modules C1 .. C4;
miTLS only needs a handful of functions exported by Crypto.fsti;
Crypto.fsti does NOT depend on any of the crypto modules C1 ...C4; rather, it uses a combination of abstract types and type abbreviations to expose a standalone cryptographic interface
the implementation of Crypto.fsti, that is, Crypto.fst, provides concrete functions and contains a myriad of dependencies against each of the crypto modules C1 through C4.

This means that we can split verification, extraction and build in two steps:

verify C1 .. C4, Crypto.fst and Crypto.fsti together
verify miTLS against Crypto.fsti ONLY.

The latter step is a net improvement, because we operate in a special mode where:

the dependency graph is cut off (the dubious arrow from miTLS to Crypto.fst is gone), meaning that we ignore all the C1 .. C4 files
we have a bulletproof argument that miTLS won't traverse the abstraction to use some unsafe low-level crypto primitive (F* doesn't even read the files)
people can work on C1 .. C4 separately.

Achieving modularity right now

Right now, when faced with both Crypto.fst and Crypto.fsti, F* generates a dependency on both files. We introduce a new option, --partial, that controls whether the arrow from miTLS to Crypto.fst exists in the dependency graph or not.

Verification

We have:

--verify_all: this implies NOT(--partial), because we want to verify all the project
--verify_module Mi: this implies --partial, because we're only interested in specific modules Mi; the dependency graph may have several connected components, but each Mi MUST be in the graph (JP: check that this is indeed a hard error)
default behavior: verify modules passed on the command-line only; rely on the user-provided --partial to determine whether mentioning "Crypto" generates a dependency on Crypto.fsti only OR Crypto.fsti and Crypto.fst

Workflow:

Verifying the Crypto part means invoking fstar Crypto.fst
Verifying miTLS against Crypto means invoking fstar miTLSMain.fst --partial, which stops at the Crypto.fsti boundary.

Implementation note: this is a two-liner because the code for --partial is still there with a note saying that this is the --stratified behavior.

Corner cases:

fstar --partial foo.fsti foo.fst (still verifies both)
fstar --partial --verify_module Foo foo.fsti foo.fst (must verify the interleaving of the two)
fstar --interactive (implies --partial)
fstar foo.fsti (does NOT pick up foo.fst because the arrow is from the fst to the fsti)

Extraction

We have:

`--no_extract``
--codegen (OCaml|FSharp|Kremlin)

Workflow:

fstar --codegen OCaml Crypto.fst for the crypto library
fstar --partial --codegen OCaml miTLSMain.fst for miTLS

Since the latter invocation "sees" only Crypto.fsti, and since having only an fsti in the dependency graph means "realized module, don't extract" (see F★ Interfaces (simple version) for a reminder), this means that the latter invocation of F* generates ML modules that assume a properly-implemented Crypto.ml, which is great because we produced it using the first invocation of F*.

Build

We can even go further, and build a crypto.cmxa (OCaml) and crypto.a (Kremlin) for the crypto library, and never rebuild it.

Notes

--partial controls what it means to "take a dependency" on Crypto. The dependency arrow from an fst to an fsti always exists, meaning that fstar foo.fst picks up foo.fsti (and interleaves) while fstar foo.fsti does NOT pick up foo.fst and verifies a standalone interface.

Implementation note (JP): this is a known incorrect behavior that has had a corresponding TODO item on my whiteboard for several months now.

Going further: F* bundles

This is all fine, but:

no one enforces that Crypto.fsti is up-to-date with regards to the Crypto modules. That is, miTLS may keep verifying against a modified Crypto.fsti that no one verified against Crypto.fst
proliferation of files

Cédric suggests (for the long-term), a fstx package that contains:

Crypto.fsti
crypto.cmxa
crypto.a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly