Skip to content

Large scale separate verification & compilation

Jonathan Protzenko edited this page Jan 4, 2017 · 14 revisions

This is a summary of a lengthly discussion between Cédric Fournet, (jet-lagged) Tahina Ramananandro, and Jonathan Protzenko (current author).

Problem statement

With Project Everest making steady progress, we are now facing the problem of large-scale F* developments, and how to achieve some degree of modularity. This is important for:

  • performance (parsing and lax-checking a huge dependency graph can make verifying even a standalone file difficult)
  • collaboration (we want people working on, say, HACL/lib, to be independent from people working on, say, miTLS)
  • proofs (we want to say that we can verify miTLS without having access to any of the low-level modules from HACL/lib).

The running example is miTLS, which wants to verify against HACL/lib (née low-level/crypto), but really only uses a handful of functions, not the whole pyramid of modules.

A look at the dependency graph

Ideally, we want the following dependency graph:


`-,
   \                      Crypto.fsti  <---   miTLS (contains "Crypto.foo") 
C1  `-,  <--,                 ^                      |
       \     \                |                      |
  C2    | <--'----,-----  Crypto.fst   <-------------'        
        |    ____,|
  C3   /  <-'    /
C4  ,-'  <------'
   /
  |                              (a dependency graph -- artist's view)
,-'
  • there are series of cryptographic modules C1 .. C4;
  • miTLS only needs a handful of functions exported by Crypto.fsti;
  • Crypto.fsti does NOT depend on any of the crypto modules C1 ...C4; rather, it uses a combination of abstract types and type abbreviations to expose a standalone cryptographic interface
  • the implementation of Crypto.fsti, that is, Crypto.fst, provides concrete functions and contains a myriad of dependencies against each of the crypto modules C1 through C4.

This means that we can split verification, extraction and build in two steps:

  • verify C1 .. C4, Crypto.fst and Crypto.fsti together
  • verify miTLS against Crypto.fsti ONLY.

The latter step is a net improvement, because we operate in a special mode where:

  • the dependency graph is cut off (the arrow from miTLS to Crypto.fst is gone), meaning that we ignore all the C1 .. C4 files
  • we have a bulletproof argument that miTLS won't traverse the abstraction to use some unsafe low-level crypto primitive (F* doesn't even read the files)
  • people can work on C1 .. C4 separately.

Achieving modularity right now

Right now, when faced with both Crypto.fst and Crypto.fsti, F* generates a dependency on both files. We introduce a new option, --partial, that controls whether the arrow from miTLS to Crypto.fst exists in the dependency graph or not.

Verification

We have:

  • --verify_all: this implies NOT(--partial), because we want to verify all the project
  • --verify_module Mi: this implies --partial, because we're only interested in specific modules Mi; the dependency graph may have several connected components, but each Mi MUST be in the graph (JP: check that this is indeed a hard error)
  • default behavior: verify modules passed on the command-line only; rely on the user-provided --partial to determine whether to follow dependency arrows from interfaces to implementations

Workflow:

  • Verifying the Crypto part means invoking fstar Crypto.fst, which discovers the entire dependency graph.
  • Verifying miTLS against Crypto means invoking fstar miTLSMain.fst --partial, which stops at the Crypto.fsti boundary.

Implementation note: this is a two-liner because the code for --partial is still there with a note saying that this is the --stratified behavior.

Corner cases:

  • fstar --partial foo.fsti foo.fst (still verifies both)
  • fstar --partial --verify_module Foo foo.fsti foo.fst (must verify the interleaving of the two)
  • fstar --interactive (implies --partial)
  • fstar foo.fsti (does NOT pick up foo.fst because the arrow is from the fst to the fsti)

Extraction

We have:

  • `--no_extract``
  • --codegen (OCaml|FSharp|Kremlin)

Workflow:

  • fstar --codegen OCaml Crypto.fst for the crypto library
  • fstar --partial --codegen OCaml miTLSMain.fst for miTLS

Since the latter invocation "sees" only Crypto.fsti, and since having only an fsti in the dependency graph means "realized module, don't extract" (see F★ Interfaces (simple version) for a reminder), this means that the latter invocation of F* generates ML modules that assume a properly-implemented Crypto.ml, which is great because we produced it using the first invocation of F*.

Build

We can even go further, and build a crypto.cmxa (OCaml) and crypto.a (Kremlin) for the crypto library, and never rebuild it.

Going further: F* bundles

This is all fine, but:

  • no one enforces that Crypto.fsti is up-to-date with regards to the Crypto modules. That is, miTLS may keep verifying against a modified Crypto.fsti that no one verified against Crypto.fst
  • proliferation of files

Cédric suggests (for the long-term), a fstx package that contains:

  • Crypto.fsti
  • crypto.cmxa
  • crypto.a
Clone this wiki locally