-
Notifications
You must be signed in to change notification settings - Fork 235
A Strategy for Inserting returns in Verification Conditions
F*'s verification condition generation is, for the most part, syntax directed. Every let
yields a bind
in a VC; applications are monadic, yielding bind
s of their own; lifts are inserted when sequentially composing computations in different monads, etc.Although the bind
s and lift
s are determined syntactically, it's less clear when exactly a return
should be inserted. After all, in our formalizations of F*'s type system, the rule of return
looks something like this:
G |- e : Tot t
----------------------------
G |- e : Pure t (return e)
which is clearly not syntax directed.
F*'s default implementation of this rule is in fact (roughly) to insert a return on every Tot value, see:
- here, followed by many other ad hoc insertions of returns, carefully crafted after much experience, but without any overarching principle. e.g., here, here, here, here, here etc.
But, this is very inefficient, from multiple perspectives:
-
Each return introduces an indirection in a VC, as defined by Prims.pure_return, which introduces a quantifier as an equality hypothesis.
-
Each return turns a Tot term into a Pure term (GTot into Ghost), disabling various optimizations that apply when sequentially composing Tot/GTot terms, leading to further bloat.
-
The additional indirections are hard to predict, making it difficult to write tactics to process VCs after the fact, as noticed especially in the context of Vale.
-
The indirections block normalization, causing more proofs to have to be done equationally by the SMT solver.
In what follows, I'll mainly refer to Pure terms, treating Tot terms as a sub-class of Pure terms. What I say applies equally to Ghost and GTot terms.
Pure terms are self-specifying---F* and SMT can reason directly about their definitions. Naming sub-terms of a pure term in a VC is usually not necessary. As such, inserting a return on every Tot term is unnecessary. Instead, let's aim to delay the insertion of a return until
a. the point where a pure sub-term p
of a larger impure term is lifted to the impure effect of its context, since the larger impure term cannot be returned itself. Inserting a return on p
at that point allows reasoning about the impure term in a VC while making use of the definition of its pure sub-term p
.
b. the point where a pure term's computed type and effect are compared against its expected type and effect. A return at that point allows proving that the term has its expected type and effect relying on both on "intrinsic reasoning", using its computed type and effect, as well as "extrinsic reasoning", using its definition.
With this in mind here's how returns are now inserted in F*.
First, rather than the simple return rule above, FStar.TypeChecker.Util.maybe_assume_result_eq_pure_term
implements something that's closer to this:
G |- e : Pure t wp [when should_return e t]
----------------------------------------------------------------------------
G |- e : Pure t (bind wp (fun x -> forall y. y==x /\ y == e ==> post y))
-
Consider term
e
=let x = (e1 <: M1 t1 wp1) in (e2 : Pure t2 wp2)
. IfM1
is notPure
, then the resulting term is notPure
either ande
cannot be returned in its entirety. So, we insert a return on its pure sub-terme2
. See here (check_inner_let) and here (sequencing with;
) -
Consider the application
(e_f <: Pure (x:t1 -> Pure t2 wp2)) (e_x <: M t1 wp1)
, whereM
is notPure
. This is interpreted as the monadic applicationlet f = e_f in let x = e_x in f x
. As in the previous case, although the pure sub-termf x
is part of a larger impure term (because ofe_x
in effectM
). So, we insert a return onf x
. See here (monadic_application) -
Consider
match x with P1 -> (e1 : Pure t1 wp1) | P2 -> (e2 : M t1 wp2)
. Again, the entire match cannot be returned, since it has an impure sub-term. So, we insert a return on the pure branche1
. See here (tc_eqn) and its call-site (bind_cases). -
Consider the term
e
=let rec f x = e_f in (e' : Pure t wp)
. Thelet rec
always binds a lambda term, so it is alwaysTot
(even though, of course,e_f
may itself be effectful). Now, althoughe
itself is aPure
, due to a limitation of the SMT encoding, local let rec terms cannot appear in a VC. (See issue #1362). This means that until the SMT encoding is capable of encoding inner let recs,e
cannot be returned in its entirety. So, we insert a return around its pure sub-terme'
. See here (check_inner_let_rec)
-
When proving that a value
(v : t)
has its expected typet'
, we DO NOT insert a return on the value. Instead, we compute a predicateguard : t -> Type
which is sufficient to establisht <: t' and simply apply
guard v`. See here (value_check_expected_typ and here (check_and_ascribe). -
When checking that a computation
(e : M t wp)
has an expected effectM' t' wp'
, we try to insert a return one
first (i.e., in caseM
isPure
andshould_return
is true etc.) See here (check_expected_effect) -
When subsuming the result type of a computation
(e : M t wp)
to some other typet'
, whereguard: t -> Type
is sufficient to establisht <: t'
, we build the VCM.bind wp (fun x -> lift_Pure_M (Pure.assert (guard x) (Pure.return x)))
. See here (comp_check_expected_typ) and here (weaken_result_typ).
All returns (except the return mentioned in D.3 above) are guarded by a should_return
check. See its implementation here.
should_return (e : M t wp)
is true when all of the following conditions are true:
-
M
is pure (or ghost) - The return type
t
is notunit
orsquash
- The head symbol of
e
is not markedirreducible
-
e
does not contain alet rec
Additionally, in maybe_assume_result_eq_pure_term
, we include a check to ensure that in addition to should_return (e : M t wp)
inserts a return only if wp
doesn't already contain a return.
Aside from the strategy described above, the implementation of bind
contains many special cases, some of which cause also cause a return to be inserted.
The following is an adaptation of this code comment.
Consider sequentially composing let x = (e1 : Pure t1 wp1) in (e2 : M2 t2 wp2)
.
In case should_return (e1 : Pure t1 wp1)
is false, then we'll just build M.bind (lift_Pure_M wp1) (fun x -> lift_M2_M wp2)
.
Otherwise, we aim to build the following VC: M.bind (lift_Pure_M wp1) (x == e1 ==> lift_M2_M (wp2[e1/x]))
. The reason for substituting e1
for x
in wp2
is to allow proofs there to proceed by just computing with e1
.
The additional equality hypothesis x = e1
may seem redundant, but it's not. The post-condition in wp1
or its result type t1
type may carry some meaningful information about e1
. Then, it's important to weaken wp2
with the equality, so that whatever property is proven about the result of wp1
(i.e., x
) is still available in the proof of wp2
.
However, we apply a few optimizations:
a. if wp1
is already a return or a partial return, then it already provides this equality, so there's no need to add it again and we instead generate M.bind (lift_Pure_M wp1) (lift_M2_M (wp2[e1/x])
.
b. if wp1
has a trivial post-condition (e.g., it is a strengthening of a Tot
computation), then the post-condition does not carry any useful information. We have two sub-cases:
i. In case the user option vcgen.optimize_bind_as_seq = without_type
we instead generate M.assert_wp (wp1 (\x. True)) (lift_M2_M (wp2[e1/x]))
ii. Although the post-condition of wp1
does not carry useful information, its result type might. When applying the optimization above, the SMT solver is faced with reconstructing the type of e1
. Usually, it can do this, but in some cases (e.g., if the result type has a complex refinement), then this optimization can actually cause a VC to fail. So, we add an option to recover from this, at the cost of some VC bloat. In case the user option vcgen.optimize_bind_as_seq = with_type
, M.assert_wp (wp1 (\x. True)) (lift_M2_M (wp2[Prims.with_type e1 t1/x]))
where with_type e1 t1
, decorates e1
with its type before substituting. This allows the SMT solver to recover the type of e1
(using a primitive axiom about with_type
), without polluting the VC with an additional equality. Note, specific occurrences of with_type e t
can be normalized away to e
if requested explicitly by a user tactic.
Here is the effect of these optimizations.
module Vcgen
assume val f1 : x:nat -> y:int{y >= x}
assume val f2 : x:nat -> y:int{y >= x}
assume val f3 : x:nat -> y:int{y >= x}
#set-options "--max_fuel 0 --max_ifuel 0 --debug Vcgen --debug_level SMTQuery --print_full_names"
let test123 (x1:int{x1 > 1}) =
let x2 = f1 x1 in
let x3 = f2 x2 in
let x4 = f3 x3 in
assert (x4 >= x1)
(* With --vcgen.optimize_bind_as_seq without_type
Encoding query formula: forall (x1293:(x1294:Prims.int{ x1294 > 1 })).
(*could not prove post-condition*)
(*Subtyping check failed; expected type Prims.nat; got type (x1240:Prims.int{ x1240 > 1 })*)
x1293 > 1 ==> x1293 >= 0 /\
(*Subtyping check failed; expected type Prims.nat; got type (y244:Prims.int{ y244 >= x1158 })*)
Vcgen.f1 x1293 >= x1293 ==> Vcgen.f1 x1293 >= 0 /\
((*Subtyping check failed; expected type Prims.nat; got type (y248:Prims.int{ y248 >= x2182 })*)
Vcgen.f2 (Vcgen.f1 x1293) >= Vcgen.f1 x1293 ==> Vcgen.f2 (Vcgen.f1 x1293) >= 0 /\
(Vcgen.f3 (Vcgen.f2 (Vcgen.f1 x1293)) >= x1293))
--------------------------------------------------------------------------------
Default:
Encoding query formula: forall (x1413:(x1414:Prims.int{ x1414 > 1 })).
(*could not prove post-condition*)
(*Subtyping check failed; expected type Prims.nat; got type (x1240:Prims.int{ x1240 > 1 })*)
x1413 > 1 ==> x1413 >= 0 /\
(forall (any_result415:(x1416:Prims.int{ x1416 > 1 })).
x1413 == any_result415 ==>
(forall (any_result417:(y418:Prims.int{ y418 >= x1413 })).
Vcgen.f1 x1413 == any_result417 ==>
(*Subtyping check failed; expected type Prims.nat; got type (y244:Prims.int{ y244 >= x1158 })*)
Vcgen.f1 x1413 >= x1413 ==> Vcgen.f1 x1413 >= 0 /\
(forall (any_result419:(y420:Prims.int{ y420 >= x1413 })).
Vcgen.f1 x1413 == any_result419 ==>
(forall (any_result421:(y422:Prims.int{ y422 >= Vcgen.f1 x1413 })).
Vcgen.f2 (Vcgen.f1 x1413) == any_result421 ==>
(*Subtyping check failed; expected type Prims.nat; got type (y248:Prims.int{ y248 >= x2182 })*)
Vcgen.f2 (Vcgen.f1 x1413) >= Vcgen.f1 x1413 ==> Vcgen.f2 (Vcgen.f1 x1413) >= 0 /\
(forall (any_result423:(y424:Prims.int{ y424 >= Vcgen.f1 x1413 })).
Vcgen.f2 (Vcgen.f1 x1413) == any_result423 ==>
(forall (any_result425:(y426:Prims.int{ y426 >= Vcgen.f2 (Vcgen.f1 x1413) })).
Vcgen.f3 (Vcgen.f2 (Vcgen.f1 x1413)) == any_result425 ==>
Vcgen.f3 (Vcgen.f2 (Vcgen.f1 x1413)) >= x1413))))))
*)