
Improve the performance of the proofs in the Lean backend #250

Open
sonmarcho opened this issue Jun 18, 2024 · 3 comments
Comments

@sonmarcho
Member

sonmarcho commented Jun 18, 2024

The proofs in the Lean backend are a bit slow for my taste. For instance, until recently, the proofs of the hashmap and the AVL took ~200s each, which is quite a lot. A more reasonable target (according to my intuition) would be less than a minute.

One issue for instance is the simplifier.

The simplifier can be slow at times, especially when using `simp_all` in contexts with big goals. I did some preliminary diagnostics which indicate that it can sometimes perform up to ~10k simplification lemma attempts, which is not reasonable. We should investigate this.
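For reference, here is a sketch of how this kind of diagnostic can be reproduced (the trace class and `diagnostics` option exist in recent Lean versions, but the exact output format may vary):

```lean
-- Print each rewrite simp performs while elaborating the next
-- declaration (the trace can be very large on big goals):
set_option trace.Meta.Tactic.simp.rewrite true in
example (x : Nat) (h : x + 0 = 1) : x = 1 := by
  simp_all

-- Alternatively, `set_option diagnostics true` reports counters,
-- including the number of simp lemma applications per declaration.
```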

In parallel, this seems to indicate that `simp_all` is maybe not the best way of "blasting" a proof: for instance, I often use it when what I actually need is some sort of congruence closure. Maybe we should investigate other proof styles as a consequence.
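To illustrate the congruence-closure point with a toy example (assuming Mathlib is available; `cc` is Mathlib's congruence-closure tactic): goals like the one below are really congruence reasoning, and a dedicated tactic can close them without any rewriting search.

```lean
import Mathlib.Tactic

-- `simp_all` would close this by rewriting with the hypotheses,
-- but it is fundamentally a congruence-closure problem:
-- from a = b we get f a = f b, and f a = a = b, hence f b = b.
example (f : Nat → Nat) (a b : Nat) (h1 : a = b) (h2 : f a = a) :
    f b = b := by
  cc
```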

@sonmarcho
Member Author

I made some experiments with the proofs of the hashmap and the AVL. Something I noticed is that by default `maxDischargeDepth` is 2, which is actually a lot and should not be necessary in most situations. I tried using a `maxDischargeDepth` of 1 and was able to do so in all the proofs of the AVL, and in 109 out of 129 calls to `simp_all` for the hashmap (without modifying the proofs). As a consequence, the verification time went from ~200s in both cases to ~135s for the hashmap and ~160s for the AVL, which is a big improvement.
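For concreteness, this is the kind of call I mean (`maxDischargeDepth` is a field of Lean's `Simp.Config`, with default value 2):

```lean
-- Toy example: a discharge depth of 1 suffices for most goals, and
-- avoids deep recursive calls to the discharger on side conditions.
example (x y : Nat) (h : x + 0 = y) : x = y := by
  simp_all (config := { maxDischargeDepth := 1 })
```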

Generally speaking, I believe we need a general tactic to simplify the context, and `simp_all` may not be what we want. For instance, if I write `simp_all [alloc.vec.Vec.new]` (in the proof of the hashmap), what I really want is probably something more like `simp (config := { maxDischargeDepth := 1 }) [alloc.vec.Vec.new] at *; simp_all (config := { maxDischargeDepth := 1 }) only [LEMMAS_TO_SIMPLIFY_PROPOSITIONAL_LOGIC]`, and the latter is likely a lot more performant than the former.
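A toy version of this two-phase pattern, on a hypothetical goal (the `only` set would be a fixed list of propositional-logic lemmas; here `and_true` stands in for it):

```lean
-- Phase 1: a shallow simp with the user-supplied lemmas, applied
-- everywhere but without using hypotheses as rewrite rules.
-- Phase 2: a cheap simp_all restricted to a fixed simp set.
example (x : Nat) (h : x + 0 = 1 ∧ True) : x = 1 := by
  simp (config := { maxDischargeDepth := 1 }) at *
  simp_all (config := { maxDischargeDepth := 1 }) only [and_true]
```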

I also made some measurements (before doing the modifications above). For instance, the graph below shows the accumulated time spent evaluating the tactics of the hashmap proofs, with the tactic calls ranked from most expensive to least expensive. There are some caveats: the times were printed with `set_option profiler true`, which doesn't only measure the time spent evaluating tactics, and this was measured before the modifications I mentioned above. But one interesting thing to notice is that ~500 tactic calls (< 10% of all the calls) are responsible for ~150s of proof time (> 66% of the time spent), so there are probably a few spots we can investigate to save a lot of verification time.

[Graph: accumulated tactic evaluation time, tactic calls ranked from most to least expensive]

@sonmarcho sonmarcho changed the title Investigate the performance of simp Improve the performance of the profs in the Lean backend Aug 20, 2024
@sonmarcho
Member Author

A tactic which is also worth investigating is `scalar_tac`. An obvious place to look is the case disjunctions it performs. For instance, whenever it sees an instance of `Scalar.toNat x`, it introduces the disjunction `(Scalar.toNat x : Int) = x.val ∨ (Scalar.toNat x : Int) = -x.val` (if we convert `x` to a nat and lift this to an integer, then it is equal to `x.val` or `-x.val`), then does a case disjunction on it (or rather, lets `omega` do the case disjunction). This can be expensive, and there are places where it is not necessary. For instance, if `x` is an unsigned scalar (`Usize`, `U32`, etc.), then `x.val` is always non-negative and we can eliminate one of the cases.
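To sketch the phenomenon on plain `Int.toNat` (not `scalar_tac`'s exact encoding; `Int.toNat_of_nonneg` is a standard lemma, and recent versions of `omega` understand `Int.toNat` directly):

```lean
-- General case: (x.toNat : Int) is x when 0 ≤ x, and 0 otherwise,
-- so downstream tactics must case-split on the sign of x.
example (x : Int) : (x.toNat : Int) = x ∨ (x.toNat : Int) = 0 := by
  omega

-- When non-negativity is known (e.g. for unsigned scalars),
-- the split disappears: there is a single equality to use.
example (x : Int) (h : 0 ≤ x) : (x.toNat : Int) = x :=
  Int.toNat_of_nonneg h
```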

@sonmarcho sonmarcho changed the title Improve the performance of the profs in the Lean backend Improve the performance of the proofs in the Lean backend Aug 20, 2024
@sonmarcho
Member Author

Another big issue is theorems which use `termination_by` and `decreasing_by` clauses: after the proof is done, Lean spends an insane amount of time on the `decreasing_by` clause. This is a general Lean issue (not specific to Aeneas), and my intuition is that this happens because Lean has to re-encode the whole theorem (and its proof) to use the proof of termination, so in the end it must do the proof twice. It is worth investigating how we can improve this, either by pushing changes upstream to the Lean developers, or by developing strategies to minimize the overhead.
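For reference, a minimal shape of the pattern in question (a toy definition, using the `termination_by`/`decreasing_by` syntax of recent Lean versions; the overhead described above shows up when the body and its proofs are large):

```lean
-- After elaborating the body, Lean must justify the recursive call
-- against the measure given by `termination_by`; on big definitions
-- this extra elaboration work is where the time goes.
def countdown (n : Nat) : Nat :=
  if h : n = 0 then 0
  else countdown (n - 1) + 1
termination_by n
decreasing_by omega  -- goal: n - 1 < n, with h : ¬ n = 0 in context
```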
