Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an optional Soufflé backend #179

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

ecstatic-morse
Copy link

@ecstatic-morse ecstatic-morse commented Aug 19, 2021

Allows polonius to execute Datalog programs directly using Soufflé.

In the long term, that means:

  • We can test that the results of our datafrog computations are correct, even intermediate ones like subset.
  • We can prototype new ideas more quickly. No need to do stratification or re-indexing by hand before seeing results.

This PR binds to Soufflé's C++ interface using cxx. As a result, we can initialize relations and extract results entirely in memory; no need to write intermediate results to disk. If I were to start from scratch, I might choose generate Soufflé binaries instead of libraries and serialize everything on tmpfs. It took a lot of work to get cxx, Rust, and the generated C++ code to play nicely.

This adds an Engine enum that can be used to select between the 🐸 and 🍮 (no "souffle" emoji unfortunately) backends. Algorithm variants that are implemented in 🍮 take an Engine field now, which can be changed to 🍮 by adding the -Souffle suffix to the -A command line argument or env variable. 🍮 is gated behind a feature flag (polonius-souffle) and is off by default, since it requires a working souffle installation. The frontend types will still be available when the feature is disabled, but trying to compute using them will cause a runtime panic. Both configurations (with and without the feature) are tested on CI.

We do comparison tests on the output of the Souffle versions of DatafrogOpt and Naive. LocationInsensitive is a no-brainer, I just haven't copied those rules over yet.

This is not the cleanest integration, and I wish I could share more code for dumping intermediate facts, checking results, etc. I think it's okay for an MVP though.

@ecstatic-morse ecstatic-morse force-pushed the polonius-souffle branch 2 times, most recently from b8d1f90 to 3f5f3d7 Compare September 24, 2021 02:54
...instead of bare indexes. This makes some macros easier to write.
@ecstatic-morse ecstatic-morse force-pushed the polonius-souffle branch 6 times, most recently from be46dbd to dfbf34b Compare September 24, 2021 17:14
@ecstatic-morse ecstatic-morse marked this pull request as ready for review September 24, 2021 17:36
@ecstatic-morse
Copy link
Author

ecstatic-morse commented Sep 24, 2021

@lqd @nikomatsakis Ready for review 😟.

It looks like there's a bug in the optimized variant of the datalog rules. I'll try to figure it out.

@ecstatic-morse ecstatic-morse force-pushed the polonius-souffle branch 3 times, most recently from a029ed5 to adf933c Compare September 24, 2021 17:47
@ecstatic-morse
Copy link
Author

ecstatic-morse commented Sep 29, 2021

It's starting to seem like there's a typo or a transcription error in the optimized rules. Both 🍮 and 🐸 compute the same number of subset facts for vec_push_ref::foo1, but don't agree on the other intermediate relations, specifically dying_can_reach.

Datafrog
# live_to_dying_regions
# 1
"\'_#10r"	"\'_#16r"	"Mid(bb11[5])"	"Start(bb12[0])"
# dying_region_requires
# 11
"\'_#6r"	"Mid(bb6[3])"	"Start(bb6[4])"	"bw0"
"\'_#7r"	"Mid(bb11[2])"	"Start(bb11[3])"	"bw1"
"\'_#10r"	"Mid(bb12[4])"	"Start(bb8[0])"	"bw0"
"\'_#14r"	"Mid(bb6[7])"	"Start(bb8[0])"	"bw0"
"\'_#14r"	"Mid(bb7[1])"	"Start(bb10[0])"	"bw0"
"\'_#14r"	"Mid(bb9[0])"	"Start(bb10[0])"	"bw0"
"\'_#14r"	"Mid(bb11[4])"	"Start(bb11[5])"	"bw0"
"\'_#15r"	"Mid(bb11[5])"	"Start(bb8[0])"	"bw1"
"\'_#15r"	"Mid(bb11[5])"	"Start(bb12[0])"	"bw1"
"\'_#17r"	"Mid(bb11[5])"	"Start(bb8[0])"	"bw0"
"\'_#17r"	"Mid(bb11[5])"	"Start(bb12[0])"	"bw0"
# dying_can_reach
# 17
"\'_#17r"	"\'_#10r"	"Mid(bb11[5])"	"Start(bb8[0])"
"\'_#16r"	"\'_#10r"	"Mid(bb11[5])"	"Start(bb12[0])"
"\'_#17r"	"\'_#10r"	"Mid(bb11[5])"	"Start(bb12[0])"
"\'_#6r"	"\'_#14r"	"Mid(bb6[3])"	"Start(bb6[4])"
"\'_#7r"	"\'_#15r"	"Mid(bb11[2])"	"Start(bb11[3])"
"\'_#17r"	"\'_#16r"	"Mid(bb11[5])"	"Start(bb8[0])"
"\'_#16r"	"\'_#16r"	"Mid(bb11[5])"	"Start(bb12[0])"
"\'_#17r"	"\'_#16r"	"Mid(bb11[5])"	"Start(bb12[0])"
"\'_#14r"	"\'_#17r"	"Mid(bb11[4])"	"Start(bb11[5])"
"\'_#17r"	"\'_#8r"	"Mid(bb11[5])"	"Start(bb8[0])"
"\'_#16r"	"\'_#8r"	"Mid(bb11[5])"	"Start(bb12[0])"
"\'_#17r"	"\'_#8r"	"Mid(bb11[5])"	"Start(bb12[0])"
"\'_#15r"	"\'_#22r"	"Mid(bb11[5])"	"Start(bb8[0])"
"\'_#17r"	"\'_#22r"	"Mid(bb11[5])"	"Start(bb8[0])"
"\'_#15r"	"\'_#22r"	"Mid(bb11[5])"	"Start(bb12[0])"
"\'_#16r"	"\'_#22r"	"Mid(bb11[5])"	"Start(bb12[0])"
"\'_#17r"	"\'_#22r"	"Mid(bb11[5])"	"Start(bb12[0])"
# dying_can_reach_live
# 5
"\'_#6r"	"\'_#14r"	"Mid(bb6[3])"	"Start(bb6[4])"
"\'_#7r"	"\'_#15r"	"Mid(bb11[2])"	"Start(bb11[3])"
"\'_#14r"	"\'_#17r"	"Mid(bb11[4])"	"Start(bb11[5])"
"\'_#16r"	"\'_#10r"	"Mid(bb11[5])"	"Start(bb12[0])"
"\'_#17r"	"\'_#10r"	"Mid(bb11[5])"	"Start(bb12[0])"
# dead_borrow_can_reach_root
# 2
"\'_#6r"	"Mid(bb6[3])"	"bw0"
"\'_#7r"	"Mid(bb11[2])"	"bw1"
# dead_borrow_can_reach_dead
# 4
"\'_#6r"	"Mid(bb6[3])"	"bw0"
"\'_#7r"	"Mid(bb11[2])"	"bw1"
"\'_#14r"	"Mid(bb6[3])"	"bw0"
"\'_#15r"	"Mid(bb11[2])"	"bw1"
Souffle
# live_to_dying_regions
# 1
"\'_#10r"	"\'_#16r"	"Mid(bb11[5])"	"Start(bb12[0])"
# dying_region_requires
# 0
# dying_can_reach
# 4
"\'_#16r"	"\'_#10r"	"Mid(bb11[5])"	"Start(bb12[0])"
"\'_#16r"	"\'_#16r"	"Mid(bb11[5])"	"Start(bb12[0])"
"\'_#16r"	"\'_#8r"	"Mid(bb11[5])"	"Start(bb12[0])"
"\'_#16r"	"\'_#22r"	"Mid(bb11[5])"	"Start(bb12[0])"
# dying_can_reach_live
# 1
"\'_#16r"	"\'_#10r"	"Mid(bb11[5])"	"Start(bb12[0])"
# dead_borrow_region_can_reach_root
# 2
"\'_#6r"	"Mid(bb6[3])"	"bw0"
"\'_#7r"	"Mid(bb11[2])"	"bw1"
# dead_borrow_region_can_reach_dead
# 4
"\'_#6r"	"Mid(bb6[3])"	"bw0"
"\'_#7r"	"Mid(bb11[2])"	"bw1"
"\'_#14r"	"Mid(bb6[3])"	"bw0"
"\'_#15r"	"Mid(bb11[2])"	"bw1"

If the problem really is that they were written down incorrectly in datafrog_opt, I don't think I'm going to be able to fix it: the tuple re-indexing makes the datafrog version inscrutable to me.

@lqd
Copy link
Member

lqd commented Sep 29, 2021

There was this datafrog join using another join's intermediate relation, that I thought we may have commented incorrectly, but it seemed equivalent to the datalog rules, and unrelated to dying_can_reach in any case.

It's been so long that I don't remember if we actually tested these opt rules in soufflé (my recollection is: I probably only did use the Naive rules, but maybe Niko did use the Opt ones), and the old datalog version of the opt variant I have looks too old/incorrect (at some point there also were differences between Naive and Opt in some edge cases, and these bugs have since been fixed).

I'll also try to look at these rules.

@lqd
Copy link
Member

lqd commented Sep 29, 2021

I've noticed this: dying_can_reach depends on a few relations that seem to be correct with soufflé, and dying_can_reach_origins which I expect to cause issues. This relation must be different between the 2 versions, right ?

dying_can_reach_origins is:

Maybe this rule wasn't extracted because of the _loan syntax used here, rather than _ elsewhere ? (I'm not sure how you created the opt.dl file)

dying_can_reach_origins(origin, point1, point2) :-
   dying_region_requires(origin, point1, point2, _loan).

@ecstatic-morse
Copy link
Author

ecstatic-morse commented Sep 29, 2021

Ah, good find! I started looking for more missing rules and found that the base case for requires also got lost. That should fix it.

To extract the datalog, I just stripped everything that wasn't a comment from the relevant part of the .rs file and then rearranged things manually. I'm not sure how I messed that up.

@lqd
Copy link
Member

lqd commented Sep 29, 2021

🎉

With the old (default) resolver, you need to pass the `polonius-souffle`
feature to each workspace crate individually when compiling.
- Filter reflexive subset relations
- Mark placeholder regions as live
@ecstatic-morse

This comment has been minimized.

@ecstatic-morse ecstatic-morse force-pushed the polonius-souffle branch 4 times, most recently from 11e1a00 to 11e25f5 Compare October 12, 2021 23:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants