Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More efficient egraphs/dags in assembly equivalence checker #1207

Open
JasonGross opened this issue Apr 20, 2022 · 4 comments
Open

More efficient egraphs/dags in assembly equivalence checker #1207

JasonGross opened this issue Apr 20, 2022 · 4 comments

Comments

@JasonGross
Copy link
Collaborator

JasonGross commented Apr 20, 2022

We should consider how to get better asymptotics on the equivalence checker, given that we now have invocations that take nearly an hour to run.

For example, in p448-mul-callgrind-with-source.7z.zip, we have
image
image

1.5M calls to merge_node taking 18% of the total time, about 40% of this time is spent in indexof and another 14% is spent in N.of_nat. We could presumably make this a lot more efficient by storing (a) an efficient map of N to nodes (based on FMapPositive), (b) the total number of nodes present (as an N), and (c) an trie mapping ops (more generally nodes) to indices in the dag. In light of #1193, we will probably also want to store a thunked assoc list mapping description strings to lists of dag indices. We may also need to associate with this data a proof that it all lines up; if we need to carry this around it should either live in dag_ok or it should be encoded as a boolean equality (or perhaps True -> check_good ... = true to avoid vm_compute and native_compute eagerly evaluating the proofs).

@JasonGross
Copy link
Collaborator Author

I would upload perf record data for p434, but perf.data is almost 9GB, and the output of perf script is 158 GB, and trying to make a flame graph OOMs (see brendangregg/FlameGraph#282).

@JasonGross
Copy link
Collaborator Author

@JasonGross
Copy link
Collaborator Author

And here's for p256:
p256-mul.perf.data.tar.gz
p256-mul-perf-edited-super-folded

@JasonGross
Copy link
Collaborator Author

I tried doing this in #1342 / #1293 (comment) , but it made things slower. More performance profiling required

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant