Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize dettman algorithm #1601

Merged
merged 27 commits into from
May 19, 2023
Merged

Optimize dettman algorithm #1601

merged 27 commits into from
May 19, 2023

Conversation

OwenConoly
Copy link
Collaborator

This implements the optimization described in #1582.

This should resolve almost all of #1582; the only thing missing is the aesthetic improvement mentioned here about removing an unnecessary & operation, which I don't know how to fix.

Incidentally, I've realized that the dettman template, in its state before this PR, did unreasonable things when given a fractional limbwidth. I've rewritten it so that it will hopefully---I haven't tested this or anything---now behave reasonably given a fractional limbwidth. This happened to require the same sort of change that I was making to implement the optimization. This change should make the algorithm a bit easier to read, too.

@dderjoel
Copy link
Contributor

dderjoel commented May 3, 2023

FWIW, I am unsure if the current files in fiat-amd64/fiat_secp256k1_dettman_mul (and square respectively) would still be proven correct.
I've updated the files in CryptOpt already. Feel free to replace them, in case the validation does not succeed. I can then later provide the 3-best-per-architechture files just like for the other curves.

@OwenConoly
Copy link
Collaborator Author

Ah! I forgot about updating those. Indeed, they failed. I'll replace them with the CryptOpt files that you linked.

@OwenConoly
Copy link
Collaborator Author

I'm trying to add this enhancement, via a new and/or generalized rewriting rule, before merging this PR.
I've added some commits to this branch where I make some attempts at progress on that, so this branch is no longer in a good state to merge.

…tion should consist of just writing "register_width = 64" in Arith/DettmanMultiplcation. something broke.
function did rather unreasonable things when the limbwidth was fractional.

In this commit, I tried to fix this issue, but I ended up making the
reduce_carry_borrow function pretty obnoxious.  Also, fixing this issue
requires somewhat strange (and strong) hypotheses on the weight function.
We'd want, for instance,

(weight (2 * limbs - 2) / weight (limbs - 2)) mod s = 0.

I plan to revert my changes (from this commit) to the reduce_carry_borrow
function, so that it will continue to do unreasonable things given fractional
limbwidths.  This was not worth it.
…tation. wrote some (messy) arithmetic proofs. still more proofs to go in Arithmetic/DettmanMultiplication.v. Then need to change PushButtonSynthesis appropriately.
Replaced a bunch of references to (Z.log2_up s) with (Z.log2 s), since (in
the dettman_multiplication_mod_ops module) we now stipulate that s is a power
of 2.
…iminated the unnecessary 64-bit mask, but it also caused weird stuff to happen and killed off a bunch of 52-bit masks. Where'd they go? idk.
It now comes later in the rewriting process.
Apparently it's important that it comes after the fancy stuff.
For reasons that I don't understand, proofs in Barrett256 and Montgomery256 break if
I put it before.
@OwenConoly
Copy link
Collaborator Author

The dettman code generation actually works now (with the redundant & removed!), but the new rewriting pass that I've added has broken some other things.

Code generation for fiat-c/src/p521_32.c (among others, I assume) doesn't work---the error is with "converting to C", as you can see in the build of my most recent commit.

@andres-erbsen, @JasonGross - do you have any idea what would cause this? Might it be the same issue as in #1604?

@OwenConoly
Copy link
Collaborator Author

Oops, never mind my previous comment. I've inferred (correctly, I think?) from #1604 that my problem has something to do with dead code elimination, so I've removed the dead code elimination thing that I added in BoundsPipeline.v. This seems to have fixed the issue.

In this commit, I just removed the bit about dead code elimination that I
added in BoundsPipeline.v.
The C code synthesizes without error now, at least.
I haven't built everything yet, so we'll see if things actually work nicely.
by removing the dead-code-elimination thing]
@OwenConoly
Copy link
Collaborator Author

Looks like the redundant ands, which are removed by the new rewrite pass that I added, actually appeared in a lot of places other than the dettman code. Adding the new rewrite pass caused the total length of generated code to decrease by about 27000 lines. I was not expecting this, but I think it's a good thing.

@andres-erbsen
Copy link
Contributor

This cleanup is indeed more impressive than anticipated :D. Great job!

Copy link
Collaborator

@JasonGross JasonGross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than needing to update one more 3 into 4 for accurate error messages, this looks good to me!

src/PushButtonSynthesis/DettmanMultiplication.v Outdated Show resolved Hide resolved
src/PushButtonSynthesis/DettmanMultiplication.v Outdated Show resolved Hide resolved
src/Rewriter/RulesProofs.v Outdated Show resolved Hide resolved
@andres-erbsen andres-erbsen merged commit 1dc6b3a into master May 19, 2023
@andres-erbsen andres-erbsen deleted the dettman_avoid_wide_mul branch May 19, 2023 01:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants