When are asm and intrinsics worth it? #575

Closed
enkore opened this issue Mar 19, 2024 · 2 comments

@enkore
Contributor

enkore commented Mar 19, 2024

I'm looking at writing a third SHA-256 implementation for x86, targeting the x86-64-v3 ISA level (AVX and AVX2, but no AVX-512 and no SHA-NI, i.e. Haswell), because the pure-Rust soft implementation isn't doing so well on CPUs without SHA-NI. This raises the question of when such an effort is sensible.

For example, there is a LoongArch64 asm implementation of SHA-256, but it's actually scalar (I believe the LA64 vector instructions aren't even publicly documented) and as a result only about 10% faster than the pure-Rust implementation. At the other end of the scale are implementations using dedicated instructions, like SHA-NI or AES-NI, which can be 1000% or more faster. Where's the line? Is there one?
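To make the three x86 tiers in question concrete, here is a minimal sketch of runtime backend selection using the standard `is_x86_feature_detected!` macro. The `Backend` enum and `select_backend` function are hypothetical names for illustration, not the crate's actual API, and checking only `avx2` is a simplification: x86-64-v3 also implies other features such as BMI2 and FMA.

```rust
/// Hypothetical set of SHA-256 backends, ordered from fastest to slowest.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    /// Dedicated SHA instructions (SHA-NI).
    ShaNi,
    /// Vectorized implementation for AVX2-capable CPUs (e.g. Haswell).
    Avx2,
    /// Portable pure-Rust fallback.
    Soft,
}

/// Picks the best backend the running CPU supports.
pub fn select_backend() -> Backend {
    // The detection macro only exists on x86 targets, so gate it.
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("sha") {
            return Backend::ShaNi;
        }
        // Assumption: AVX2 is used as a stand-in for the whole x86-64-v3 level.
        if std::arch::is_x86_feature_detected!("avx2") {
            return Backend::Avx2;
        }
    }
    Backend::Soft
}
```

Each additional variant is another code path that needs its own tests and CI coverage, which is the cost side of the question.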

@tarcieri
Member

There are some tough tradeoffs indeed.

We get pretty frequent complaints about performance when it isn't on par with ASM implementations. See e.g. #327.

Intrinsics add per-platform testing/maintenance burden via redundant implementations of the same algorithm, which also introduces the possibility of per-platform defects. But at least they're Rust code, which makes them accessible to other Rust programmers. ASM has all of the same problems, but has the additional complications of being a separate language from Rust (and obviously lacking its many guarantees around type/memory safety), and having to determine the correct arguments to asm! when using inline assembly.
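As a rough illustration of that difference, the sketch below writes one small operation from SHA-256 (rotating 32-bit words right by 7 bits) three ways: portable Rust, SSE2 intrinsics, and inline `asm!`. The function names are made up for this example and none of this is taken from the crate. Note how the `asm!` version has to spell out the operand class, the register-width modifier and the options by hand, and the compiler cannot check that they match what the instruction really does.

```rust
/// Portable baseline: rotate each 32-bit lane right by 7 bits.
pub fn rotr7_soft(v: [u32; 4]) -> [u32; 4] {
    v.map(|x| x.rotate_right(7))
}

/// Same operation with SSE2 intrinsics: still Rust, but x86-specific.
#[cfg(target_arch = "x86_64")]
pub fn rotr7_intrinsics(v: [u32; 4]) -> [u32; 4] {
    use std::arch::x86_64::{
        __m128i, _mm_loadu_si128, _mm_or_si128, _mm_slli_epi32, _mm_srli_epi32,
        _mm_storeu_si128,
    };
    let mut out = [0u32; 4];
    // SAFETY: SSE2 is part of the x86_64 baseline, and the unaligned
    // load/store variants are used on valid 16-byte arrays.
    unsafe {
        let x = _mm_loadu_si128(v.as_ptr() as *const __m128i);
        // SSE2 has no rotate, so combine a right and a left shift.
        let r = _mm_or_si128(_mm_srli_epi32::<7>(x), _mm_slli_epi32::<25>(x));
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, r);
    }
    out
}

/// Same operation with inline assembly, one lane at a time.
#[cfg(target_arch = "x86_64")]
pub fn rotr7_asm(v: [u32; 4]) -> [u32; 4] {
    v.map(|mut x| {
        // SAFETY: `ror` only touches the named register and the flags.
        // `{x:e}` selects the 32-bit register name; `inout(reg)` says the
        // register is both read and written. Getting these wrong is not
        // caught by the type system.
        unsafe {
            std::arch::asm!(
                "ror {x:e}, 7",
                x = inout(reg) x,
                options(pure, nomem, nostack),
            );
        }
        x
    })
}
```

All three variants should agree on every input, so a differential test against the portable version is one inexpensive way to catch per-platform defects.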

The path forward on ASM is still an open question, since we've removed the old non-inline ASM implementations. Personally, I've been interested in finding the safest possible way to consume ASM. In particular, I've been looking at projects which provide formally verified ASM implementations for a wide variety of algorithms and platforms, where we could extract those implementations in an automated manner and transform them into Rust asm! syntax, or perhaps even have the upstream tooling generate Rust code directly. Some projects of this nature for the specific case of hashes are AWS-LC and HACL*.

This does have the disadvantage that these formally verified implementations tend to lag behind the fastest hand-optimized ASM implementations, and that's also a debatable tradeoff. It might also impact FIPS certification, for those who care about that.

@newpavlov
Member

I think we can close this issue as non-actionable.

In general, if an ASM/intrinsics backend provides statistically significant performance improvements, we are likely to use it. We may also use verified assembly even if it does not improve performance, but we will consider that on a case-by-case basis.
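The thread does not say how significance would be judged. As one possible reading, the sketch below computes the relative speedup (the "10%" versus "1000%" figures from the original question) and Welch's t-statistic for two sets of timing samples. The function names are hypothetical, and a real comparison would more likely rely on a benchmarking harness than on hand-rolled statistics.

```rust
/// Sample mean and unbiased sample variance of a set of timings.
/// Assumes at least two samples.
fn mean_var(xs: &[f64]) -> (f64, f64) {
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var)
}

/// How much faster the candidate is than the baseline, in percent.
/// Timings are "lower is better" (e.g. nanoseconds per block).
pub fn speedup_percent(baseline: &[f64], candidate: &[f64]) -> f64 {
    let (m1, _) = mean_var(baseline);
    let (m2, _) = mean_var(candidate);
    (m1 / m2 - 1.0) * 100.0
}

/// Welch's t-statistic for two independent samples with possibly
/// unequal variances. Larger magnitudes mean the difference in means
/// is less likely to be measurement noise.
pub fn welch_t(baseline: &[f64], candidate: &[f64]) -> f64 {
    let (m1, v1) = mean_var(baseline);
    let (m2, v2) = mean_var(candidate);
    let se = (v1 / baseline.len() as f64 + v2 / candidate.len() as f64).sqrt();
    (m1 - m2) / se
}
```

With made-up timings of `[10.0, 11.0, 9.0, 10.0]` for the baseline and `[5.0, 5.5, 4.5, 5.0]` for the candidate, `speedup_percent` gives 100 and `welch_t` gives roughly 10.95. A 10% gain with noisy measurements would produce a much smaller statistic and might not clear whatever threshold is chosen.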

@newpavlov closed this as not planned on Nov 1, 2024