AVX-512 support for RSA Signing (#1273)
This change adds AVX-512 support for RSA 2k, 3k and 4k signing. It is
built around the use of AVX512_IFMA within the [(Almost) Montgomery
Multiplication](https://eprint.iacr.org/2011/239) implementation that
comprises the modular exponentiation part of the RSA algorithm. It is
ported from the [OpenSSL
patch](openssl/openssl#13750).
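
As a rough illustration of what AVX512_IFMA provides, the sketch below models a single 64-bit lane of its two instructions (`VPMADD52LUQ` and `VPMADD52HUQ`) in scalar C. Each multiplies the low 52 bits of two operands and adds either the low or the high 52 bits of the 104-bit product to an accumulator, which is why code built on it stores big numbers in 52-bit limbs. The function names here are invented for the sketch, `unsigned __int128` is a GCC/Clang extension, and none of this is code from the patch.

```c
#include <stdint.h>

#define LIMB_BITS 52
#define LIMB_MASK ((UINT64_C(1) << LIMB_BITS) - 1)

/* Scalar model of one lane of VPMADD52LUQ (hypothetical helper name):
 * acc + low 52 bits of (a * b), using only the low 52 bits of a and b. */
uint64_t madd52lo(uint64_t acc, uint64_t a, uint64_t b) {
  unsigned __int128 p = (unsigned __int128)(a & LIMB_MASK) * (b & LIMB_MASK);
  return acc + (uint64_t)(p & LIMB_MASK);
}

/* Scalar model of one lane of VPMADD52HUQ (hypothetical helper name):
 * acc + bits 52..103 of (a * b), using only the low 52 bits of a and b. */
uint64_t madd52hi(uint64_t acc, uint64_t a, uint64_t b) {
  unsigned __int128 p = (unsigned __int128)(a & LIMB_MASK) * (b & LIMB_MASK);
  return acc + (uint64_t)(p >> LIMB_BITS);
}

/* Number of 52-bit limbs needed to hold a value of the given bit length. */
unsigned limbs_needed(unsigned bits) {
  return (bits + LIMB_BITS - 1) / LIMB_BITS;
}
```

For example, `limbs_needed(1024)` is 20, so a 1024-bit operand (the size of each CRT half of an RSA 2k key) occupies 20 limbs in a radix-2^52 representation.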

On a C6i instance, clang 12, Release build:
Before:
Did 832 RSA 2048 signing operations in 1009511us (824.2 ops/sec)
Did 41000 RSA 2048 verify (same key) operations in 1019103us (40231.5 ops/sec)
Did 30000 RSA 2048 verify (fresh key) operations in 1007956us (29763.2 ops/sec)
Did 3684 RSA 2048 private key parse operations in 1067692us (3450.4 ops/sec)
Did 340 RSA 3072 signing operations in 1051690us (323.3 ops/sec)
Did 13000 RSA 3072 verify (same key) operations in 1087695us (11951.9 ops/sec)
Did 16000 RSA 3072 verify (fresh key) operations in 1005781us (15908.0 ops/sec)
Did 1870 RSA 3072 private key parse operations in 1017467us (1837.9 ops/sec)
Did 128 RSA 4096 signing operations in 1015724us (126.0 ops/sec)
Did 10000 RSA 4096 verify (same key) operations in 1071670us (9331.2 ops/sec)
Did 6952 RSA 4096 verify (fresh key) operations in 1016484us (6839.3 ops/sec)
Did 1110 RSA 4096 private key parse operations in 1092991us (1015.6 ops/sec)
After:
Did 1690 RSA 2048 signing operations in 1025072us (1648.7 ops/sec)
Did 63000 RSA 2048 verify (same key) operations in 1008785us (62451.4 ops/sec)
Did 54000 RSA 2048 verify (fresh key) operations in 1000298us (53983.9 ops/sec)
Did 8000 RSA 2048 private key parse operations in 1000938us (7992.5 ops/sec)
Did 550 RSA 3072 signing operations in 1012078us (543.4 ops/sec)
Did 30000 RSA 3072 verify (same key) operations in 1022061us (29352.5 ops/sec)
Did 27000 RSA 3072 verify (fresh key) operations in 1037663us (26020.0 ops/sec)
Did 4140 RSA 3072 private key parse operations in 1006526us (4113.2 ops/sec)
Did 253 RSA 4096 signing operations in 1050767us (240.8 ops/sec)
Did 18000 RSA 4096 verify (same key) operations in 1057742us (17017.4 ops/sec)
Did 15000 RSA 4096 verify (fresh key) operations in 1000483us (14992.8 ops/sec)
Did 2510 RSA 4096 private key parse operations in 1004408us (2499.0 ops/sec)
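
The signing figures above work out to roughly 2.0x, 1.7x and 1.9x for RSA 2048, 3072 and 4096 respectively. A small sketch of that arithmetic follows; the struct and function names are made up for illustration, and the numbers are copied from the ops/sec values in the benchmark output.

```c
#include <stdio.h>

/* Signing throughput in ops/sec, copied from the benchmark output. */
struct sign_result {
  int bits;
  double before;
  double after;
};

const struct sign_result kSignResults[3] = {
  {2048, 824.2, 1648.7},
  {3072, 323.3, 543.4},
  {4096, 126.0, 240.8},
};

/* Speedup is simply the ratio of throughput after to throughput before. */
double speedup(const struct sign_result *r) {
  return r->after / r->before;
}

/* Prints one line per key size with the speedup to two decimals. */
void print_speedups(void) {
  for (int i = 0; i < 3; i++) {
    printf("RSA %d signing: %.2fx\n", kSignResults[i].bits,
           speedup(&kSignResults[i]));
  }
}
```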

RSA 8k is not supported yet, so its performance is unchanged. Support
could be added as a follow-on if there is interest.

Call-outs:
This patch is primarily additive, apart from a small logic change in `mod_exp()` in `rsa_impl.c`.
Previously, the calls to `mod_montgomery` and `BN_mod_exp_mont_consttime` were
interleaved. To make the parallel exponentiations possible, `r1` is now kept around and a new
`BIGNUM`, `r2`, is created on the context.
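
To see why a second temporary is needed, here is a toy sketch of an RSA-CRT private operation with the reductions hoisted ahead of a paired exponentiation. It uses tiny textbook-sized integers rather than `BIGNUM`s, is not constant time, and the names `modexp_x2`, `toy_key` and `crt_private_op` are hypothetical stand-ins, not functions from the patch. The point is only that both reduced inputs (`r1` and `r2`) must be live at the same time before the two exponentiations can be issued together.

```c
#include <stdint.h>

/* Square-and-multiply. Toy sizes only (mod < 2^32); NOT constant time. */
uint64_t modexp(uint64_t base, uint64_t exp, uint64_t mod) {
  uint64_t result = 1 % mod;
  base %= mod;
  while (exp > 0) {
    if (exp & 1) result = result * base % mod;
    base = base * base % mod;
    exp >>= 1;
  }
  return result;
}

/* Hypothetical stand-in for a paired exponentiation. Here the two
 * independent exponentiations run one after the other; a vectorized
 * implementation could process both together. */
void modexp_x2(uint64_t out[2], const uint64_t base[2],
               const uint64_t exp[2], const uint64_t mod[2]) {
  out[0] = modexp(base[0], exp[0], mod[0]);
  out[1] = modexp(base[1], exp[1], mod[1]);
}

/* Toy CRT key: primes p and q, dp = d mod (p-1), dq = d mod (q-1),
 * qinv = q^-1 mod p. */
struct toy_key {
  uint64_t p, q, dp, dq, qinv;
};

uint64_t crt_private_op(uint64_t in, const struct toy_key *k) {
  /* Both reductions happen first, so two temporaries are alive at once. */
  uint64_t r1 = in % k->q;
  uint64_t r2 = in % k->p;

  const uint64_t base[2] = {r1, r2};
  const uint64_t exp[2] = {k->dq, k->dp};
  const uint64_t mod[2] = {k->q, k->p};
  uint64_t m[2];
  modexp_x2(m, base, exp, mod);

  /* Garner recombination: h = qinv * (m_p - m_q) mod p; out = m_q + h * q. */
  uint64_t mq = m[0], mp = m[1];
  uint64_t diff = (mp + k->p - mq % k->p) % k->p;
  uint64_t h = k->qinv * diff % k->p;
  return mq + h * k->q;
}
```

With the textbook key p = 61, q = 53, e = 17, d = 2753 (so dp = 53, dq = 49, qinv = 38), `crt_private_op` gives the same result as a direct `modexp(m, 2753, 3233)`.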

---------

Co-authored-by: Nevine Ebeid <[email protected]>
Co-authored-by: Nevine Ebeid <[email protected]>
3 people authored Sep 17, 2024
1 parent 9d21f38 commit e22cf50
Showing 31 changed files with 17,998 additions and 2,730 deletions.
11 changes: 10 additions & 1 deletion crypto/fipsmodule/CMakeLists.txt
@@ -38,6 +38,9 @@ if(ARCH STREQUAL "x86_64")
p256_beeu-x86_64-asm.${ASM_EXT}
rdrand-x86_64.${ASM_EXT}
rsaz-avx2.${ASM_EXT}
rsaz-2k-avx512.${ASM_EXT}
rsaz-3k-avx512.${ASM_EXT}
rsaz-4k-avx512.${ASM_EXT}
sha1-x86_64.${ASM_EXT}
sha256-x86_64.${ASM_EXT}
sha512-x86_64.${ASM_EXT}
@@ -147,6 +150,9 @@ if(PERL_EXECUTABLE)
perlasm(p256_beeu-armv8-asm.${ASM_EXT} ec/asm/p256_beeu-armv8-asm.pl)
perlasm(rdrand-x86_64.${ASM_EXT} rand/asm/rdrand-x86_64.pl)
perlasm(rsaz-avx2.${ASM_EXT} bn/asm/rsaz-avx2.pl)
perlasm(rsaz-2k-avx512.${ASM_EXT} bn/asm/rsaz-2k-avx512.pl)
perlasm(rsaz-3k-avx512.${ASM_EXT} bn/asm/rsaz-3k-avx512.pl)
perlasm(rsaz-4k-avx512.${ASM_EXT} bn/asm/rsaz-4k-avx512.pl)
perlasm(sha1-586.${ASM_EXT} sha/asm/sha1-586.pl)
perlasm(sha1-armv4-large.${ASM_EXT} sha/asm/sha1-armv4-large.pl)
perlasm(sha1-armv8.${ASM_EXT} sha/asm/sha1-armv8.pl)
@@ -175,6 +181,9 @@ if (CLANG AND (CMAKE_ASM_COMPILER_ID MATCHES "Clang" OR CMAKE_ASM_COMPILER MATCH
(CMAKE_C_COMPILER_VERSION VERSION_LESS "7.0.0") AND (ARCH STREQUAL "x86_64"))
set_source_files_properties(${CMAKE_CURRENT_BINARY_DIR}/aesni-gcm-avx512.${ASM_EXT} PROPERTIES COMPILE_FLAGS "-mavx512f -mavx512bw -mavx512dq -mavx512vl")
set_source_files_properties(${CMAKE_CURRENT_BINARY_DIR}/aesni-xts-avx512.${ASM_EXT} PROPERTIES COMPILE_FLAGS "-mavx512f -mavx512bw -mavx512dq -mavx512vl")
set_source_files_properties(${CMAKE_CURRENT_BINARY_DIR}/rsaz-2k-avx512.${ASM_EXT} PROPERTIES COMPILE_FLAGS "-mavx512f -mavx512bw -mavx512dq -mavx512vl -mavx512ifma")
set_source_files_properties(${CMAKE_CURRENT_BINARY_DIR}/rsaz-3k-avx512.${ASM_EXT} PROPERTIES COMPILE_FLAGS "-mavx512f -mavx512bw -mavx512dq -mavx512vl -mavx512ifma")
set_source_files_properties(${CMAKE_CURRENT_BINARY_DIR}/rsaz-4k-avx512.${ASM_EXT} PROPERTIES COMPILE_FLAGS "-mavx512f -mavx512bw -mavx512dq -mavx512vl -mavx512ifma")
endif()

# s2n-bignum files can be compiled on Unix platforms only (except Apple),
@@ -384,7 +393,7 @@ if(FIPS_DELOCATE)
# The flags are not required for any other compiler we are running in the CI.
if (CLANG AND (CMAKE_ASM_COMPILER_ID MATCHES "Clang" OR CMAKE_ASM_COMPILER MATCHES "clang") AND
(CMAKE_C_COMPILER_VERSION VERSION_LESS "7.0.0") AND (ARCH STREQUAL "x86_64"))
-set_source_files_properties(${CMAKE_CURRENT_BINARY_DIR}/bcm-delocated.S PROPERTIES COMPILE_FLAGS "-mavx512f -mavx512bw -mavx512dq -mavx512vl")
+set_source_files_properties(${CMAKE_CURRENT_BINARY_DIR}/bcm-delocated.S PROPERTIES COMPILE_FLAGS "-mavx512f -mavx512bw -mavx512dq -mavx512vl -mavx512ifma")
endif()

add_library(
1 change: 1 addition & 0 deletions crypto/fipsmodule/bcm.c
@@ -63,6 +63,7 @@
#include "bn/prime.c"
#include "bn/random.c"
#include "bn/rsaz_exp.c"
#include "bn/rsaz_exp_x2.c"
#include "bn/shift.c"
#include "bn/sqrt.c"
#include "cipher/aead.c"