Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v4] CryptoUtil refactoring opportunities. #3386

Open
2 tasks done
teo-tsirpanis opened this issue Jul 11, 2024 · 3 comments
Open
2 tasks done

[v4] CryptoUtil refactoring opportunities. #3386

teo-tsirpanis opened this issue Jul 11, 2024 · 3 comments
Labels
feature-request A feature should be added or improved. p2 This is a standard priority issue v4

Comments

@teo-tsirpanis
Copy link

Describe the feature

I read the code in the CryptoUtil.cs file and noticed that there are some things that could be improved.

Use Case

Cleaning-up the codebase and improving maintainability.

Proposed Solution

  • Use System.IO.Hashing to implement CRC hashing.
    • There are multiple possible ways to use it:
      • Always use it and depend on it by default – the easiest way to implement; there will be an extra dependency but it can be trimmed if unused
      • Add an AWSSDK.Extensions.System.IO.Hashing package that provides an alternative implementation.
  • Add overloads that accept spans.
  • Remove the managed MD5 implementation.
    • It was added 9 years ago purportedly to add FIPS support, but I don't think a managed implementation can be FIPS-compliant.
  • CryptoUtil could even be removed from the public API but that would be a needless breaking change.

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

AWS .NET SDK and/or Package version used

v4

Targeted .NET Platform

All

Operating System and version

All

@teo-tsirpanis teo-tsirpanis added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Jul 11, 2024
@dscpinheiro dscpinheiro added the v4 label Jul 12, 2024
@bhoradc bhoradc added needs-review p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. needs-review labels Jul 12, 2024
@normj
Copy link
Member

normj commented Jul 23, 2024

Definitely are areas if performance improvements now that we are moving up the minimum .NET version. Do you know if the System.IO.Hashing is any faster then the one the SDK uses?

MD5 is challenging for .NET Framework in a FIPS environment. Like you said we added it 9 years ago because you can't instantiate the MD5 objects in .NET Framework in a FIPS environment. We are not abandoning .NET Framework as part of V4 so don't see us switching away.

@teo-tsirpanis
Copy link
Author

I ran some benchmarks for the CRC algorithms. For CRC32 the System.IO.Hashing implementation is significantly faster than aws-c-checksums on all tested sizes (16 bytes, 1KiB, 1Mib, 64Mib). System.IO.Hashing does not provide a CRC32C implementation but I wrote one with the help of the BitOperations.Crc32C method (introduced in .NET 8, will have to be polyfilled for earlier frameworks) and it was faster on small buffers and up to 6% slower on larger ones.

Results


BenchmarkDotNet v0.14.0, Windows 10 (10.0.19045.4842/22H2/2022Update)
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET SDK 8.0.400
  [Host]     : .NET 8.0.8 (8.0.824.36612), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.8 (8.0.824.36612), X64 RyuJIT AVX2


Method Size Mean Error StdDev Median Ratio RatioSD
Crc32SystemIoHashing 16 5.690 ns 0.0648 ns 0.0575 ns 5.705 ns 0.04 0.00
Crc32AwsCrt 16 143.815 ns 2.9040 ns 5.5252 ns 141.474 ns 1.00 0.05
Crc32SystemIoHashing 1024 52.922 ns 0.3838 ns 0.3205 ns 52.952 ns 0.08 0.00
Crc32AwsCrt 1024 633.793 ns 8.2753 ns 7.3358 ns 634.278 ns 1.00 0.02
Crc32SystemIoHashing 1048576 55,368.987 ns 501.3493 ns 444.4331 ns 55,121.002 ns 0.11 0.00
Crc32AwsCrt 1048576 515,646.456 ns 3,622.6510 ns 3,211.3858 ns 516,361.914 ns 1.00 0.01
Crc32SystemIoHashing 67108864 6,711,132.812 ns 130,469.2022 ns 169,646.8303 ns 6,697,514.844 ns 0.19 0.01
Crc32AwsCrt 67108864 34,787,717.778 ns 364,982.2117 ns 341,404.5874 ns 34,662,206.667 ns 1.00 0.01
Crc32cBitOperations 16 2.303 ns 0.0748 ns 0.1096 ns 2.265 ns 0.02 0.00
Crc32cAwsCrt 16 130.816 ns 1.3143 ns 1.2294 ns 131.086 ns 1.00 0.01
Crc32cBitOperations 1024 152.348 ns 1.4545 ns 1.2893 ns 152.631 ns 0.53 0.01
Crc32cAwsCrt 1024 286.162 ns 3.1830 ns 2.8217 ns 287.207 ns 1.00 0.01
Crc32cBitOperations 1048576 169,818.129 ns 1,980.2265 ns 1,755.4192 ns 170,636.597 ns 1.03 0.02
Crc32cAwsCrt 1048576 165,378.067 ns 3,055.0144 ns 3,000.4326 ns 165,917.371 ns 1.00 0.02
Crc32cBitOperations 67108864 12,268,516.198 ns 168,921.5765 ns 158,009.3476 ns 12,311,294.531 ns 1.06 0.02
Crc32cAwsCrt 67108864 11,553,061.667 ns 228,143.0248 ns 213,405.1270 ns 11,627,168.750 ns 1.00 0.03

Code

using System.IO.Hashing;
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using Aws.Crt.Checksums;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<Crc32Benchmark>();

[GroupBenchmarksBy(BenchmarkLogicalGroupRule.ByCategory)]
public class Crc32Benchmark
{
    [Params(16, 1024, 1 << 20, 1 << 26)]
    public int Size { get; set; }

    private byte[] _data = null!;
    
    [GlobalSetup]
    public void Setup()
    {
        _data = new byte[Size];
        Random.Shared.NextBytes(_data);
    }

    [Benchmark]
    [BenchmarkCategory("Crc32")]
    public uint Crc32SystemIoHashing()
    {
        return Crc32.HashToUInt32(_data);
    }
    
    [Benchmark(Baseline = true)]
    [BenchmarkCategory("Crc32")]
    public uint Crc32AwsCrt()
    {
        return Crc.crc32(_data);
    }

    [Benchmark]
    [BenchmarkCategory("Crc32c")]
    public uint Crc32cBitOperations()
    {
        return Crc32cBitOperations(_data);
    }
    
    [Benchmark(Baseline = true)]
    [BenchmarkCategory("Crc32c")]
    public uint Crc32cAwsCrt()
    {
        return Crc.crc32c(_data);
    }
    
    private static uint Crc32cBitOperations(ReadOnlySpan<byte> buffer)
    {
        ref byte start = ref MemoryMarshal.GetReference(buffer);
        ref byte end = ref Unsafe.Add(ref start, buffer.Length);
        uint crc = 0;
        while (Unsafe.ByteOffset(ref start, ref end) >= sizeof(ulong))
        {
            crc = BitOperations.Crc32C(crc, Unsafe.ReadUnaligned<ulong>(ref start));
            start = ref Unsafe.Add(ref start, sizeof(ulong));
        }
        while (Unsafe.ByteOffset(ref start, ref end) >= sizeof(uint))
        {
            crc = BitOperations.Crc32C(crc, Unsafe.ReadUnaligned<uint>(ref start));
            start = ref Unsafe.Add(ref start, sizeof(uint));
        }
        while (Unsafe.IsAddressLessThan(ref start, ref end))
        {
            crc = BitOperations.Crc32C(crc, start);
            start = ref Unsafe.Add(ref start, 1);
        }
        return ~crc;
    }
}

@teo-tsirpanis
Copy link
Author

The system's MD5 implementation becomes increasingly faster on larger buffer sizes. Considering the FIPS implications, we could try using it first, and add fall back to the managed implementation if it throws.


BenchmarkDotNet v0.14.0, Windows 10 (10.0.19045.4842/22H2/2022Update)
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET SDK 8.0.400
  [Host]     : .NET 8.0.8 (8.0.824.36612), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.8 (8.0.824.36612), X64 RyuJIT AVX2


Method Size Mean Error StdDev Median Ratio RatioSD
Md5System 16 425.6 ns 8.59 ns 17.74 ns 418.2 ns 0.98 0.04
Md5Managed 16 433.6 ns 5.85 ns 5.48 ns 432.6 ns 1.00 0.02
Md5System 1024 2,923.9 ns 58.19 ns 67.02 ns 2,935.5 ns 0.62 0.02
Md5Managed 1024 4,697.4 ns 91.26 ns 127.93 ns 4,677.5 ns 1.00 0.04
Md5System 1048576 2,546,400.4 ns 49,584.99 ns 50,920.13 ns 2,531,915.2 ns 0.61 0.02
Md5Managed 1048576 4,152,450.3 ns 78,730.29 ns 73,644.36 ns 4,173,361.3 ns 1.00 0.02
Md5System 67108864 163,596,564.7 ns 3,253,729.37 ns 3,341,340.51 ns 162,646,375.0 ns 0.58 0.01
Md5Managed 67108864 284,266,263.3 ns 5,055,510.86 ns 4,728,928.00 ns 284,251,250.0 ns 1.00 0.02

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-request A feature should be added or improved. p2 This is a standard priority issue v4
Projects
None yet
Development

No branches or pull requests

4 participants