Let's Fix Hashing Again (Damn) #19

Merged
merged 5 commits into from
Feb 13, 2024

Conversation

Sewer56
Member

@Sewer56 Sewer56 commented Feb 12, 2024

Simple PR which:

  • Fixes hashing of chunked blocks by changing XxHash64Algorithm _hash = new(); to XxHash64Algorithm _hash = new(0);
  • Adds missing asserts that compare the hash values stored in the Nx header against the hashes of the extracted data. Their absence was an oversight on my part.
  • Fixes a very rare bug where thread pre-emption by the OS (under high load) could cause chunked blocks with multiple chunks to produce the wrong hash.
    • Caused by out of order calls to AppendHash and GetFinalHash.
      • Calls to these methods are now synchronized (thread safe) for ChunkedBlockState.
        • This has no measurable impact on performance in benchmarks, due to how the library pipelines blocks for compression.
      • No change needed for SolidBlock(s).
    • Found by stress testing for 30+ minutes on the SMIM dataset, using all threads.
      • CPU load needed to be ~100% to reproduce.
  • Adds a stress test against SMIM which you can run manually (if desired).
    • It is disabled by default with a Skip test directive, but can be re-enabled if needed.
    • The SMIM dataset exercises all of Nx's block types (Solid, Chunked with 1 Chunk, Chunked with Multiple Chunks).
    • The reference hashes used in this stress test were generated with xxh64sum, the official binary.
  • Bumps xxHash64 dependency.
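
For readers wondering why `new()` and `new(0)` can differ at all, here is a minimal sketch of one plausible cause. It assumes the hasher is a struct whose constructor derives its starting state from the seed; `SeededHasher` is a hypothetical stand-in, not the real `XxHash64Algorithm`, and only the C# language behavior shown is certain.

```csharp
using System;

// Hypothetical struct, NOT the real XxHash64Algorithm. It only mimics a hasher
// whose constructor derives its starting state from a seed.
struct SeededHasher
{
    private const ulong Prime1 = 0x9E3779B185EBCA87UL; // first xxHash64 prime

    public ulong Accumulator;

    public SeededHasher(ulong seed = 0)
    {
        Accumulator = unchecked(seed + Prime1);
    }
}

static class Program
{
    static void Main()
    {
        // For a struct without an explicit parameterless constructor, new()
        // yields the zero-initialized default value. The constructor with an
        // optional parameter is not called.
        SeededHasher viaDefault = new();

        // Passing the seed explicitly runs the constructor body.
        SeededHasher viaSeed = new(0);

        Console.WriteLine(viaDefault.Accumulator == 0); // True
        Console.WriteLine(viaSeed.Accumulator != 0);    // True
    }
}
```

If the real type behaves this way, a hasher created with `new()` would start from an all-zero state rather than the seeded one, which would be enough to change every resulting hash.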
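
The ordering problem behind the rare bug can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `StreamingHash` is a stand-in (64-bit FNV-1a, chosen only because it is short and order-dependent), and `ChunkedHashState` with its `AppendChunk` method is a hypothetical shape for per-block state. The point is that a streaming hash depends on the order in which chunks are appended, so the append and the final read are kept inside one lock.

```csharp
using System;
using System.Text;
using System.Threading;

// Stand-in streaming hash (64-bit FNV-1a). It is NOT xxHash64 and not part of
// the Nx library; it is used only because its result depends on append order.
sealed class StreamingHash
{
    private ulong _state = 14695981039346656037UL; // FNV-1a offset basis

    public void Append(ReadOnlySpan<byte> data)
    {
        foreach (byte b in data)
            _state = unchecked((_state ^ b) * 1099511628211UL); // FNV-1a prime
    }

    public ulong Final() => _state;
}

// Hypothetical per-block state shared by the threads that process its chunks.
sealed class ChunkedHashState
{
    private readonly object _gate = new();
    private readonly StreamingHash _hash = new();
    private readonly int _chunkCount;
    private int _nextChunk;

    public ChunkedHashState(int chunkCount) => _chunkCount = chunkCount;

    // Appends one chunk under the lock, waiting until all earlier chunks have
    // been appended. Only the caller that appends the last chunk gets the hash.
    public ulong? AppendChunk(int index, byte[] data)
    {
        lock (_gate)
        {
            while (index != _nextChunk)
                Monitor.Wait(_gate);

            _hash.Append(data);
            _nextChunk++;
            Monitor.PulseAll(_gate);

            return _nextChunk == _chunkCount ? _hash.Final() : (ulong?)null;
        }
    }
}

static class Program
{
    static void Main()
    {
        byte[][] chunks =
        {
            Encoding.UTF8.GetBytes("chunk-0"),
            Encoding.UTF8.GetBytes("chunk-1"),
            Encoding.UTF8.GetBytes("chunk-2"),
        };

        // Reference: hash the chunks in order on a single thread.
        var reference = new StreamingHash();
        foreach (var chunk in chunks)
            reference.Append(chunk);

        // Start the threads in reverse order to provoke out-of-order arrival.
        var state = new ChunkedHashState(chunks.Length);
        ulong? threaded = null;
        var threads = new Thread[chunks.Length];
        for (int i = chunks.Length - 1; i >= 0; i--)
        {
            int index = i;
            threads[index] = new Thread(() =>
            {
                ulong? result = state.AppendChunk(index, chunks[index]);
                if (result.HasValue)
                    threaded = result;
            });
            threads[index].Start();
        }

        foreach (var thread in threads)
            thread.Join();

        Console.WriteLine(threaded == reference.Final()); // True
    }
}
```

The sketch uses one dedicated thread per chunk so that waiting for a turn cannot starve a thread pool. How the library itself orders its calls is not described here beyond the calls being synchronized.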

+ Added optional sanity test with real data just in case.
+ Added missing asserts from some existing tests.
@Sewer56 Sewer56 added the meta-bug Something isn't working label Feb 12, 2024
@Sewer56 Sewer56 requested a review from a team February 12, 2024 23:18
@Sewer56 Sewer56 self-assigned this Feb 12, 2024
@Sewer56 Sewer56 changed the title Fix hashing again damnit Let's Fix Hashing Again (Damn) Feb 12, 2024
@codecov-commenter

codecov-commenter commented Feb 12, 2024

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (b2a1cef) 91.26% compared to head (fdfefa9) 91.51%.
Report is 10 commits behind head on main.

Files                                                   Patch %   Lines
NexusMods.Archives.Nx/Utilities/Compression.cs          50.00%    5 Missing ⚠️
...ods.Archives.Nx/Structs/Blocks/ChunkedFileBlock.cs   80.00%    1 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #19      +/-   ##
==========================================
+ Coverage   91.26%   91.51%   +0.25%     
==========================================
  Files          42       41       -1     
  Lines        1374     1297      -77     
==========================================
- Hits         1254     1187      -67     
+ Misses        120      110      -10     


@Sewer56 Sewer56 marked this pull request as draft February 12, 2024 23:48
@Sewer56 Sewer56 marked this pull request as ready for review February 13, 2024 01:27
@halgari halgari merged commit 09270d7 into main Feb 13, 2024
13 checks passed
@halgari halgari deleted the fix-hashing-again-damnit branch February 13, 2024 04:36
3 participants