Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JAR fingerprinting #3

Merged
merged 21 commits into from
Jun 5, 2024
Merged

Add JAR fingerprinting #3

merged 21 commits into from
Jun 5, 2024

Conversation

jssblck
Copy link
Member

@jssblck jssblck commented Jun 5, 2024

Overview

Note

This diff isn't quite as insane as it seems; the files under tests/it/jar/expect/*.jar just consist of the expected results from the tests in tests/it/jar.rs.

  • Adds support for JAR fingerprints.
  • Refactors for Sparkle's version of fingerprints.
  • Adds more tests.
  • Refactors library for better organization.

JARs are fingerprinted in two ways:

  1. By their raw contents. Each file inside is extracted and hashed; then sorted by their file names. Each hash is then hashed in order to derive a final hash for the overall JAR.
  2. By the raw contents of their .class files. Each file inside is extracted and hashed; then sorted by their file names. Each hash is then hashed in order to derive a final hash for the overall JAR. Files that do not end with .class are ignored.

Acceptance criteria

We can fingerprint JARs by their contents in a platform independent mannter.

Testing plan

Relying on automated tests.

Automated tests were constructed based on https://fossa.atlassian.net/browse/ANE-1701, which features JARs provided by users that we want to be certain we can work with.

Risks

None in particular.

References

https://fossa.atlassian.net/browse/ANE-1709

Checklist

  • I added tests for this PR's change (or explained in the PR description why tests don't make sense).

@jssblck jssblck requested a review from a team as a code owner June 5, 2024 05:18
@jssblck jssblck requested review from csasarak and removed request for a team June 5, 2024 05:18

#[tracing::instrument(level = tracing::Level::DEBUG, fields(entry = ?entry.name()), ret)]
fn is_class_file(entry: &ZipFile<'_>) -> bool {
entry.name().ends_with(".class")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure when a class file wouldn't have this extension, but technically there is a magic-number you can read out as well: https://en.wikipedia.org/wiki/Java_class_file#Magic_Number

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea though, i'll use that, seems better

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, actually, to do that requires refactoring how to do the overall filter operation (since there's no way to "unread" those header bytes from the file).

I think I'm going to leave it as is for v1 of these fingerprints but we can always make a v2 that uses the magic number if we want later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds fine to me. I was pointing it out more as a random bit of information that could be useful later but isn't really important now.

@jssblck jssblck merged commit db8c366 into main Jun 5, 2024
5 checks passed
@jssblck jssblck deleted the jic branch June 5, 2024 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants