Hash fuzzed input to detect duplicates? #11

Open
mgold opened this issue Aug 4, 2018 · 2 comments · May be fixed by #207
Labels
Design Question (Needs design discussion), fuzzers (Concerns randomness or simplifiers)

Comments

@mgold
Collaborator

mgold commented Aug 4, 2018

When a fuzzer generates random values, there is no guarantee that each one is unique. You may ask for 100 cases but get fewer distinct inputs, possibly far fewer.

One solution is to hash each input, store the hashes, and reject any input whose hash has already been seen. We'd need to fail the test after some number of failed attempts to create distinct inputs, perhaps `max 20 (2 * numberOfRequestedRuns)`, that is, the larger of 20 and twice the requested runs.
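
To make the idea concrete, here is a rough sketch in Python rather than Elm. Nothing in it is elm-test API: `distinct_inputs`, `input_hash`, and `DistinctInputsExhausted` are hypothetical names, `repr` stands in for whatever encoding we would actually hash, and the attempt budget assumes the larger of 20 and twice the requested runs.

```python
import hashlib
import random


class DistinctInputsExhausted(Exception):
    """Hypothetical failure tag: too many draws produced only duplicates."""


def input_hash(value):
    # Assumption: repr() is a faithful, stable encoding of the value.
    return hashlib.sha256(repr(value).encode("utf-8")).hexdigest()


def distinct_inputs(generate, requested_runs, seed=0):
    """Collect `requested_runs` inputs whose hashes are pairwise distinct."""
    rng = random.Random(seed)
    # Assumed budget: the larger of 20 and twice the requested runs.
    max_failed_attempts = max(20, 2 * requested_runs)
    seen = set()
    inputs = []
    failed_attempts = 0
    while len(inputs) < requested_runs:
        value = generate(rng)
        digest = input_hash(value)
        if digest in seen:
            # Duplicate draw: count it against the budget and try again.
            failed_attempts += 1
            if failed_attempts >= max_failed_attempts:
                raise DistinctInputsExhausted(
                    f"only {len(inputs)} distinct inputs after "
                    f"{failed_attempts} duplicate draws"
                )
            continue
        seen.add(digest)
        inputs.append(value)
    return inputs
```

With a wide integer generator this returns the requested number of inputs. With a boolean generator and 100 requested runs it raises, because at most two distinct values exist.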

Since we'd want a designated union type tag for this failure condition, it makes sense to do this while we're doing a major revision.

Is there any interest in exploring this idea?

@mgold mgold added the Design Question Needs design discussion label Aug 4, 2018
@drathier
Collaborator

drathier commented Aug 5, 2018

There definitely is, but I'd like to pair this with knowing roughly how many possible values a fuzzer can produce. No reason to keep generating booleans to try to find a third value. Also, we probably want to generate more values if we're fuzzing a huge thing, like a Dict (Int, Int) (List String).
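
A possible shape for this, sketched in Python rather than Elm. Everything here is an assumption for illustration: the `Fuzzer` record, its `cardinality` field (with `None` meaning "too large or unknown"), the `pair` combinator, and the `huge_factor` multiplier are all hypothetical and not part of elm-test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fuzzer:
    name: str
    # Hypothetical: number of distinct values, or None if huge or unknown.
    cardinality: Optional[int]


BOOL = Fuzzer("bool", 2)


def pair(a, b):
    """A pair has the product of its parts' cardinalities, when both are known."""
    if a.cardinality is None or b.cardinality is None:
        return Fuzzer(f"({a.name}, {b.name})", None)
    return Fuzzer(f"({a.name}, {b.name})", a.cardinality * b.cardinality)


def planned_runs(fuzzer, requested_runs, huge_factor=2):
    """Pick a run count from a rough estimate of the input space."""
    if fuzzer.cardinality is None:
        # Huge input space: spend more runs (the factor is an arbitrary guess).
        return requested_runs * huge_factor
    # Small input space: never ask for more runs than there are values.
    return min(requested_runs, fuzzer.cardinality)
```

Under these assumptions a boolean fuzzer is capped at two runs no matter how many were requested, while a fuzzer with unknown cardinality gets extra runs.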

@mgold mgold added the fuzzers Concerns randomness or simplifiers label Aug 24, 2018
@Janiczek
Collaborator

Janiczek commented Jul 27, 2022

> No reason to keep generating booleans to try to find a third value.

The status quo is that you still keep generating booleans, and run the toExpectation function to boot.

I think we could do this optimization separately (retry generating if we've already tested an input for a test -- skipping some toExpectation calls), and then there is the separate issue of generating all values exhaustively if the fuzzer allows it. I'll create an issue for that one as I and @gampleman have some thoughts around it already :)
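
A minimal sketch of the first optimization, in Python rather than Elm. It only skips the expectation for inputs that were already tested and does not retry or enumerate anything. `run_fuzz_test` is a hypothetical name, `to_expectation` is modelled as a plain function returning `True` on pass, and `repr` is an assumed stand-in for a real equality or hashing scheme.

```python
import random


def run_fuzz_test(generate, to_expectation, runs, seed=0):
    """Run a fuzz test, calling `to_expectation` once per distinct input."""
    rng = random.Random(seed)
    tested = set()
    failures = []
    expectation_calls = 0
    for _ in range(runs):
        value = generate(rng)
        key = repr(value)  # assumption: repr() identifies the input
        if key in tested:
            # Already tested this input, so skip the expectation call.
            continue
        tested.add(key)
        expectation_calls += 1
        if not to_expectation(value):
            failures.append(value)
    return expectation_calls, failures
```

For a boolean fuzzer with 100 runs, the expectation is called at most twice instead of 100 times.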

Edit: #188
