About the avalanche bias #28
Others have pointed out issues with this project's metric (#21), though I'm not sure if that's common to all avalanche metrics.
Oh interesting, I didn't see that issue! Yes, cycles will appear, and with a mere 32-bit state I guess the probability makes it likely in practice. For mixing or hashing arbitrary data I'm not sure how/if the cycles affect the result. In the case of using it as a PRNG I can see how, at least in a naïve implementation (`state = mix(state)`), some seeds should definitely be avoided. But maybe minimizing short cycles will lead to better mixing/hashing too? I haven't thought of that. The chi2 test will only be able to say pass/fail (p-values can't be used as a guiding metric). The way I have done it is to test the avalanche (pass/fail) at intervals (perhaps at powers of two) and use how much data it takes before failure as the primary metric (then I add some other statistics to get more fine-grained). But yeah, there might be better/more efficient ways.
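A minimal sketch of the cycle concern (my own illustration, not from the thread; the mixer constants are just the widely circulated lowbias32 ones and any 32-bit mixer can be swapped in). Brent's algorithm reports the length of the cycle a given seed eventually falls into when iterating `state = mix(state)`:

```c++
#include <cstdint>
#include <cstdio>

// Placeholder 32-bit mixer (lowbias32-style constants); substitute the mixer under test.
static uint32_t mix(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

// Brent's cycle detection: length of the cycle reached from `seed`
// when iterating state = mix(state).
static uint64_t cycle_length(uint32_t seed)
{
    uint64_t power = 1, length = 1;
    uint32_t tortoise = seed;
    uint32_t hare = mix(seed);
    while (tortoise != hare) {
        if (power == length) {   // teleport the tortoise at powers of two
            tortoise = hare;
            power *= 2;
            length = 0;
        }
        hare = mix(hare);
        ++length;
    }
    return length;
}

int main()
{
    for (uint32_t seed : {1u, 0x12345678u, 0xdeadbeefu})
        std::printf("seed %08x falls into a cycle of length %llu\n",
                    seed, (unsigned long long)cycle_length(seed));
}
```

Seeds that land on short cycles are exactly the ones a naïve `state = mix(state)` generator would need to avoid.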
I wonder if something like "Testing for goodness rather than lack of fit of continuous probability distributions" would be more useful.
Yes maybe, it sounds like equivalence testing (new to me) possibly would fail the test earlier than traditional testing. But tbh I've no intuition about this.
First, I think this is a really awesome project! I stumbled across it some years ago when I was playing with a similar thing (automatically finding good mixers) and I remember it as very inspiring!
So I found the lists of mixers (in the readme + an issue) and I thought it would be a good correlation test to run them through my Tests for Randomness (TFR) and PractRand, both using Pelle Evensen's harsh RRC scheme to scramble the input bits.
I ran through 100+ of the prospector mixers (love the plain-text format, plus 32-bit mixers are so quick to evaluate even with RRC), and I thought it might be interesting to share the results here.
What may come as a surprise is that the `original` 32-bit prospector (from the article) seems to perform very well. The `original` is the only one that goes to 2^18 in TFR; the second runner-up among all tested prospector mixers fails at 2^13. This leads me to the question whether something in the search code of prospector has changed since the `original` was found?

For reference, the furthest I've ever seen an `xmxmx` 32-bit mixer go in RRC-PractRand is 2^18, and in TFR 2^19.

Now PractRand is a bit of a black box to me, but since I am very familiar with TFR I investigated how the `low_bias` one fails.
one fails. If you run it through TFR it will say something like:In the uniform test we expect a uniform distribution but the statistics says that this is very likely not the case. The
mix32::prospector_low_bias(reverse_co-31(counter-1))
indicates the used test sample (out of 128). This is a unitary counter has reversed bits + complement and rotated 31 bits (as per RRC). If we were to recreate this in code it would look something like this in c++ (copy pasted a bit so pardon the overkill with templates etc):This will output 2^10 values:
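A rough sketch of what such a harness could look like (a hypothetical reconstruction, without the templates: it assumes the widely circulated lowbias32 constants and one plausible reading of `reverse_co-31(counter-1)`, so it need not reproduce the quoted values bit-for-bit):

```c++
#include <cstdint>
#include <cstdio>

// Bit-reverse a 32-bit word.
static uint32_t reverse_bits(uint32_t x)
{
    x = (x >> 16) | (x << 16);
    x = ((x & 0xff00ff00U) >> 8) | ((x & 0x00ff00ffU) << 8);
    x = ((x & 0xf0f0f0f0U) >> 4) | ((x & 0x0f0f0f0fU) << 4);
    x = ((x & 0xccccccccU) >> 2) | ((x & 0x33333333U) << 2);
    x = ((x & 0xaaaaaaaaU) >> 1) | ((x & 0x55555555U) << 1);
    return x;
}

static uint32_t rotl32(uint32_t x, unsigned r)
{
    return (x << r) | (x >> (32 - r));
}

// "reverse_co-31": reversed bits + complement, rotated by 31 (per RRC);
// the exact composition in the original helper may differ.
static uint32_t reverse_co_31(uint32_t x)
{
    return rotl32(~reverse_bits(x), 31);
}

// Low-bias prospector-style mixer; constants assumed, substitute the actual candidate.
static uint32_t prospector_low_bias(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

int main()
{
    // "counter-1": assumed here to mean a counter starting at 1 with stride 1.
    for (uint32_t counter = 1; counter <= (1u << 10); ++counter)
        std::printf("%u\n", prospector_low_bias(reverse_co_31(counter)));
}
```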
This will output 2^10 values:

`2731428922, 3383114888, 3653537803, 1575712730, ...`
Now if we examine those with chi2 statistics in Mathematica we get:

`PearsonChiSquareTest[low_bias_values, UniformDistribution[{0, 2^32 - 1}]]`
This outputs a p-value close to zero (2.23e-22). Doing the same for the `original` mixer gives a p-value of 0.11, which means it is indistinguishable from a uniform distribution. It's possible to see this with the naked eye; here are histograms of the output for the same RRC stream from the `lowbias`, the `original` and `trng` (true random data):

[histograms: lowbias vs. original vs. trng]

Here it looks like the lowest-bias version is not necessarily the best according to RRC-PractRand and TFR (see attached logs for many more examples). Maybe the bias calculated in prospector is an insufficient metric?
On a positive note, I strongly believe avalanche in some form is a sufficient test. In TFR, both `sac` and `bic` are very strong indicators, while most other statistics (uniform, gap, coupons etc.) will fail early or not at all. So maybe only some minor changes could help (a rough sketch of a SAC-style measurement follows the attached logs below).

I've attached two text files with logs for all prospector mixers I tested (TFR + PractRand). The first column is how far the mixer came in the tests:
tfr_prospector.txt
pr_prospector.txt
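As mentioned above, here is a minimal sketch of a plain SAC-style measurement (my own simplification, not TFR's actual `sac` test): flip each input bit, count how often each output bit flips, and report the worst deviation from the ideal flip probability of 0.5:

```c++
#include <cstdint>
#include <cstdio>
#include <cmath>

// Mixer under test (placeholder constants; substitute any candidate).
static uint32_t mix(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

int main()
{
    const uint32_t samples = 1u << 16;    // simple counter inputs, enough for a sketch
    static uint64_t flips[32][32] = {};   // flips[input_bit][output_bit]

    for (uint32_t x = 0; x < samples; ++x) {
        uint32_t h = mix(x);
        for (int i = 0; i < 32; ++i) {
            uint32_t d = h ^ mix(x ^ (1u << i));   // flip input bit i, compare outputs
            for (int j = 0; j < 32; ++j)
                flips[i][j] += (d >> j) & 1;
        }
    }

    // Ideally every flip probability is 0.5; report the worst deviation.
    double worst = 0.0;
    for (int i = 0; i < 32; ++i)
        for (int j = 0; j < 32; ++j)
            worst = std::fmax(worst, std::fabs((double)flips[i][j] / samples - 0.5));
    std::printf("worst |p - 0.5| = %f\n", worst);
}
```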
This is more of a thought; if this seems interesting/relevant we can dig deeper. Keep up the good work!