-
Notifications
You must be signed in to change notification settings - Fork 349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue #759 : The output of the hash generators is misleading #880
base: 2.0
Are you sure you want to change the base?
Conversation
md5(), sha1() and sha256() generators is misleading
This change is incorrect, because it prevents seeding when using these functions, whereas the previous implementation was seedable. |
…ge the full output space of the respective hashes
@TimWolla I have corrected, seeding is OK now and I added some new test to garantee seedibility. I don't know if this is considered as BC and if this is OK. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than that bias remark I suspect that using ext/bcmath is a non-starter for this library.
I'm also not sure whether it's worth fixing the issue with the current architecture of FakerPHP. The fix is likely going to be pretty ugly when relying on the pre-PHP 8.2 standard library. See also: #15 (comment)
src/Provider/Miscellaneous.php
Outdated
return array_reduce(range(1, $n), static function ($carry, $item) { | ||
return bcmul($carry, self::numberBetween()); | ||
}, '1'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is heavily biased, which is easy to see by performing an exhaustive search for a smaller number range:
<?php
for ($a = 0; $a < 16; $a++)
for ($b = 0; $b < 16; $b++)
for ($c = 0; $c < 16; $c++)
@$results[$a * $b * $c]++;
var_dump($results);
If any of the multiplied numbers is zero, the result will also be zero, which makes zero the most likely result. But the others are not uniformly distributed either:
Thanks for the reviews @TimWolla. I have another idea, I will submit it later on. |
@TimWolla Alternatively, to avoid any BC, we could keep the legacy methods without modifying them and create new ones with a slightly different name and with the new behavior, for example :
We could use |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 1 week if no further activity occurs. Thank you for your contributions. |
@TimWolla hi, is it possible to review this PR ? |
md5(), sha1() and sha256() generators is misleading
What is the reason for this PR?
Author's checklist
Summary of changes
In order to leverage the full output space of the respective hashes, I use the the function random_bytes($length)
In order to keep those functions seedable, I add a $seed parameter that can be used to pass a parameter so that the function will always return the same hash given the same seed.
When this one is merged, I can take car of #761
Review checklist
CHANGELOG.md