Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread local prngs #4331

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

graydon
Copy link
Contributor

@graydon graydon commented May 22, 2024

This moves the global PRNG to thread-local, removes the separate PRNG inside the QIC, and also moves the global signature-verification cache to be thread-local (it probably doesn't have to be, but less inter-thread pollution and contention is probably better).

This is motivated by some work going on in #4258 -- where that PR moved the global signature cache to have yet another sub-PRNG, I decided to sketch out what it might look like to go the other way and just make the global PRNGs (and the signature cache) all thread-local. It seems to me a nicer approach, and more likely to avoid having to plumb PRNG-seeding paths through future stuff that happens to take random decisions. But it might also strike some as distasteful. Happy to discuss!

Copy link
Contributor

@SirTyson SirTyson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, but I have a question regarding the same seed on multiple different gRandomEngine instances.

#include <numeric>
#include <set>

namespace stellar
{

stellar_default_random_engine gRandomEngine;
thread_local stellar_default_random_engine
gRandomEngine(getLastGlobalStateSeed());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, it looks like it's very likely non-main threads will all get seeded with the same initial seed (unless reinitializeAllGlobalStateWithSeedInternal is called in between the threads spinning up). Is this an issue? I can imagine two potential issues:

  1. Attacker observes a PRNG value on thread 0 and now has advance knowledge of the next PRNG value on thread 1. (seems highly unlikely)
  2. Some work is being done. Part 1 of the work occurs on thread 0 and calls randomGenerate for some value. Part 2 of the work occurs on thread 1 and calls randomGenerate again, but receives the same value as before. Could this break an assumption that the work receives two different PRNG values?

Or am I like super overthinking this since it's PRNG and not RNG anyway...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not too worried about this for production: because we don't spin up threads on demand, fairly quickly threads will diverge (under the assumption that we generate a fairly good number of random numbers).

For tests, it's a different story: tests tend to spin up some "app" that will spin up threads, making it that now threads will be seeded the same way and will basically behave the exact same way in the context of that test. This has the potential to hide a large class of bugs.

A way to mitigate all those issues may just be to make seeds between threads different (but deterministic): when spinning up a new thread, generate a new seed from the main thread and use that seed to initialize the thread.

I guess this may be a new thing: until recently prngs were not really impacted by the scheduling of threads, now, I am not sure what the story is (and having TLS may be the way to get rid of the dependency with the OS scheduler).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants