Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forking instructions unnecessary for multi-process environments? #16

Open
hibachrach opened this issue May 20, 2023 · 4 comments
Open

Comments

@hibachrach
Copy link

After reading the code, it's unclear to me why the Statsig docs call for initializing the lib in a fork callback for Unicorn/Puma setups. Is the library using a shared system resource like a file descriptor or shared socket that isn't properly isolated via Ruby's copy-on-write?

If it is necessary, it's probably worth calling out for Sidekiq when in multi-process/"swarm" mode.

@vijaye-statsig
Copy link
Member

@daniel-statsig

@daniel-statsig
Copy link
Contributor

Hey @hibachrach, happy to help here. The SDK uses a singleton with multiple background threads to fetch values and flush logs. If the SDK is not initialized in these specific spots, it may end up in a invalid state and not work as intended.

Are you running into issues with initialization in Sidekiq that we need to fix, either with code or better documentation?

@hibachrach
Copy link
Author

No issues! Just confused on the guidance. When fork is called in Ruby, those child processes are memory-isolated by the runtime (at least in MRI) and trigger copy-on-write when changes occur. So Unicorn's forking, for example, shouldn't be able to invalidate/corrupt any state being fiddled by the various threads that Statsig is creating/switching between. The only exception would be OS level resources like sockets or file descriptors (and the file system, of course).

For a multithreaded environment, what you're describing makes total sense--there, memory is shared and care must be taken to avoid invalidation/corruption. But in a multi-process environment, it's not clear why this is necessary.

@daniel-statsig
Copy link
Contributor

after_fork is just a convenient place put the Statsig initialization as it would happen on Worker startup. This will ensure that each Worker has an initialized Statsig instance to interact with.

If you have a different, centralized location you would like to put it, it should be fine. The main thing to be wary of is that you must initialize before calling any other Statsig methods, otherwise an UninitializedError will be raised.

I believe the highlighted locations are valid call sites for initialize but I may be misunderstanding your meaning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants