
unify random generation across Oscar #100

Open
rfourquet opened this issue May 4, 2020 · 17 comments
@rfourquet
Contributor

Having one RNG usable in all subsystems of Oscar would be useful.

Cf. discussion started at #84 (comment).
To quote the (current) last comment of it, from @fieker:

Bottom line: for now, we cannot do anything (no manpower). Long-term, it needs to be sorted, as, at least for debugging and development, reproducibility is important; thus using >>1 independent sources of randomness that are totally wild is difficult to support long term.

I suggest we continue the discussion here.

@wbhart
Contributor

wbhart commented May 4, 2020

What does it mean to "unify random generation"? This is like saying we should have one function for all of number theory. We are embarrassing ourselves.

We (currently) have two RNG's in Flint (and a number of randomisation strategies built on top of them):

  1. A pair of LCG's for very fast (but still high quality) pseudorandom generation (when appropriately combined as per Marsaglia). Two seeds are required to make use of this.

  2. The GMP Mersenne twister for generation of large random values. It has a reasonably low amortised cost, but has a fairly large internal state.

The approach we have taken thus far is about the most minimally stupid approach you can get away with. All our random functions have warnings on them that they are only useful for test code and not to be used for anything serious. Even that is only after making a number of stupid mistakes over the years that caused hangs or very poor test coverage due to us not knowing what we were doing.

We also now have functions for reasonable quality uniform distribution for where that matters (though it is not clear anyone actually trusts us to have gotten that right, and nor should they).

I fail to see what you even mean by having "one RNG" usable by all systems. There are all sorts of random generators for all sorts of different purposes. They aren't in any way "compatible".

Perhaps it might be useful to start by reviewing the ways in which RNG's are used in mathematics and by reviewing some algorithms for pseudorandom generation, e.g. see the high quality but simple generators of George Marsaglia.

Our random generation sucks, by the way. It's going to need some serious work in the future. It's as slow as crap and produces poor quality pseudorandom numbers. Doing it properly is a serious job, which is why we haven't.

@fieker
Contributor

fieker commented May 4, 2020 via email

@rfourquet
Contributor Author

What does it mean to "unify random generation"? This is like saying we should have one function for all of number theory. We are embarrassing ourselves.

This is not how I meant it; feel free to change the title to something more meaningful.

To build on the linked conversation, it means:

  1. At the very least, it would be useful that if I do Random.seed!(0), or e.g. Oscar.rng_seed!(0), and then use any Oscar function which uses randomness, the result is reproducible.
  2. Ideally, IMO, all functions in Oscar which use an RNG would accept an optional rng argument. In this way, if I don't trust the RNG provided by flint for example, I can pass my own RNG, for example a cryptographic one.

I fail to see what you even mean by having "one RNG" usable by all systems. There are all sorts of random generators for all sorts of different purposes. They aren't in any way "compatible".

In my understanding, there exist random number generators which are deemed good enough for general use (except for cryptography) and fast enough.

Our random generation sucks, by the way. It's going to need some serious work in the future. It's as slow as crap and produces poor quality pseudorandom numbers. Doing it properly is a serious job, which is why we haven't.

For Flint, I'm not sure what is best, but on the Julia/Oscar side, I would investigate whether it's possible to use a Julia RNG as I mentioned in the other thread. I don't see a reason a priori why it wouldn't be possible (but again, I don't know the details here). Until proven otherwise, Julia's default RNG is good for general purposes (except cryptography), even number theory.
[And there is a Julia PR (https://github.com/JuliaLang/julia/pull/34852) to make a version of Xoshiro the new default, which is faster and would enable reproducibility even in multithreaded programs; but no decision has been made on this yet.]

@wbhart
Contributor

wbhart commented May 4, 2020

To have reproducibility for the user, you would have to either

  1. Have a function which sets all random generators to the same seed every time you run the tests, which will certainly get rid of your occasional failing tests, but which will also make random testing essentially useless.

  2. Provide the user with all seeds of all generators used, including the very big ones.

I'm telling you now that you cannot do this with one small seed and you cannot do it based on the time and you cannot do it with one RNG to rule them all.

It's a nice goal, but it's also pretty much impossible.

The best I can come up with, even in theory, is this: have a single large seed (many bits, the number of which depends on the version of the random generator) which each generator uses as a source of entropy for seeding itself in a deterministic way, based on a generator-specific algorithm. That seed gets saved to disk when the test framework has a failure (or perhaps every time the test suite is run), and a function can then reset all the generators from such a saved token.

Exactly how you get the random values used for a given test is beyond me at present, though, as it will depend on the random values that came before it and how many of them there were. Seeding a generator so that it starts at a given point in its pseudorandom sequence is only practical for certain kinds of generators (one of many reasons there are so many approaches to random generation).

Let me just be absolutely clear about one thing: I can't make Flint accept an arbitrary RNG supplied from outside of Flint. That is absolutely impractical. Flint has to use different random generators for different purposes as things are. And this is only going to get worse as we improve things.

Once again, there is no single RNG which is good for everything. RNG's are a tradeoff between:

  1. various statistical measures of quality (there are many, and some matter more than others in certain contexts, e.g. patterns modulo certain small primes, randomness of various bit ranges, length of the pseudorandom cycle, the likelihood that one kind of number will follow another, how normal the distribution is (sometimes not actually desirable), clustering of values in 2D, 3D and higher dimensions, and many more);

  2. speed to get an individual random number;

  3. speed to initialise the RNG;

  4. amortised cost of getting a random number;

  5. size of internal state (useful for starting at a given point in the pseudorandom sequence);

  6. size of seed;

  7. how easy/fast uniform distribution (or other distributions) are to construct on top of them;

  8. (not usually relevant to us) cryptographic suitability.

And no, there are no fast RNG's that are high enough quality for all mathematical applications but which don't become the bottleneck in others. Just the other day I used Nemo to do a computation which required generating random numbers as part of the computation. Random generation was by far the bottleneck (by orders of magnitude). So whatever we are doing currently is absolutely awful.

We don't even try to solve reproducibility in Flint at present. In BSDNT I think we "solved" it by always starting with the same seeds, so the random numbers change only as you add new tests. (Don't quote me on it, but that's what I recall.) But it's suboptimal: you add a new test function, someone else's code breaks, and you get blamed for it.

I strongly suggest we focus on things we can actually fix before "fixing" those we'd need to do a lot of research and absolutely huge quantities of implementation to solve. And in the mean time, we should not make big claims about what is essentially amateur hour.

@fieker
Contributor

fieker commented May 4, 2020 via email

@wbhart
Contributor

wbhart commented May 4, 2020

Having said all that, I do vaguely recall there was some trick you could do with a sha digest of a portion of some known fixed pseudorandom sequence into which you could index to get some kind of reproducibility. I'm sure there are papers on it. That might allow us to solve the reproducibility issue if we can dig it up.

@fieker
Contributor

fieker commented May 4, 2020 via email

@wbhart
Contributor

wbhart commented May 4, 2020

In this way, if I don't trust the RNG provided by flint for example, I can pass my own RNG, for example a cryptographic one.

I just want to point out that just because a cryptographic RNG was used as a source of randomness does not mean that an application is "secure". Here is a function which takes a single word of your cryptographically secure RNG to generate two words of randomness:

```c
void rand_one_to_two(ulong * out1, ulong * out2, ulong crypto_rand_in)
{
    *out1 = crypto_rand_in;
    *out2 = crypto_rand_in ^ 3;
}
```

There are actual real world examples of where stupid things like this have been done. Under no circumstances should a non-secure system like a CAS be used for anything secure, with or without a secure RNG!

On the flip side, these days there are algorithms for taking a number of poor random generators and combining them to make good ones (though I never followed up on whether they passed peer review, and I don't know how they work; but I do know they are very slow, as all things crypto essentially have to be).

@wbhart
Contributor

wbhart commented May 4, 2020

  1. Provide the user with all seeds of all generators used, including the very big ones.
    Yep - or, in Oscar, make it cascading: one small seed for a small/ simple RNG, to produce a large seed for the complicated one.

Sigh. What is the point of having an RNG with a large seed if the system seeds it with a small one!? There's a reason RNG's with a large seed use a large seed!

RNGs are like theorems in mathematics. You can't use proofs in contexts where you start with things that violate the assumptions on which those proofs are built. And if you can't prove anything, you shouldn't make any claims whatsoever about your system. In particular, if you break the RNG's used by Oscar we should not make the claim that Oscar is useful for investigating conjectures in mathematics, or that it is well-tested, etc.

@rfourquet
Contributor Author

I just want to point out that just because a cryptographic RNG was used as a source of randomness does not mean that an application is "secure".

Sure, I have enough crypto background to know that. My example was a bit extreme; the suggested idea was that with a crypto RNG, I'm (more) sure that I'm not using a bad PRNG that could be screwing up my results.

@wbhart
Contributor

wbhart commented May 4, 2020

@fieker Try to imagine a large supercomputer computation to collect statistics about a mathematical conjecture. Flint, by the way, uses two 64 bit seeds to generate 64 bit random numbers, because there are no known fast generators that are easy to implement which can do it with a single 64 bit seed. And that would not be suitable for such a supercomputer computation if you were serious about it. It might not even be good enough for a moderate sized department cluster if run long enough.

The only thing a small seed is good for is randomised test code. If you are doing something serious other than test code and you want to re-run your exact computation if it breaks, you are going to need something much better.

@wbhart
Contributor

wbhart commented May 4, 2020

I think I remember how the sha digest thing works now. I think it only works if you have a single PRNG for a session. If you change PRNGs in a session, or use multiple PRNG's, I don't see how to make it work. (But I think the test code in bsdnt used a globally set PRNG, chosen from a selection of possibilities.)

Before each test you print a sha digest of the internal state of the PRNG. This works for something like bsdnt because we were using the same seed every session. To reproduce a test you run the given PRNG from the initial seed until the sha matches that given before the failing test, and then you can restart that given test with the same internal state of the PRNG. In this way the size of the string presented to the user is independent of the PRNG used and independent of the size of its internal state or seed. This is useful for test code if you are using a PRNG with large internal state such as the Mersenne Twister.

This is of limited usefulness for us, but is a reasonable trick to know about.

@wbhart
Contributor

wbhart commented May 4, 2020

And you don't need to print the entire SHA digest. Just the first word is sufficient for test code.

Of course this is only useful if all functions use the globally set PRNG, which was fine for bsdnt but not so much for a system stitched together from multiple parts. Whether there is a trick for that, I simply do not know. I think not, as any of the PRNG's could have run for any number of iterations, and the combinatorial explosion guarantees you can't reconstruct the state from a digest in a short time.

Another hard problem to solve, which we didn't make much progress with, is reproducibility across architectures (e.g. 32 vs 64 bits). We thought about it, but I don't think we ever implemented anything along those lines.

@fieker
Contributor

fieker commented May 4, 2020 via email

@peteroupc

peteroupc commented May 4, 2020

I want to note several things:

On reproducibility. Some applications, including tests, care about reproducibility. However, if there are things outside the application's control, reproducibility is often not achievable. Besides pseudorandom number generation, some things that affect reproducibility include:

  • Floating-point numbers (as opposed to fixed-point numbers or integers).
  • Undefined and undocumented behavior.
  • Multithreading and parallel processing.

I explain this in more detail in my section "Ensuring Reproducibility"; see also the references cited there.

The point is that it's not just the RNG that can get in the way of reproducible tests, simulations, etc. And if there is a way to avoid the need for reproducibility, rather than ensure reproducibility (and ensuring it is not exactly trivial), an application should follow that way.

On kinds of tests. Some tests, like fuzz tests, are designed to generate test cases with the help of randomness. For fuzz tests, reproducibility is generally not required unless the test case is impractical to store or distribute. If the generated test cases are stored and used in other tests (such as unit tests), it doesn't matter much what RNG was used to generate that test case, as long as it used a relatively large seed and had high statistical quality.

This is in contrast to unit tests, where randomness and nondeterministic behavior not controlled for should not determine whether the test passes or fails. Unit tests (as opposed to fuzz tests) have no need to generate test cases at random, especially if the code under test does not rely on random number generators at all (as stochastic optimization and Monte Carlo integration do, for example).

If a unit test tests code that uses randomness, it should also not be sensitive to the particular random generator the code uses. For example, rather than setting a seed for an RNG, the unit test could check whether values are acceptable within a given tolerance. By doing so, the unit test will no longer be tied to a particular RNG, so that the RNG can be changed without affecting the test or the application's functionality.

@fingolfin
Member

I've added Oscar.randseed! in PR #2578, which can seed all RNGs we know about using an integer (i.e. a "small" seed). This does not attempt to provide a way to save & restore full states (although we could do that if we really wanted, but I don't see much of a point in that right now).

Beyond that I don't really see anything in this issue that we can realistically do to make everything use "one rng", nor does it seem sensible to do so (because different things may need different RNGs with different properties). So I am strongly tempted to close this issue. But I'll wait at least until @fieker is back from vacation.
