250 likes | 393 Views
Random Numbers in HEP. Tests of Random Engines in CLHEP Heinrich’s note about RandEngine. Essential Concepts. Instances Use of a random in one module need not affect the sequence of randoms seen by another Engines vs Distributions An Engine is a source of uniform (pseudo-) randomness
E N D
Random Numbers in HEP Tests of Random Engines in CLHEP Heinrich’s note about RandEngine
Essential Concepts • Instances • Use of a random in one module need not affect the sequence of randoms seen by another • Engines vs Distributions • An Engine is a source of uniform (pseudo-) randomness • A Distribution uses an engine to fire off variates • Flat() • Gaussian() • Validity of Randomness • How {unbiased, ergodic, unpredictable} is the engine?
How Engines Can be Bad • Wrong single-number statistics • Wrong mean (c.f. the buggy Linux RandEngine) • Wrong sigma (or higher moment) • Very easy to detect such a major deficiency • Non-uniform distribution with right moments • Possible! (but mathematically pathological) • And certainly possible to an excellent approximation * • Sequence-order dependent flaws • Can be harder to detect • Can still lead to incorrect physics!
An engine with good moments but flawed distribution • Every engine on a real computer! • (But if the granularity is small enough this “flaw” becomes moot)
Typical Sequence Flaws • Tom Nash, doing a thesis in the 70’s, found that pairs of numbers from the “random” engine in use lied in stripes. • He was far from the first to notice this! • But it was affecting the physics
Testing an Engine • The general test: • Start with a mathematical observation: “If Sn is an ergodic sequence, uniformly distributed on (0,1), then for all subsequences T, some property P should hold.” • Property P should hold, to accuracy e(n) and with probability p(n), where e(n)0 and p(n) 1 for large n. • Apply the “Does P hold” test to a bunch of sequences generated by the engine. • An Engine that passes every possible test based on every possible property to perfect precision would be a perfect random engine • But mere mortals have to pick some set of tests • And have to settle for some finite precision on each test
The DIEHARD Suite • Invented (more like collected) by Marsaglia. • 13 tests (for 13 properties of the sequence). • Each test uses the sequence to generate a group of numbers, with mathematically known distribution for random sequences. • The distribution obtained is compared via a Kolmogorov-Smyrnoff test with the proper distribution. • If any P-value is extremely close to 1, this is proof of non-randomness of the sequence.
Interpreting the Results of a test suite • The longer the sequences used, the more discrimination power. • If no test fails, that is not proof that the sequence is random: • It is proof that either the tester was not clever enough to find and expose non-random properties, or that the sequence used was not long enough. • But for practical uses, some suite of tests at some level of N is “good enough.”
The DIEHARD Tests • Birthday Spacings • The likelihood of j birthdays being shared among m people in an n-day year is Poisson distributed with mean m3/(4n). • Not order-sensitive. • Overlapping 5-permutation Test • Each 5-number subsequence forms one of 120 possible ordering numbers. The count for each should be equal and uncorrelated. • Binary Rank Test (32x32 and 6x8) • The distribution of ranks of 32x32 bit matrices is known theoretically. This detects many linear realations involving up to 1000 bits of the sequence. • Bitstream Test • For a random stream of 221 bits, the number of “missing” 20-bit “words” should be normally distributed with mean 142,000 and sigma 428.
and… • Overlapping Pairs and Quadruples, Sparse Occupancy • Missing 2- and 4-”letter” words in a long sequence of “letters” from a 1024-letter alphabet • Not very order sensitive past N=20 or 40 • DNA Test • Missing “10-genome” sequences in the 4-letter genome alphabet • Count-the-1-’s test on a stream of or specific bytes • Uses the number of 1’s in 8-bit sequences, to form pseudo-letters, which are then grouped into 5-letter words. • Runs Test • The theoretical distributions and covariances fro up and down runs of N numbers are known.
and… • Overlapping Pairs and Quadruples, Sparse Occupancy • Missing 2- and 4-”letter” words in a long sequence of “letters” from a 1024-letter alphabet • Not very order sensitive past N=20 or 40 • DNA Test • Missing “10-genome” sequences in the 4-letter genome alphabet • Count-the-1-’s test on a stream of or specific bytes • Uses the number of 1’s in 8-bit sequences, to form pseudo-letters, which are then grouped into 5-letter words. • Runs Test • The theoretical distributions and covariances fro up and down runs of N numbers are known.
and… • Craps test • A long sequence of games of craps is simulated. The distribution of wins and losses, and of the number of dice throws for each game, is known. • Parking Lot Test • Make a number of attempts to park a circle in a random “lot” of a square of side 100. Re-try if there are collisions. After 120,000 tries, the number of successes shuld match some (experimentally determined) distribution. • Overlapping Sums Test • The sums of 100 variables should be virtually normally distributed. • Sensitive to medium-scale imperfections. • Squeeze Test • Number of iterations needed to reduce 231 down to 1, by doing k = ceiling(k*random()).
The Minimum Distance Test • The last two tests are the MD test in 2 and 3 dimensions: • Use N pairs (triplets) of randoms as coordinates of N points in the unit volume. • The closest pair of points has some distance d. • d has a “known” theoretical distribution • As applied by Marsaglia, most generators semi-fail the 2-D Minimum distance test!
The Minimum Distance Test • The last two tests are the MD test in 2 and 3 dimensions: • Use N pairs (triplets) of randoms as coordinates of N points in the unit volume. • The closest pair of points has some distance d. • d has a “known” theoretical distribution • As applied by Marsaglia, most generators semi-fail the 2-D Minimum distance test! • P-values of .99 are typical. • They did not semi-fail when the FORTRAN versions of the tests were applied. • The f2c version showed this behavior. • ?
The Minimum Distance Test • Marsaglia speculated this was due to some funny business (error) with converting array code using f2c. • Turns out it was a function of being able to do more iterations on later processors. • The theoretical distribution had been calculated only to lowest order. By the time ZOOM did these tests, we were exposing the fact that the actual distribution differed. • See my Technical memo TM2170, where I do the next few orders. This resolves the discrepancy.
The CLHEP Engines • JamesRandom • Fred James’ adaptation of Marsaglia’s add-with-carry generator. Large state. • Ranecu • Multiplicative Congruential generator using formula constants of L'Ecuyer. Large state. • Ranecu • Multiplicative Congruential generator using formula constants of L'Ecuyer. Loarge state. • Ranlux and Ranlux64 • Luescher’s “folding” generator. For luxury levels of 2+ it is excellent; for luxury 4 is “provably” ergodic. Moderate state. • TripleRand • Used in the Canopy Lattice gauge computations, this combines a Hurd primitive polynomial generator with two other unrelated generators to form a family of trustworthy yet distinct sequences. Moderate state.
The CLHEP Engines • JamesRandom • Fred James’ adaptation of Marsaglia’s add-with-carry generator. Large state. • Ranecu • Multiplicative Congruential generator using formula constants of L'Ecuyer. Large state. • Ranecu • Multiplicative Congruential generator using formula constants of L'Ecuyer. Loarge state. • Ranlux and Ranlux64 • Luescher’s “folding” generator. For luxury levels of 2+ it is excellent; for luxury 4 is “provably” ergodic. Moderate state. • TripleRand • Used in the Canopy Lattice gauge computations, this combines a Hurd primitive polynomial generator with two other unrelated generators to form a family of trustworthy yet distinct sequences. Moderate state.
More CLHEP Engines • Hurd288 and Hurd 160 • Based on primitive polynomials over Z(2). Provably ergodic (but I don’t know how well to trust the proof). Used as basis for TripleRand and DualRand. Small state. • RanshiEngine • Modeled on spinning balls. Very large state, but fast and excellent. • MTwistEngine • Based on number-theory considerations involving large Mersenne primes. Very Large state. Many randomness properties proven, though the math is difficult. Fast and excellent (can be made even faster…) • All the above engines have excellent randomness properties.
More CLHEP Engines • Hurd288 and Hurd 160 • Based on primitive polynomials over Z(2). Provably ergodic (but I don’t know how well to trust the proof). Used as basis for TripleRand and DualRand. Small state. • RanshiEngine • Modeled on spinning balls. Very large state, but fast and excellent. • MTwistEngine • Based on number-theory considerations involving large Mersenne primes. Very Large state. Many randomness properties proven, though the math is difficult. Fast and excellent (can be made even faster…) • All the above engines have excellent randomness properties.
And some Low-quality CLHEP Engines • DRand48Engine • Based on a common Unix system random number generator. Very small state and fast, but fails several randomness tests. • RandEngine • Directly calls the system function rand(). Not portable; state not savable. Its claim to fame is that the usual implementation of rand() is dissected in Knuth and shown to be a terrible generator!! • RandEngine, when coded properly, fails half of the DIEHARD tests • NonRandomEngine • Lets people test program behavior on a set forced sequence of randoms. Requires user to specify a sequence. Useful for testing. If you use this to try to generate true random numbers, you get what you deserve. • Why was RandEngine left in CLHEP? • According to a web page circa 1997, because Geant4 insisted on it. • Fortunately, they then proceeded not to use it.
How the CLHEP Engines do on DIEHARD • DRand48Engine failed the three “alphabet sequences” tests • indicating bad correllations at the range of 5-number subsequences • Ranlux at luxury level 0 (which was never recommended) failed the Birthday and Craps tests. • At luxury level above 1 it is fine
How the CLHEP Engines do on DIEHARD • Several engines semi-failed the Minimum Distance test • The more numbers we could afford to generate, the more likely the failure • We now understand the flaw • A summer student in 1998 coded the test in C++ from scratch, with my correction terms in the distribution, and all the “good” engines passed. • All engines other than RandEngine and DRand48 pass the entire DIEHARD Suite.
RandEngine • On systems until Linux, rand() used a standardized 15-bit linear congruence • Lame, see Knuth • But kind of OK. • At some point, an attempt was made in CLHEP to combine 2 calls to rand() by a shift and OR – to get 3 bits of randomness. • Still just as lame
RandEngine • This coding was blown, in that when Linux came along with a 32-bit result for rand(), the OR caused the average value returned to be 5/8!!!! • Results from programs run on Linux which used RandEngine must be discarded. • Fortunately, it appears nobody trusted RandEngine anyway • CDF uses Ranecu • Geant4 does not have RandENgine anywhere in there source code.
Recommendations • Ranecu is perfectly fine, no need to change. • Ranlux on level 2 is about as fast but “provably” reliable. • MTwist (if you have only a few instances so can afford a very large state), TripleRand, and JamesEngine are about twice as fast and very probably just as good or better. • Ranshi is faster still but has a huge state, and I’m not 100% convinced it can’t have flaws.