bug in Random
- To: mathgroup at smc.vnet.net
- Subject: [mg47325] bug in Random
- From: ve10 at cornell.edu (Veit Elser)
- Date: Mon, 5 Apr 2004 05:23:08 -0400 (EDT)
- Sender: owner-wri-mathgroup at wolfram.com
There's a flaw in Mathematica's random number generator for large integers that can have serious consequences in some applications. The example below illustrates my point (I'm using version 5.0.1.0): r = Table[Random[Integer, {1, 2^62}], {10^5}]; Union[Table[r[[i]] + r[[i + 24]] - r[[i + 20]], {i, 10^5 - 24}]] {-2147483647, 1, 2147483649, 4611686016279904256, 4611686018427387904, 4611686020574871552} We see that every combination r_i + r_(i+24) - r_(i+20) obtained from the list of 10^5 numbers gives only six outcomes (1-2^31, 1, 1+2^31, 2^62-2^31, 2^62, 2^62+2^31) ! This only happens when the size of the integers generated by Random[] is large. The onset seems to be near the size used in the example, 2^62. There are other linear relations among successive outputs of Random[], this seems to be the simplest. I discovered this flaw when using Mathematica to generate instances of the knapsack (subset sum) problem. The function Random[] was used to generate a sequence of large integers r ("knapsack"), a binary sequence x was chosen as the key, and the sum is given by r.x = s. In cryptographic contexts the knapsack and sum (r and s) are public information and the key (x) is private. The scheme is not secure if an adversary can determine x from r and s. In the 1980s Lagarias and Odlyzko found that when the number of bits in the knapsack integers is sufficiently large in relation to the length of the knapsack, the key could be found by a trick that involves using the LLL algorithm on a suitably defined lattice basis. When I used this method on the instances generated by Mathematica, I found that my knapsacks were highly insecure: LLL could discover x even when the knapsack parameters were such that the probability should have been extremely low. After some experimentation the issue was resolved when it came to light that LLL was discovering the linear relations of the kind described above. The Mathematica documentation on random number generators mentions a cellular automaton algorithm as well as subtract-with-borrow. I suspect that this flaw is not intrinsic to either of these, but rather the result of how large integers are assembled out of 31-bit pieces. It's easy to eliminate the linear relations by reversing the base-10 digits (since this scrambles the base-2 correlations, reverseDigits[x_] := FromDigits[Reverse[IntegerDigits[x]]]) but this is obviously not a solution for cryptographic applications.