The invention relates generally to random number generation, and more particularly to techniques for generating unbiased bit streams from a sequence of readings taken from a possibly biased random source.
Sequences of random numbers are of considerable importance in cryptography and many other computing applications, such as stochastic simulation, search heuristics, and game playing. Because truly random numbers are a scarce resource, it is common practice to derive such sequences from pseudo-random number generators. A pseudo-random number generator is a device which takes a truly random input and xe2x80x9cstretchesxe2x80x9d it to produce a long sequence of numbers bearing an appearance of randomness. Although there is a large body of literature on the design and properties of pseudo-random number generators, much less attention has been devoted to the physical (generation and processing of the random input seeds that fuel these generators. It is common practice in the literature to obtain a random seed by invoking a so-called xe2x80x9cuniform random source.xe2x80x9d Practitioners have called into service a variety of physical sources of randomness. These include system clocks, radioactive sources, quantum mechanical effects in semiconductor devices, hard disk timings, and keyboard and mouse timings. Timings of human interaction with a keyboard or mouse are currently the most common source of random seeds for cryptographic applications on personal computers. After a sufficient amount of such timing data is gathered, it is generally hashed down to a 128-bit or 160-bit seed. However, this method relies for its security guarantees on unproven or unprovable assumptions about the entropy generated by human users and the robustness of hash functions as entropy extractors.
Many of these sources of randomness, such as radioactive sources or hard disk timings, yield data from probability distributions that are stationary. In other words, the output distribution of these sources does not change over time. Even if a source is stationary, though, it generally has a bias. In other words, the source does not give unbiased bits as direct output. Many applications, especially in cryptography, rely on sequences of unbiased bits. It is therefore quite important to be able to extract unbiased bits efficiently from a stationary source with unknown bias.
Suppose that a reading obtained from a stationary source of randomness D can be equal to any one of m different values, but that the probability of obtaining any one of these values is unknown. Such a source of randomness can be thought of as a die D with m sides, i.e., taking readings from the random source is like rolling the die D. The m sides of D are not necessarily equally probable. In other words, the die D may be biased. A number of techniques have been developed which attempt to obtain unbiased random bits from biased sources of randomness. Such techniques are described in, for example, J. von Neumann, xe2x80x9cVarious Techniques used in Connection with Random Digits,xe2x80x9d In National Bureau of Standards, Applied Math Series, Vol. 12, pp. 36-38, 1951, Notes by G. E. Forsythe, Reprinted in Neumann""s Collected Works, Vol. 5, Pergamon Press, 1963; P. Elias, xe2x80x9cThe Efficient Construction of an Unbiased Random Sequence,xe2x80x9d Ann. Math. Statist., 43(3):865-870, 1972; M. Blum, xe2x80x9cIndependent Unbiased Coin Flips from a Correlated Biased Source: A Finite State Markov Chain, 25th IEEE Symposium on Foundations of Computer Science, pp. 425-433, 1984; M. Santha and U. Vazirani, xe2x80x9cGenerating Quasi-Random Sequences from Slightly-Random Sources,xe2x80x9d 25th IEEE Symposium on Foundations of Computer Science, pp. 434-440, 1984; and D. Feldmann et al., xe2x80x9cOn Dice and Coins: Models of Computation for Random Generation,xe2x80x9d Information and Computation, 104(2):159-174, June 1993. Unfortunately, these and other conventional techniques fail to provide adequate practical solutions to the problem of efficiently extracting unbiased random bits from biased physical sources of randomness.
The invention provides efficient techniques for extracting unbiased bits from potentially biased physical sources of randomness. In an illustrative embodiment, a stationary physical source of randomness, such as a source which can be represented as a biased die having an unknown bias, may be used to generate unbiased random bits. A simulated unbiased source is generated using readings from the potentially biased source, and a reading is taken from the simulated unbiased source. The reading from the simulated unbiased source is then converted to a bit string. Taking a reading from the simulated unbiased source may involve, for example, generating an integer pair (R,S), which depends on the sequence of readings from the random source. The integer pair (R,S) represents a roll of value R on a simulated unbiased source corresponding to an unbiased die U with S sides. The pair (R,S) is then converted into an output bit string bkbkxe2x88x921 . . . b1 which is unbiased over sequences of readings from the random source.
In accordance with another aspect of the invention, the integer pair (R,S) may be generated by first generating an ordered list of all S possible permutations of the elements of the sequence of readings taken from the random source, and then selecting from the ordered list an Rth element which corresponds to the sequence of readings taken from the random source. In another possible implementation, the integer pair (R,S) may be generated by simulating a roll on each of a number of sources represented as unbiased dice, using information in the sequence of readings taken from the random source, and combining the rolls for these sources to simulate the roll R on a source which may be represented as an unbiased die U with S sides. The reading from the simulated unbiased source may be converted to a bit string by partitioning the sides {1, 2, . . . , S} of U into sets A1, A2, . . . , Aj such that the set sizes |A1|, |A2|, . . . , |Aj| are unique, decreasing powers of two, and then assigning a mapping from elements of each set to a corresponding set of bit strings. In another possible implementation, the reading from the simulated unbiased source may be converted to a bit string by, for example, converting the values Rxe2x88x921 and S into k-bit integers, comparing the resulting bit strings rkrkxe2x88x921 . . . r1 and skskxe2x88x921 . . . s1 to locate a pair of bits rj and sj such that sj=1 and rj=0, and outputting the bits rjxe2x88x921rjxe2x88x922 . . . r1.
The techniques of the invention can be shown to be optimally efficient in terms of output entropy, and are computationally practical to implement. More particularly, an extractor in accordance with the invention can be configured to output a maximal expected number of bits for a given number of readings, but, unlike certain conventional techniques, has no minimal number of readings required before it can produce output. The invention is therefore particularly well suited for use in extracting useful randomness in the form of bit strings from any stationary physical source.