Historically, many computer software applications have required a supply of random numbers. For example, Monte Carlo simulations of physical phenomena, such as large-scale weather simulations, require a supply of random numbers. Other examples of applications requiring random numbers are casino games and on-line gambling, which simulate card shuffling, dice rolling, etc.; lottery number creation; the generation of data for statistical analysis, such as for psychological testing; and computer games.
The quality of randomness needed, as well as the performance requirements for generating random numbers, differs among these types of applications. Many applications such as computer games have trivial demands on quality of randomness. Applications such as psychological testing have more stringent demands on quality, but the performance requirements are relatively low. Large-scale Monte Carlo-based simulations, however, have very high performance requirements and require good statistical properties of the random numbers, although non-predictability is not particularly important. Other applications, such as on-line gambling, have very stringent randomness requirements as well as stringent non-predictability requirements.
While these historical applications are still important, computer security generates the greatest need for high-quality random numbers. The recent explosive growth of PC networking and Internet-based commerce has significantly increased the need for a variety of security mechanisms.
High-quality random numbers are essential to all major components of computer security, which are confidentiality, authentication, and integrity.
Data encryption is the primary mechanism for providing confidentiality. Many different encryption algorithms exist, such as symmetric, public-key, and one-time pad, but all share the critical characteristic that the encryption/decryption key must not be easily predictable. The cryptographic strength of an encryption system is essentially the strength of the key, i.e., how hard it is to predict, guess, or calculate the decryption key. The best keys are long truly random numbers, and random number generators are used as the basis of cryptographic keys in all serious security applications.
Many successful attacks against cryptographic algorithms have focused not on the encryption algorithm but instead on its source of random numbers. As a well-known example, an early version of Netscape's Secure Sockets Layer (SSL) collected data from the system clock and process ID table to create a seed for a software pseudo-random number generator. The resulting random number was used to create a symmetric key for encrypting session data. Two graduate students broke this mechanism by developing a procedure for accurately guessing the random number, and from it the session key, in less than a minute.
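The weakness described above can be illustrated with a minimal sketch. It is not the Netscape code: the functions `make_seed`, `session_key`, and `recover_seed`, the way the clock value and process ID are combined, and the use of Python's `random.Random` as the software generator are all assumptions made for illustration. The point is only that when a seed is built from a few guessable quantities, an attacker can enumerate the candidates and regenerate the key.

```python
import random

def make_seed(time_s, pid):
    """Combine a clock value and a process ID into a seed (hypothetical scheme)."""
    return (time_s << 16) ^ pid

def session_key(seed, nbytes=16):
    """Derive a key from a seeded software generator (toy stand-in only)."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(nbytes))

def recover_seed(target_key, time_guesses, pid_guesses):
    """Try every plausible (time, pid) pair until the regenerated key matches."""
    for t in time_guesses:
        for pid in pid_guesses:
            seed = make_seed(t, pid)
            if session_key(seed, len(target_key)) == target_key:
                return seed
    return None

# The victim creates a key from values an observer can narrow down.
victim_seed = make_seed(1_000_000_007, 4321)
key = session_key(victim_seed)

# The attacker only needs the approximate time and a range of process IDs.
found = recover_seed(key, range(1_000_000_000, 1_000_000_010), range(4300, 4340))
```

Here the search space is only 400 candidates, so the loop finishes almost instantly; the key is 128 bits long, but its effective strength is no greater than the number of seeds the attacker must try.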
As with decryption keys, the strength of a password used to authenticate a user for access to information is effectively how hard it is to predict or guess the password. The best passwords are long truly random numbers. In addition, in authentication protocols that rely on a challenge, the critical factor is for the challenge to be unpredictable by the authenticating component. Random numbers are used to generate the authentication challenge.
Digital signatures and message digests are used to guarantee the integrity of communications over a network. Random numbers are used in most digital signature algorithms to make it difficult for a malicious party to forge the signature. The quality of the random number directly affects the strength of the signature. In summary, good security requires good random numbers.
Numbers by themselves are not random. The definition of randomness must include not only the characteristics of the numbers generated, but also the characteristics of the generator that produces the numbers. Software-based random number generators are common and are sufficient for many applications. However, for some applications software generators are not sufficient. These applications require hardware generators that generate numbers with the same characteristics as numbers generated by a random physical process. The important characteristics are the degree to which the numbers produced have a non-biased statistical distribution, are unpredictable, and are irreproducible.
Having a non-biased statistical distribution means that all values have equal probability of occurring, regardless of the sample size. Almost all applications require a good statistical distribution of their random numbers, and high-quality software random number generators can usually meet this requirement. A generator that meets only the non-biased statistical distribution requirement is called a pseudo-random number generator.
Unpredictability refers to the fact that the probability of correctly guessing the next bit of a sequence of bits should be exactly one-half, regardless of the values of the previous bits generated. Some applications do not require the unpredictability characteristic; however, it is critical to random number uses in security applications. If a software generator is used, meeting the unpredictability requirement effectively requires that the software algorithm and its initial values be hidden. From a security viewpoint, a hidden algorithm approach is very weak. Examples of security breaks of software applications that relied on a hidden but predictable random number generator algorithm are well known. A generator that meets the first two requirements is called a cryptographically secure pseudo-random number generator.
In order for a generator to be irreproducible, two of the same generators, given the same starting conditions, must produce different outputs. Software algorithms do not meet this requirement. Only a hardware generator based on random physical processes can generate values that meet the stringent irreproducibility requirement for security. A generator that meets all three requirements is called a truly random number generator.
Software algorithms are used to generate most random numbers for computer applications. These are called pseudo-random number generators because the characteristics of these generators cannot meet the unpredictability and irreproducibility requirements. Furthermore, some do not meet the non-biased statistical distribution requirements.
Typically, software generators start with an initial value, or seed, sometimes supplied by the user. Arithmetic operations are performed on the initial seed to produce a first random result, which is then used as the seed to produce a second result, and so forth. Software generators are necessarily cyclical. Ultimately, they repeat the same sequence of output. Guessing the seed is equivalent to being able to predict the entire sequence of numbers produced. The unpredictability is therefore only as good as the secrecy of the algorithm and initial seed, which may be an undesirable characteristic for security applications. Furthermore, software algorithms are reproducible because they produce the same results starting with the same input. Finally, software algorithms do not necessarily generate every possible value within the range of the output data size, which may reflect poorly on the non-biased statistical distribution requirement.
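These properties can be seen in a minimal sketch of one common kind of software generator, a linear congruential generator. The function names and the deliberately tiny parameters (`a`, `c`, `m`) are assumptions chosen so that the cycle is short enough to inspect; practical generators use much larger values.

```python
def lcg(seed, a=5, c=3, m=16):
    """Yield outputs of a toy linear congruential generator.

    Each result is computed from the previous one, so the previous
    result effectively serves as the seed for the next.
    """
    state = seed
    while True:
        state = (a * state + c) % m
        yield state

def take(gen, n):
    """Return the next n values from a generator."""
    return [next(gen) for _ in range(n)]

def cycle_length(seed, **params):
    """Count how many outputs occur before the sequence starts repeating."""
    seen = {}
    for i, value in enumerate(lcg(seed, **params)):
        if value in seen:
            return i - seen[value]
        seen[value] = i

# Reproducible: the same seed always gives the same sequence.
same = take(lcg(7), 5) == take(lcg(7), 5)

# Cyclical: with these parameters the output repeats after 16 values.
full_period = cycle_length(1)

# Poorly chosen parameters visit only a few of the 16 possible values.
short_period = cycle_length(1, a=3, c=0, m=16)
distinct_values = sorted(set(take(lcg(1, a=3, c=0, m=16), 32)))
```

With `a=3, c=0, m=16` and a seed of 1, the generator cycles through only the values 1, 3, 9, and 11, which illustrates how a software algorithm can fail to produce every value in its output range.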
Entropy generators are a form of random number generator that is a hybrid of software generators and true hardware generators. Entropy is another term for unpredictability. The more unpredictable the numbers produced by a generator, the more entropy it has. Entropy generators apply software algorithms to a seed generated by a physical phenomenon. For example, a widely used PC encryption program obtains its seed by recording characteristics of mouse movements and keyboard keystrokes for several seconds. These activities may or may not yield numbers with sufficient entropy, and they usually require some user involvement. The most undesirable characteristic of most entropy generators is that they are very slow to obtain sufficient entropy.
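The general shape of such a hybrid can be sketched as follows. This is an illustration only and does not reproduce any particular program: the function `seed_from_events`, the `(x, y, timestamp)` sample format, the sample values, and the choice of SHA-256 to condense the samples are all assumptions. Python's `random.Random` stands in for the software algorithm and is not itself suitable for security use.

```python
import hashlib
import random

def seed_from_events(events):
    """Condense (x, y, timestamp_ns) input samples into a 256-bit integer seed."""
    h = hashlib.sha256()
    for x, y, t_ns in events:
        h.update(f"{x},{y},{t_ns};".encode())
    return int.from_bytes(h.digest(), "big")

# Hypothetical mouse samples; a real collector would record them over several seconds.
events = [(120, 45, 1000123), (121, 47, 1016890), (125, 52, 1033411)]

seed = seed_from_events(events)
rng = random.Random(seed)          # software algorithm applied to the physical seed
first_values = [rng.randrange(256) for _ in range(4)]
```

The output is only as unpredictable as the recorded samples: if the user barely moves the mouse, the samples carry little entropy no matter how they are hashed, which is why such generators must wait for enough input before producing a seed.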
It should be clear from the foregoing that certain applications, including security applications, require truly random numbers which can only be generated by a random physical process, such as the thermal noise across a semiconductor diode or resistor, the frequency instability of a free-running oscillator, or the amount a semiconductor capacitor is charged during a particular time period. One solution to providing an inexpensive, high-performance hardware random number generator would be to incorporate it within a microprocessor. The random number generator could utilize random physical process sources such as those discussed above, and would be relatively inexpensive, since it would be incorporated into an already existing semiconductor die.
Some applications that use random numbers cannot tolerate long contiguous strings of zero or one bits. For example, the Federal Information Processing Standards (FIPS) Publication 140-2 imposes a long run test within its set of statistical random number generator tests. A random number generator fails the long run test if it generates a run of 26 or more contiguous zeros or ones. However, since a string of 26 contiguous like bits is a distinct random possibility, a truly random number generator will by definition fail such a test if tested for a sufficiently long period of time.
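A check of this kind can be sketched in a few lines. The function names are hypothetical, the threshold is left as a parameter, and treating a run exactly at the threshold as a failure is an assumption of this sketch rather than a statement of the standard's full procedure.

```python
from itertools import groupby

def longest_run(bits):
    """Return the length of the longest run of identical bits (0 for empty input)."""
    return max((sum(1 for _ in group) for _, group in groupby(bits)), default=0)

def fails_long_run(bits, limit=26):
    """Report whether the sample contains a run of `limit` or more like bits."""
    return longest_run(bits) >= limit

sample_ok = [0] * 25 + [1] * 25    # longest run is 25
sample_bad = [1, 0] + [1] * 26     # contains a run of 26 ones

ok_result = fails_long_run(sample_ok)
bad_result = fails_long_run(sample_bad)
```

For unbiased, independent bits, any particular window of 26 bits is all zeros or all ones with probability 2 × 2^-26 = 2^-25, roughly one in 33.5 million. That is rare in a single short sample but becomes likely once enough bits have been generated, which is why a truly random source eventually fails the test.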
Most statistical tests, such as the FIPS 140-2 long run test, are targeted at pseudo-random number generators. The pseudo-random number generators filter out the long contiguous strings of like bits in software. However, as discussed above, for some applications, pseudo-random generators are unacceptable. These applications require hardware random number generators. Therefore, what is needed is an apparatus and method for ensuring that no contiguous strings of zero or one bits longer than certain lengths are output by a hardware random number generator.