Many computer software applications have historically required a supply of random numbers. For example, Monte Carlo simulations of physical phenomena, such as large-scale weather simulations, depend on a steady supply of random numbers. Other applications requiring random numbers include casino games and on-line gambling, which simulate card shuffling, dice rolling, etc.; lottery number creation; the generation of data for statistical analysis, such as for psychological testing; and computer games.
The quality of randomness needed, as well as the performance requirements for generating random numbers, differs among these types of applications. Many applications such as computer games have trivial demands on quality of randomness. Applications such as psychological testing have more stringent demands on quality, but the performance requirements are relatively low. Large-scale Monte Carlo-based simulations, however, have very high performance requirements and require good statistical properties of the random numbers, although non-predictability is not particularly important. Other applications, such as on-line gambling, have very stringent randomness requirements as well as stringent non-predictability requirements.
While these historical applications are still important, computer security generates the greatest need for high-quality random numbers. The recent explosive growth of PC networking and Internet-based commerce has significantly increased the need for a variety of security mechanisms.
High-quality random numbers are essential to all major components of computer security, which are confidentiality, authentication, and integrity.
Data encryption is the primary mechanism for providing confidentiality. Many different encryption algorithms exist, such as symmetric, public-key, and one-time pad, but all share the critical characteristic that the encryption/decryption key must not be easily predictable. The cryptographic strength of an encryption system is essentially the strength of the key, i.e., how hard it is to predict, guess, or calculate the decryption key. The best keys are long, truly random numbers, and random number generators are used as the basis of cryptographic keys in all serious security applications.
Many successful attacks against cryptographic systems have focused not on the encryption algorithm but on its source of random numbers. As a well-known example, an early version of Netscape's Secure Sockets Layer (SSL) collected data from the system clock and process ID table to create a seed for a software pseudo-random number generator. The resulting random number was used to create a symmetric key for encrypting session data. Two graduate students broke this mechanism by developing a procedure for accurately guessing the random number, and thus the session key, in less than a minute.
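The weakness described above can be sketched in a few lines. This is an illustration only, not a reconstruction of the actual attack: the seed-mixing function `weak_seed`, the 16-byte key derivation, and the search bounds are all hypothetical, and the comparison against an observed key stands in for whatever check (such as trial decryption) an attacker would really use.

```python
import random

def weak_seed(timestamp_s: int, pid: int) -> int:
    """Hypothetical seed: clock seconds combined with a process ID."""
    return (timestamp_s << 16) ^ pid

def make_session_key(seed: int) -> bytes:
    """Derive a 16-byte key from a seeded software PRNG (illustrative only)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(16))

def recover_seed(observed_key: bytes, approx_time: int,
                 window_s: int = 5, max_pid: int = 4096):
    """Try every (time, pid) pair near approx_time until the key matches."""
    for t in range(approx_time - window_s, approx_time + window_s + 1):
        for pid in range(max_pid):
            seed = weak_seed(t, pid)
            if make_session_key(seed) == observed_key:
                return seed
    return None  # seed lies outside the searched range
```

With the default bounds the search covers only about 45,000 candidate seeds, which is why a seed built from guessable inputs offers little protection regardless of the key length.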
As with decryption keys, the strength of a password used to authenticate a user for access to information is effectively how hard it is to predict or guess the password. The best passwords are long, truly random numbers. In addition, in authentication protocols that use a challenge, the critical factor is that the challenge be unpredictable by the component being authenticated. Random numbers are used to generate the authentication challenge.
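A challenge-based exchange of the kind mentioned above might look like the following minimal sketch. The text names no particular algorithm; the use of HMAC-SHA256 over a shared key, and the function names `new_challenge`, `respond`, and `verify`, are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

def new_challenge(n_bytes: int = 16) -> bytes:
    """Verifier side: draw a fresh challenge from the OS random source."""
    return secrets.token_bytes(n_bytes)

def respond(shared_key: bytes, challenge: bytes) -> bytes:
    """Claimant side: prove knowledge of the key without revealing it."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()

def verify(shared_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Verifier side: recompute the expected response, compare in constant time."""
    expected = respond(shared_key, challenge)
    return hmac.compare_digest(expected, response)
```

If the challenge were predictable, a party could collect a valid response in advance and replay it later, which is why the challenge is drawn from a random source.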
Digital signatures and message digests are used to guarantee the integrity of communications over a network. Random numbers are used in most digital signature algorithms to make it difficult for a malicious party to forge the signature. The quality of the random number directly affects the strength of the signature. In summary, good security requires good random numbers.
Numbers by themselves are not random. The definition of randomness must include not only the characteristics of the numbers generated, but also the characteristics of the generator that produces the numbers. Software-based random number generators are common and are sufficient for many applications. However, for some applications software generators are not sufficient. These applications require hardware generators that generate numbers with the same characteristics as numbers generated by a random physical process. The important characteristics are the degree to which the numbers produced have a non-biased statistical distribution, are unpredictable, and are irreproducible.
Having a non-biased statistical distribution means that all values have equal probability of occurring, regardless of the sample size. Almost all applications require a good statistical distribution of their random numbers, and high-quality software random number generators can usually meet this requirement. A generator that meets only the non-biased statistical distribution requirement is called a pseudo-random number generator.
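One very simple check of the distribution requirement is to count the fraction of 1 bits in a sample, which should be close to one-half. The helper below is a minimal sketch with a hypothetical name (`ones_fraction`); it is far weaker than a real statistical test suite and is meant only to make the requirement concrete.

```python
def ones_fraction(data: bytes) -> float:
    """Return the fraction of bits in data that are 1."""
    total_bits = len(data) * 8
    ones = sum(bin(byte).count("1") for byte in data)
    return ones / total_bits

# A plain counter passes this check perfectly even though every
# value in it is completely predictable.
counter_bytes = bytes(range(256))
print(ones_fraction(counter_bytes))  # prints 0.5
```

The counter example illustrates why a good distribution alone says nothing about unpredictability.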
Unpredictability means that the probability of correctly guessing the next bit of a sequence of bits should be exactly one-half, regardless of the values of the previous bits generated. Some applications do not require the unpredictability characteristic; however, it is critical to random number uses in security applications. If a software generator is used, meeting the unpredictability requirement effectively requires that the software algorithm and its initial values be hidden. From a security viewpoint, a hidden algorithm approach is very weak. Examples of security breaks of software applications using a predictable hidden algorithm random number generator are well known. A generator that meets both of the first two requirements is called a cryptographically secure pseudo-random number generator.
In order for a generator to be irreproducible, two of the same generators, given the same starting conditions, must produce different outputs. Software algorithms do not meet this requirement. Only a hardware generator based on random physical processes can generate values that meet the stringent irreproducibility requirement for security. A generator that meets all three requirements is called a truly random number generator.
Software algorithms are used to generate most random numbers for computer applications. These are called pseudo-random number generators because the characteristics of these generators cannot meet the unpredictability and irreproducibility requirements. Furthermore, some do not meet the non-biased statistical distribution requirements.
Typically, software generators start with an initial value, or seed, sometimes supplied by the user. Arithmetic operations are performed on the initial seed to produce a first random result, which is then used as the seed to produce a second result, and so forth. Software generators are necessarily cyclical: ultimately, they repeat the same sequence of output. Guessing the seed is equivalent to being able to predict the entire sequence of numbers produced. The unpredictability is therefore only as good as the secrecy of the algorithm and initial seed, which may be an undesirable characteristic for security applications. Furthermore, software algorithms are reproducible because they produce the same results starting with the same input. Finally, software algorithms do not necessarily generate every possible value within the range of the output data size, which may reflect poorly on the non-biased statistical distribution requirement.
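These properties are easy to see in a toy generator. The sketch below uses a linear congruential generator, chosen here purely for illustration (the text does not name a specific algorithm), with a deliberately tiny modulus so that the cycle is visible. The function names and parameter values are assumptions.

```python
def lcg_step(state: int, a: int, c: int, m: int) -> int:
    """One step of a linear congruential generator: the output is the next seed."""
    return (a * state + c) % m

def visited_values(seed: int, a: int, c: int, m: int) -> list:
    """Collect outputs until one repeats, i.e. until the sequence starts to cycle."""
    seen = []
    state = lcg_step(seed, a, c, m)
    while state not in seen:
        seen.append(state)
        state = lcg_step(state, a, c, m)
    return seen

# Well-chosen parameters visit all 16 values before repeating.
print(len(visited_values(0, a=5, c=3, m=16)))  # prints 16
# Poorly chosen parameters reach only 4 of the 16 possible values.
print(visited_values(1, a=5, c=0, m=16))       # prints [5, 9, 13, 1]
```

Calling `visited_values` twice with the same seed and parameters always returns the same list, which is the reproducibility described above.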
An entropy generator is a form of random number generator that is a hybrid of a software generator and a true hardware generator. Entropy is another term for unpredictability: the more unpredictable the numbers produced by a generator, the more entropy it has. Entropy generators apply software algorithms to a seed generated by a physical phenomenon. For example, a widely used PC encryption program obtains its seed by recording characteristics of mouse movements and keyboard keystrokes for several seconds. These activities may or may not yield numbers with sufficient entropy, and they usually require some user involvement. The most undesirable characteristic of most entropy generators is that they are very slow to obtain sufficient entropy.
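The hybrid approach can be sketched as follows. This is a minimal illustration under stated assumptions: the event timestamps are supplied by the caller (standing in for recorded keystroke or mouse timings), SHA-256 is an arbitrary choice of mixing function, and Python's `random.Random` is a general-purpose generator that is not suitable for real cryptographic use. The names `seed_from_events` and `make_generator` are hypothetical.

```python
import hashlib
import random

def seed_from_events(event_times_ns) -> int:
    """Hash the intervals between physical events into a 256-bit seed."""
    digest = hashlib.sha256()
    previous = None
    for t in event_times_ns:
        if previous is not None:
            interval = t - previous
            digest.update(interval.to_bytes(8, "little", signed=True))
        previous = t
    return int.from_bytes(digest.digest(), "big")

def make_generator(event_times_ns) -> random.Random:
    """Seed a software generator from the physically derived value."""
    return random.Random(seed_from_events(event_times_ns))
```

The quality of the output depends entirely on how unpredictable the recorded intervals are, and collecting enough of them takes time, which matches the slowness noted above.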
It should be clear from the foregoing that certain applications, including security applications, require truly random numbers, which can only be generated by a random physical process, such as the thermal noise across a semiconductor diode or resistor, the frequency instability of a free-running oscillator, or the amount a semiconductor capacitor is charged during a particular time period. Hardware random number generators employ random physical processes such as these to generate numbers with desirable random number characteristics, if not truly random characteristics.
Another advantage of hardware random number generators, in addition to their desirable random number characteristics, is that they can typically produce random numbers at a faster rate than software random number generators or entropy generators. In fact, a hardware random number generator may be capable of generating random numbers at an overall faster rate than the software application can consume them. However, many applications tend not to demand the random numbers at a constant rate. Instead, the applications tend to request a relatively large number of random data bytes in a chunk, use the large chunk, and then request another large chunk after a relatively large amount of time. For example, an encryption program might request 16 random data bytes, then use the 16 bytes to perform encryption for a relatively long time, and then ask for another 16 random data bytes. In addition, the operating system may perform a task switch and allow itself or another application to run on the processor for a while, during which time the software application consuming the random data bytes is not demanding random data. If the hardware random number generator is not generating random data while the application is not requesting it, then the next time the application requests more random data bytes the application will have to wait for the generator to generate the bytes. This detrimentally affects the performance of the application.
Therefore, what is needed is a hardware random number generator that asynchronously generates random data and buffers the random data in such a way as to provide good performance characteristics.
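The buffering idea can be illustrated with a software analogy. The sketch below is not the hardware generator itself: it assumes `os.urandom` as a stand-in for the physical source, a background thread as the asynchronous producer, and a bounded queue as the buffer. The class name `BufferedRandomSource` and its parameters are hypothetical.

```python
import os
import queue
import threading

class BufferedRandomSource:
    """Fill a bounded buffer in the background so reads rarely have to wait."""

    def __init__(self, capacity: int = 64, source=os.urandom):
        self._buffer = queue.Queue(maxsize=capacity)
        self._source = source  # stand-in for the physical generator
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._fill, daemon=True)
        self._worker.start()

    def _fill(self):
        # Produce bytes continuously; pause only while the buffer is full.
        while not self._stop.is_set():
            byte = self._source(1)
            while not self._stop.is_set():
                try:
                    self._buffer.put(byte, timeout=0.05)
                    break
                except queue.Full:
                    continue

    def read(self, n: int) -> bytes:
        """Return n bytes, blocking only if the buffer has run dry."""
        return b"".join(self._buffer.get() for _ in range(n))

    def close(self):
        self._stop.set()
        self._worker.join()
```

An application that asks for 16 bytes, works for a while, and then asks for 16 more would typically find the second request already satisfied from the buffer, since the producer kept running in the meantime.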