The invention pertains to the generation of cryptographically strong random numbers which can be used by a computer system for file encryption, zero knowledge proofs, and other processes which require random numbers.
Much of today's computer-based cryptography requires cryptographically strong random and/or pseudo-random number input (collectively referred to herein as random number input). The cryptographically strong random number input serves as a basis for the generation of cryptographically strong encryption keys. In some systems, a random number will form an encryption key "as is" (i.e., the random number will be used as an encryption key without modification). In other systems, a random number will be used to "seed" an encryption key generation process.
A common form of cryptography which often requires a random number input is public key cryptography. In a form of public key cryptography referred to as RSA (the Rivest, Shamir, and Adleman approach), three elements are required: a public encryption key, a private encryption key, and a modulus. The public encryption key is simply an arbitrary constant, which is usually recommended to be 3 or 65,537. The private encryption key is computed as a mathematical function of two randomly generated large prime numbers (i.e., two random number inputs) and the public encryption key. The modulus is merely the product of the two large prime numbers. To ensure that a user can both send and receive encrypted data, the public encryption key and the modulus are publicly distributed over a network. The private encryption key is not distributed over the network, and is preserved in secrecy by its owner.
To send and receive encrypted data over a network, RSA public key cryptography requires the generation of public and private encryption keys for each user desiring to send and/or receive encrypted data over the network. When User A wants to send encrypted data to User B, User A encrypts their data using User B's public encryption key and modulus. Upon receiving User A's data, User B can decrypt User A's data using their private encryption key and modulus. In other words, User B uses User B's own private encryption key and modulus to decrypt data which User A encrypts using User B's public encryption key and modulus. The success of RSA public key cryptography is based on three principles: 1) if the randomly generated large prime numbers are truly large and truly random, it is extremely difficult for an attacker to factor the user's publicly distributed modulus, and as a result, it is extremely difficult for an attacker to determine the identity of the two prime numbers (a message sent to a specific user can only be decrypted if the two prime numbers factored into the specific user's modulus are known); 2) knowing the algorithm which is used to generate a user's modulus (i.e., multiplication) does not assist an attacker in factoring the user's modulus; and 3) users' private encryption keys can really be kept private since there is no need to transfer copies of these keys from one user to another.
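The relationship between the two primes, the modulus, and the two keys can be illustrated with a minimal sketch. It is a toy under stated assumptions, not a usable implementation: the primes 61 and 53 and the public exponent 17 are hypothetical stand-ins chosen so the arithmetic is easy to follow, the function names are invented for this illustration, and no padding is applied. It requires Python 3.8 or later for the modular inverse form of `pow`.

```python
from math import gcd

def make_keypair(p, q, e):
    """Derive (public key, private key, modulus) from two primes p and q."""
    n = p * q                      # modulus: the product of the two primes
    phi = (p - 1) * (q - 1)        # used only to derive the private key
    if gcd(e, phi) != 1:
        raise ValueError("public key must be coprime to (p-1)*(q-1)")
    d = pow(e, -1, phi)            # private key: modular inverse of e
    return e, d, n

def encrypt(message, e, n):
    """User A encrypts with User B's public key and modulus."""
    return pow(message, e, n)

def decrypt(ciphertext, d, n):
    """User B decrypts with User B's own private key and modulus."""
    return pow(ciphertext, d, n)

# Hypothetical tiny primes standing in for User B's random large primes.
e, d, n = make_keypair(61, 53, e=17)
ciphertext = encrypt(42, e, n)
recovered = decrypt(ciphertext, d, n)
```

Anyone who can factor `n` back into its two primes can recompute `d` in the same way, which is why the primes must be both large and unpredictable.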
The success of RSA public key cryptography therefore depends in part on the generation of two truly random large prime numbers. Many other forms of cryptography also depend on true random number generation. Unfortunately, random numbers are hard to come by in a computer, since by design, computers are very deterministic. If cryptography systems requiring truly random numbers are not provided with truly random numbers, it becomes much more likely that an attacker might be able to decipher encrypted data.
The problem of random number generation is compounded by the fact that many cryptography systems encrypt many different data packets, with each packet, or each small group of packets, using its own random number input. Such systems therefore require a high volume of random number inputs, none of which bears any relation to another.
In addition to public key cryptography, other situations arise in which cryptographically strong random numbers are needed. For example, one situation is in zero knowledge proofs. A zero knowledge proof is a situation in which one party (a person or computer program) needs to prove to another party that they both share a secret (e.g., a password, access to a social security number, an account number, etc.). If the second party cannot prove to the first party that it already knows the secret, then the first party will not want to disclose and/or discuss the secret with the second party. Random numbers are often required as part of a zero knowledge proof. See, for example, the discussions of zero knowledge proofs found in "Strong Password-Only Authenticated Key Exchange" by David P. Jablon, www.IntegritySciences.com, Mar. 2, 1997; "Zero Knowledge Proofs of Identity" by U. Feige, A. Fiat, and A. Shamir, Proceedings of the 19th ACM Symposium on Theory of Computing, pp. 210-217 (1987); "Strongbox: A System for Self Securing Programs" by J. D. Tygar and B. S. Yee, CMU Computer Science: 25th Anniversary Commemorative (1991); and "Multiple non-interactive zero knowledge proofs based on a single random string" by U. Feige, D. Lapidot, and A. Shamir, 31st Annual Symposium on Foundations of Computer Science, Vol. 1, pp. 308-317 (October 1990). One method of conducting zero knowledge proofs is via Simple Password-authenticated Exponential Key Exchange (SPEKE). SPEKE is an attractive protocol for authentications conducted between wireless hand-held devices, set top boxes, diskless work stations, smart cards, etc.
Several solutions to the problem of true random number generation have been posed. A first of these solutions is used in versions of PGP, a software-based cryptography application sold by Network Associates of Santa Clara, Calif. PGP keeps a file comprising random numbers which are produced in response to the time intervals between keystrokes which a user makes as they input data into the PGP program. One disadvantage of this system is that the random number file is kept in a user's file system. The file can therefore be read by a savvy attacker, and/or the values in the file could be forced to values which are supplied by the attacker. In either case, once the attacker knows the values which are queued to be used by the PGP program, it becomes much easier for the attacker to decipher data which is encrypted by the PGP program. Another disadvantage of the PGP program is that only a small amount of random number data is sometimes available since 1) a user is usually not required to input a lot of data into the PGP program, and few keystrokes are therefore made, and 2) depending on the computer and the operating system which the user is using, the time intervals between the user's keystrokes might be quantized to the point where only a few bits of random information are generated in response to each keystroke interval (e.g., four bits of data per keystroke interval). Furthermore, when a user makes numerous keystrokes in succession, the speed at which the user makes successive keystrokes will often be quite uniform. An attacker's knowledge of any of the following can therefore be used to zero in on the range of random number data available to the PGP program: a user's typing speed; a user's computer type and speed; and a user's operating system.
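The effect of coarse quantization can be made concrete with a short calculation. The sketch below is a best-case estimate only: it assumes that every distinguishable interval value is equally likely and that successive intervals are independent, which the uniform typing speed noted above makes unlikely in practice. The 128-bit key size and the function name are assumptions introduced for illustration.

```python
from math import ceil, log2

def intervals_needed(key_bits, distinct_interval_values):
    """Best-case count of keystroke intervals needed to fill a key."""
    bits_per_interval = log2(distinct_interval_values)
    return ceil(key_bits / bits_per_interval)

# 16 distinguishable interval values yield at most 4 bits per interval.
needed = intervals_needed(128, 16)
print(needed)  # 32
```

Any predictability in the user's typing rhythm lowers the information per interval and raises the number of keystrokes required.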
Another problem with using user keystrokes for the generation of random number data is that some devices which need to generate random number data might never (or rarely) receive direct user input. For example, consider a server which runs entirely automated computer programs, and which is accessed by an operator via a keyboard only rarely. Or consider a disk which is directly connected to a network (e.g., a Network Attached Storage (NAS) device). In each of these cases, nearly all accesses to the device are initiated over a network (e.g., the Internet), and direct keyboard input to the device is rarely received. Such devices benefit much less from the PGP program's methods of generating random number data.
A second method which has been posed as a solution to the problem of true random number generation involves measuring the time it takes to access a disk. A disk access is dependent on the location of the data being accessed, the rotational speed of the disk, and the time it takes to move a disk's heads. All of these factors can vary significantly from one disk access to another, and the time it takes to access a disk is therefore "in theory" a good source of random data. In actuality, however, the amount of random data which can be derived from a single disk access is limited by the resolution with which disk access times can be measured on a given system. Quite often, this resolution is low (e.g., eight bits). Furthermore, a disk access is relatively slow in comparison to the speed at which a computer can execute software which has already been loaded into the computer's cache and RAM memories. Computers on which a lot of files are encrypted therefore suffer performance penalties when they must wait on random number input generated in response to relatively slow disk accesses. A solution to this speed problem is the maintenance of a random number file. This, however, leads to the same data security problem posed by the PGP software program (i.e., since the file is stored in a user's file system, an attacker might be able to access the file, and/or seed the file with known values). Yet another problem with timing disk accesses is that absent a real need to access a disk, encryption software will have to request a disk access for the sole purpose of timing the disk access, and not for the purpose of retrieving substantive data from the disk. Such "dummy" accesses increase the mechanical wear on a disk and introduce system delays which might delay another user's legitimate access to the same disk.
A third method which has been posed as a solution to the problem of true random number generation involves adding to a microprocessor a circuit for amplifying and digitizing the thermal noise produced by the microprocessor. The problem with such a circuit is that it is very hard to verify that it is working properly (i.e., it is hard to test). Since the output of the circuit is intended to be random, the circuit can only be tested by acquiring significant amounts of data from the circuit, and then subjecting the data to complex analysis to determine if the data is in fact random. Another problem stems from the fact that implementing such a circuit as part of a microprocessor's hardware makes it impossible to run two microprocessors in "lock-step". The ability to run two microprocessors in lock-step is an important feature, as it enables one to run the same software program on two microprocessors and compare the state of each microprocessor after execution of each instruction in the software program. If the states of the two microprocessors differ after the execution of an instruction, one can assume that one of the microprocessors is faulty. Since slight manufacturing variances in two supposedly identical microprocessors can alter the thermal noise produced by each, the states of the microprocessors' thermal noise circuits will differ. It is therefore impossible to run microprocessors with thermal noise circuits in lock-step.
Other methods for generating truly random numbers tend to generate data which is progressively more random, but generate the data at a greater cost. For example, one researcher has reported that he connected a digital video camera to his computer and pointed it at three lava lights. He then took all of the bits from the digital image generated by the video camera and put them through a hashing function to generate a string of random bits. While the random numbers generated from this string of random bits might be cryptographically strong, it is not practical for most computer users to implement such a system. Radios and microphones coupled to analog-to-digital converters have also been proposed as sources of random data. However, these devices suffer from the same problems as digital video cameras. They can be expensive, and they do not come standard on all computer systems.
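The hashing step described for the camera-based approach can be sketched as follows. The report cited above does not say which hashing function was used; SHA-256 is an assumption made here for illustration, as are the function name and the synthetic byte strings standing in for captured image frames.

```python
import hashlib

def condition_samples(raw: bytes) -> bytes:
    """Compress raw sensor bytes into a fixed-length 32-byte string."""
    return hashlib.sha256(raw).digest()

# Synthetic stand-ins for two captured frames that differ in one byte.
frame_a = bytes(range(256)) * 4
frame_b = bytes([1]) + frame_a[1:]

bits_a = condition_samples(frame_a)
bits_b = condition_samples(frame_b)
```

Hashing does not create randomness. The output is only as unpredictable as the raw samples fed into it, so the quality of the physical source still determines the strength of the result.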
The ultimate source of random data comes from measuring the arrival times of byproducts of radioactive decay. The decay can be background decay (i.e., naturally occurring decay which occurs in the environment around us, but at extremely small levels which require very sensitive detectors to measure) or decay of a radioactive source in a measurement instrument.
Due to the shortcomings and disadvantages of the above methods for generating random numbers, a need exists for a truly random number generator which: 1) generates truly random numbers, 2) is testable, 3) is cost-effective, and/or 4) is very difficult for an attacker to seed with known values and/or reset.
To fulfill some or all of the above needs, the inventor has devised new methods and apparatus for generating cryptographically strong random numbers (e.g., encryption keys). The methods and apparatus obtain random numbers, and/or obtain seed values from which random numbers can be generated, from built-in self-test (BIST) registers. More specifically, the methods and apparatus disclosed herein utilize data maintained by one or more multiple input shift registers (MISRs) as a basis for generating cryptographically strong random numbers.
The MISRs are configured for sampling data which appears on instruction, data and address buses of a microprocessor. Since the data which appears on these buses is dependent upon cache accesses, disk accesses, user input (e.g., keystrokes), network traffic, and other data operations which influence the course of instructions executed, and data consumed, by a microprocessor, the data maintained by such MISRs has a high degree of randomness. The feedback associated with the MISRs causes their data to depend on both recent and historical cache accesses, disk accesses, keystrokes, network traffic, and so on, thus increasing the randomness of their data. Since today's microprocessors perform thousands of operations a second, the random data stored in these MISRs changes at a high rate of speed, and a large quantity of random data is generated.
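A software model may help show how feedback makes a MISR's contents depend on the whole history of bus activity rather than on the latest value alone. The sketch below is an assumption-laden illustration, not the disclosed hardware: the 16-bit width, the tap positions, the class name, and the sample bus words are all hypothetical choices.

```python
class MISR:
    """Software model of a multiple input shift register with feedback."""

    def __init__(self, width=16, taps=(16, 14, 13, 11), seed=0):
        self.width = width
        self.mask = (1 << width) - 1
        self.taps = taps               # 1-indexed bit positions fed back
        self.state = seed & self.mask

    def clock(self, bus_word):
        """Advance one bus cycle, folding the sampled bus word into the state."""
        feedback = 0
        for tap in self.taps:
            feedback ^= (self.state >> (tap - 1)) & 1
        shifted = (self.state << 1) | feedback   # shift, insert feedback at bit 0
        self.state = (shifted ^ bus_word) & self.mask
        return self.state

# Two hypothetical bus histories that differ only in their second word.
history_a = [0x1234, 0xBEEF, 0x0001, 0xFFFF]
history_b = [0x1234, 0xBEEE, 0x0001, 0xFFFF]

misr_a, misr_b = MISR(), MISR()
for word in history_a:
    misr_a.clock(word)
for word in history_b:
    misr_b.clock(word)
```

Although the two histories end with the same words, the final states differ, because the single-bit difference in the second word is carried forward by the shift and feedback on every later cycle.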
If the data stored in the aforementioned MISRs can only be read by an instruction of the highest privilege (e.g., an instruction which can only be issued by a computer's operating system), then attacks on the MISRs' data are unlikely to be successful. The security of MISR data is further increased if the MISRs only sample buses which run wholly within an integrated circuit package (i.e., buses which do not present themselves at an integrated circuit's output pins, contacts, etc.).
Due to the high degree of randomness and secure storage of the MISRs' data, encryption keys generated therefrom are cryptographically strong.
It is also significant to note that the data stored in the MISRs is essentially updated at the clock speed of the buses to which they are attached, and by the time a computer has booted up, the data stored in the MISRs has already been updated in response to thousands of variables. Random MISR data is therefore available whenever it is needed, and is available much more quickly than random data generated in response to an as yet unexecuted disk access.
MISRs are especially useful for generating a random number within a device which receives little or no keyboard input (e.g., a server or a NAS device).
Another advantage of generating random numbers in response to MISR data is that the hardware needed to implement and service a MISR is relatively small and cheap. Furthermore, MISRs which sample the instruction, data and address buses of a microprocessor can also be used as components of BIST hardware. As a result, their use for random number generation is essentially provided at no additional cost, and their testability is ensured.