(1) Field of the Invention
The present invention relates to the generation of random numbers in a computer apparatus, suitable for example for use in secure cryptographic applications.
(2) Description of Related Art
Random numbers play an important role in secure cryptographic applications. At the heart of all cryptographic applications is the generation of a random number which is not known to, and cannot be guessed by, an adversary. The random number must be sufficiently random that an adversary has a low probability of breaching the cryptographic system in a reasonable time by a systematic trial and error technique. In general terms, the required properties of the generated random numbers are that they are uniformly distributed, unbiased, consistent, unpredictable, scalable and independent.
Current techniques for random number generation are summarised in the document Request for Comments (RFC) No. 4086 of the Internet Society (June 2005). The basis of a random number generator is an entropy source and a deskewing process.
The entropy source is a source of varying numbers, which may be single bits, derived from the measurement of a physical phenomenon having some degree of randomness. In general, the varying numbers output from any given entropy source may be to some extent biased or skewed (so the distribution is not uniform) and/or correlated (so successive numbers are dependent on each other). A large number of entropy sources have been proposed for use in random number generation in a computer apparatus. Some examples of such known entropy sources are:
    an analog sound or video input which digitises a real-world analog source and derives randomness from noise such as thermal noise;
    the disk access time of a hard disk drive which is subject to randomness in the rotational speed due to chaotic air turbulence;
    the timing of a free-running ring oscillator in which randomness is derived from noise in the components; and
    the timing and special value of an event external to the computer apparatus, such as the movement of a mouse, the interrupt timing of a mechanical input/output device, keystroke and similar user events.
In general terms, it is clear that hardware-based entropy sources are better than other entropy sources. However, the degree of randomness available from any entropy source is limited. As mentioned above, the random numbers derived directly from any entropy source have some degree of bias and/or correlation. As a result, the output of the entropy source may have a non-uniform probability distribution, or there may be some degree of correlation between successive outputs.
To deal with this, a deskewing process is performed on the output of the entropy source. The deskewing process reduces the bias and correlation in the output of the entropy source. Ideally, the deskewing process produces a uniform probability distribution with no correlation. A large number of deskewing processes are known, some examples being as follows:
    calculating the parity using an exclusive-OR function;
    applying a transition mapping such as a von Neumann mapping;
    performing a frequency transformation, for example a Fast Fourier Transform; and
    applying a reversible compression technique.
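By way of illustration only, the first two deskewing processes listed above might be sketched in Python as follows. The function names xor_parity and von_neumann are hypothetical and are not taken from any cited document. The mapping of the pair 10 to 1 and the pair 01 to 0 is one common convention for the von Neumann mapping and is assumed here.

```python
def xor_parity(bits):
    """Fold a group of input bits into a single output bit using exclusive-OR."""
    out = 0
    for b in bits:
        out ^= b & 1
    return out


def von_neumann(bits):
    """Apply a von Neumann transition mapping to non-overlapping pairs of bits.

    Assumed convention: the pair 10 yields 1, the pair 01 yields 0,
    and the pairs 00 and 11 are discarded. A trailing unpaired bit is ignored.
    """
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            # With the assumed convention the output equals the first bit of the pair.
            out.append(first)
    return out
```

Both processes trade throughput for uniformity. The exclusive-OR produces one output bit per group of inputs, and the von Neumann mapping discards every pair of equal bits, so a heavily biased source yields few output bits. The von Neumann mapping removes bias only if successive input bits are independent.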
In general terms, the requirements of the random number generator are that it provides random numbers with a degree of randomness sufficiently high for the application concerned using an entropy source which is readily available in a computer apparatus. In general, any degree of randomness may be achieved from any given entropy source by taking a large enough number of measurements and applying a deskewing process to those measurements. For example, consider an entropy source which has a probability of 0.99 of outputting a value 1 and a probability of 0.01 of outputting a value 0. Merely using the simple deskewing process of performing an exclusive-OR operation, it is possible to obtain an output random bit which is within 0.1% of an equal probability of having a value 0 or 1 by applying the deskewing process to just over 300 measurements.
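The figure in the above example can be checked with a short calculation, sketched here in Python under two assumptions that are not stated in the text: the measurements are independent, and "within 0.1%" means that the probability of a 1 lies within 0.001 of 0.5. For n independent bits each having probability p of being 1, the probability that their exclusive-OR is 1 is 0.5 - 0.5(1 - 2p)^n. The function names are hypothetical.

```python
def parity_prob_one(p_one, n):
    """Probability that the XOR of n independent bits is 1,
    where each bit is 1 with probability p_one."""
    return 0.5 - 0.5 * (1.0 - 2.0 * p_one) ** n


def measurements_needed(p_one, tolerance):
    """Smallest n for which the XOR output is within `tolerance` of probability 0.5.

    Assumes independent measurements; a source that is constant
    (p_one equal to 0 or 1) can never be deskewed this way.
    """
    if not 0.0 < p_one < 1.0:
        raise ValueError("p_one must lie strictly between 0 and 1")
    n = 1
    while abs(parity_prob_one(p_one, n) - 0.5) > tolerance:
        n += 1
    return n
```

Under these assumptions, measurements_needed(0.99, 0.001) evaluates to 308, consistent with the figure of just over 300 measurements given above.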
However, in practice there is a limitation that the random numbers should be generated in a time which is reasonable for the application concerned. Thus, the speed of generation of the random numbers is dependent not only on the nature of the entropy source and the rate at which measurements can be made, but also on the degree of randomness in the output of the entropy source. This is because the randomness affects the number of measurements needed to be combined in the deskewing process to produce an output random number with a sufficiently uniform distribution for the application concerned.
As previously mentioned, there are many choices of entropy source, but the present invention is concerned with use of the disk access time of a hard disk drive. Randomness in the disk access time arises as follows.
Reference is made to FIG. 1 which shows a typical hard disk drive 1 comprising a rotatable platter 2 (or in general any number of platters) readable by a magnetic head 3 supported on the end of an actuator arm 4. During operation, the head 3 accesses different sectors on the platter 2 by means of (a) rotation of the platter 2 as shown by the arrow A so that the head 3 reads successive sectors of an annular track and (b) movement of the actuator arm 4 causing movement of the head 3 along an arc (roughly radially) inwardly and outwardly from the centre of the platter 2 as shown by arrow B. The movement of the actuator arm 4 to move the head 3 to different tracks is known as seeking and the time taken for the head 3 to move from one track to another is called the seek time.
The additional time needed for the platter 2 to rotate until the head 3 is at the sector with the desired address is known as the rotational delay or rotational latency. Due to air turbulence in the gap between the head 3 and the platter 2, the speed of rotation of the platter 2 is subject to a degree of randomness. That is to say, the air turbulence is chaotic and causes the rotational speed to vary. Thus disk accesses involving the same movement of the platter 2 and head 3 take different times. In principle, this makes the disk access time a good entropy source.
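To give a sense of scale, the rotational latency can be related to the nominal speed of rotation by simple arithmetic, sketched below in Python. The function names are hypothetical, the speed of 7200 revolutions per minute used in the note after the code is an illustrative assumption rather than a property of the hard disk drive 1, and the average assumes the desired sector is equally likely to be at any angular position when the seek completes.

```python
def rotation_period_ms(rpm):
    """Time for one full revolution of the platter, in milliseconds."""
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm):
    """Average rotational latency, in milliseconds.

    Assumes the desired sector is uniformly distributed around the track,
    so on average the platter turns through half a revolution.
    """
    return rotation_period_ms(rpm) / 2.0
```

For an assumed speed of 7200 revolutions per minute, one revolution takes about 8.33 ms and the average rotational latency is about 4.17 ms. The random variation caused by air turbulence is a small perturbation on top of these nominal values.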
However, use of the disk access time of the hard disk drive has the practical limitation of being relatively slow. In general terms, it would be desirable to improve the speed of generation of random numbers using the disk access time of a hard disk drive as an entropy source.
Another problem concerns the choice of address for the disk access. As the degree of randomness arises from the rotation of the platter 2, disk accesses which maximise the rotational latency should be chosen. However, in practice it is difficult to determine disk accesses which achieve this. The layout of sectors on the platter 2 and the speed of rotation of the platter 2 are not standardised and vary between different types of hard disk drive 1. Thus disk accesses appropriate on one type of hard disk drive 1 will be inappropriate on another type of hard disk drive 1.
In general, a random number generator will not have prior knowledge of the nature of the hard disk drive 1 in any given computer apparatus. This makes it difficult to select an appropriate address for the disk access. If an inappropriate disk access is chosen, this may reduce the amount of rotation and hence the degree of randomness present in the time measurement. Ultimately this impacts on the overall speed of generation of the random numbers as it increases the number of disk access time measurements needed to derive random numbers with a sufficiently uniform probability distribution.
Another problem associated with the use of the disk access time of a hard disk drive is the fact that hard disk drives commonly employ cache memories incorporated in the hard disk drive 1 to cache data stored on the platter 2. On receiving a command to read data, if the data is available in the cache memory then the hard disk drive 1 retrieves the data from the cache memory instead of performing a physical access to the platter 2. This is known as a cache seek and reduces the disk access time. This is highly advantageous in speeding up operation of the hard disk drive 1 during normal operation.
On the other hand, a cache memory creates a problem when the disk access time of the hard disk drive is used as an entropy source for random number generation, because a cache seek is not a good entropy source as the time taken is not sufficiently random. In principle, this problem can be tackled by rejecting disk access time measurements which are sufficiently short to be indicative of a retrieval of data from the cache rather than from the platter 2 and/or by initially filling the cache memory (e.g. by making preliminary accesses before measuring the disk access time). However, both of these techniques effectively slow the rate at which useful disk access time measurements can be taken and hence reduce the speed of random number generation. It would be desirable to limit this reduction in the speed of random number generation arising from the presence of a cache in a hard disk drive.
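The first of these techniques, rejecting short measurements, might be sketched in Python as follows. This is a minimal illustration only: the function names, the use of microseconds and the threshold value are all assumptions, and a suitable threshold would depend on the particular hard disk drive.

```python
def reject_cache_hits(times_us, threshold_us):
    """Keep only access times long enough to suggest a physical platter access.

    Measurements shorter than threshold_us are assumed to be cache seeks.
    """
    return [t for t in times_us if t >= threshold_us]


def useful_fraction(times_us, threshold_us):
    """Fraction of measurements that survive rejection (0.0 if none were taken)."""
    if not times_us:
        return 0.0
    return len(reject_cache_hits(times_us, threshold_us)) / len(times_us)
```

With made-up access times of 120, 8400, 95, 11200 and 9100 microseconds and an assumed threshold of 1000 microseconds, three of the five measurements are retained, which illustrates how rejection lowers the rate of useful measurements.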