The evolution of electronic commerce and other network computing paradigms, such as the Internet, has increased the need for self-contained software entities. Such entities include intelligent agents, content containers, and network crawlers, which are frequently required to authenticate themselves to other software entities, such as firewalls. This authentication must be performed without human intervention.
Authentication can be based on one or more of the following: (1) something you have; (2) something you are; or (3) something you know. Only the third basis is available to independent software entities. Inherent in authentication based on something you know is that the something you know is a secret. However, maintaining a secret in software is a known problem.
One attack on secrets stored in independent software entities is disclosed in "Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems", by Paul C. Kocher, published in "Advances in Cryptology", CRYPTO '96, p. 104, Lecture Notes in Computer Science #1109, which describes an autocorrelation timing attack using a simple series of computations that may be tailored to work with any cipher implementation that has sufficient timing variance. The attack is effective because most cryptographic operations contain code segments that execute differently depending on the value of the segment of the cryptographic key used in an operation. For example, public key computations involve long series of modular exponentiations. The standard methods of doing the modular exponentiations utilize power trees that have different operations depending on whether the current bit is a 1 or a 0. It may appear that only a small amount of information, such as the Hamming weight of the cryptographic key, would be ascertainable, or that noise in the system would overwhelm the information sought. However, Kocher has shown that neither of these assumptions is accurate.
The autocorrelation timing attack may be explained by showing how timing information can be used to derive the individual secret component, $X$, of a Diffie-Hellman key exchange. The Diffie-Hellman key exchange is used to allow two parties, Alice and Bob, to mutually agree upon a secret, $Z$. The two participating parties agree on $g$ and $p$, both of which may be known to an adversary. Alice generates a personal secret $X_A$ and computes
$$Y_A = g^{X_A} \bmod p. \qquad \text{Equation 1}$$
Bob generates a personal secret $X_B$ and computes
$$Y_B = g^{X_B} \bmod p. \qquad \text{Equation 2}$$
Alice and Bob then publicly exchange $Y_A$ and $Y_B$. Thus, an adversary may know $g$, $p$, $Y_A$ and $Y_B$. Alice then computes
$$Z = Y_B^{X_A} \bmod p \qquad \text{Equation 3}$$
and Bob computes
$$Z = Y_A^{X_B} \bmod p. \qquad \text{Equation 4}$$
The two values of $Z$ are equal, because both are $g^{X_A X_B} \bmod p$, and are thus a shared secret.
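The exchange in Equations 1 through 4 can be illustrated with a minimal Python sketch. The function name `diffie_hellman_demo` and the toy parameter values (p = 23, g = 5, secrets 6 and 15) are assumptions chosen for readability; they do not come from the text and are far too small for real use.

```python
def diffie_hellman_demo(p, g, x_a, x_b):
    """Toy Diffie-Hellman exchange (hypothetical helper, illustration only).

    p, g : public modulus and base, which an adversary may know
    x_a  : Alice's personal secret X_A
    x_b  : Bob's personal secret X_B
    Returns (Y_A, Y_B, Z as computed by Alice, Z as computed by Bob).
    """
    y_a = pow(g, x_a, p)        # Equation 1: Y_A = g^X_A mod p
    y_b = pow(g, x_b, p)        # Equation 2: Y_B = g^X_B mod p
    # Y_A and Y_B are exchanged publicly at this point.
    z_alice = pow(y_b, x_a, p)  # Equation 3: Z = Y_B^X_A mod p
    z_bob = pow(y_a, x_b, p)    # Equation 4: Z = Y_A^X_B mod p
    return y_a, y_b, z_alice, z_bob


if __name__ == "__main__":
    y_a, y_b, z_alice, z_bob = diffie_hellman_demo(p=23, g=5, x_a=6, x_b=15)
    print(y_a, y_b, z_alice, z_bob)
```

Both parties arrive at the same Z without either secret exponent ever being transmitted, which is why the attack described next targets the time taken by the exponentiation rather than the exchanged values.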
An autocorrelation timing attack can allow an adversary to derive $X_B$ if Bob uses the same value of $X_B$ in very many exchanges. The attacker first observes k exchanges, measuring the time, t, taken by Bob to compute Equation 2. The exponentiation $Z = g^X \bmod p$ can be computed using the following algorithm: ##EQU1## Thus, an adversary who knows bits $b_0 \ldots b_{n-1}$ of $X$ can compute $b_n$. To compute all of the bits, the attacker starts with bit $b_0$ and monitors the amount of time required to evaluate Equation 2.
For a few $R_b$ and $g$ values, the calculation of $R_{b+1}$ will be slow, and the attacker can tell which calculations are slow by monitoring the length of time required to execute a modular exponentiation of $R_{b+1}$. If the slow calculations correlate with a longer total processing time, the bit is set. If there is no relationship between the time required to compute $R_{b+1}$ and the total processing time, the bit is clear.
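The key-dependent behavior that the attack exploits can be seen in a generic square-and-multiply exponentiation. The sketch below is an assumption for illustration, not necessarily the algorithm referenced above: it scans the exponent from the most significant bit, squares on every bit, and performs an extra multiplication only when the bit is set. The function name `modexp_with_trace` and the per-bit operation count are hypothetical stand-ins for real timing measurements.

```python
def modexp_with_trace(g, x, p):
    """Left-to-right square-and-multiply (illustrative sketch).

    Returns (g**x mod p, trace), where trace[i] is the number of modular
    multiplications performed for the i-th exponent bit, most significant
    bit first. The count is a crude proxy for execution time.
    """
    result = 1
    trace = []
    for bit in bin(x)[2:]:              # exponent bits, most significant first
        result = (result * result) % p  # squaring happens for every bit
        ops = 1
        if bit == "1":
            result = (result * g) % p   # extra multiply only when the bit is set
            ops += 1
        trace.append(ops)
    return result, trace


if __name__ == "__main__":
    for secret in (8, 15):
        value, trace = modexp_with_trace(5, secret, 23)
        print(secret, value, trace, sum(trace))
```

In this simplified model the total operation count reveals only the Hamming weight of the exponent. The attack described in the text goes further by correlating the time of an individual multiplication, which varies with the operands, against the total measured time.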
Typically, the timing differences between a clear bit and a set bit are not extreme, but they vary enough to allow the attack to work. For each g, the attacker can estimate d, a measure of the time expected to complete $(R_b)(g_i) \bmod p$. The attacker also knows the total time t to compute $g^X \bmod p$. Because the first b exponent bits are known, the attacker knows $R_b$ and can measure c, the amount of time required for the first b iterations of the exponentiation loop.
Given g, t, c and d for one timing measurement, the probability, P, that a bit b is set is: ##EQU2## where ##EQU3## is the standard normal function, ##EQU4## is the mean of x, and ##EQU5## is the standard deviation of x. Thus, the overall probability that a bit is set is: ##EQU6##
For the attack to work, the probabilities do not need to be very large, because incorrect bit guesses destroy future correlations. After a mistake, no new significant correlations are detected. Thus, the attacker can backtrack through the most recent bits and modify them. It is important to note that this timing attack works in the presence of external system timing noise because it is based on autocorrelation of small code segments. As computers increase in design complexity, the effects of debugging and program monitoring facilities greatly increase the execution time and, more importantly, variance. Thus, an autocorrelation timing attack may run in the background on a personal or similar computer to obtain the user's secret key.
The difference between the attacker's computer and the victim's computer is also a source of variance not related to the key. To overcome this noise, as well as the noise of the normal system interruptions, the attacker simply needs more samples. The attacker's computer may even be the same computer system as the victim's computer, in which case timing differences may arise from the attacker's monitoring or tracing of the victim's code. The attacker may even own the victim computer hardware, but not the software that is run on the hardware, such as in video-on-demand systems.
While the number of samples is directly proportional to the number of bits in the key, the number of samples must also increase in proportion to the square of the timing noise. For example, if N samples are needed to derive the key in computations having a "signal" variance, $V_s$, and a "noise" variance, $V_n$, the expected number of samples, $N'$, needed to derive the key is:
$$N' = N\left(\frac{V_s + V_n}{V_s}\right)^2 \qquad \text{Equation 10}$$
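As a quick numerical check of Equation 10, the following sketch evaluates the expected sample count for a given noise level. The function name `samples_needed` and the example values are assumptions for illustration only.

```python
def samples_needed(n_baseline, v_signal, v_noise):
    """Evaluate Equation 10: N' = N * ((V_s + V_n) / V_s) ** 2.

    n_baseline : N, the samples needed without the added noise
    v_signal   : V_s, the "signal" variance
    v_noise    : V_n, the "noise" variance
    """
    ratio = (v_signal + v_noise) / v_signal
    return n_baseline * ratio ** 2


if __name__ == "__main__":
    # Hypothetical values: noise variance equal to the signal variance.
    print(samples_needed(1000, 1.0, 1.0))
```

With the noise variance equal to the signal variance, the ratio is 2 and the attacker needs four times as many samples. With no noise, the count stays at N.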
Thus, increased noise introduces more variance, which is overcome by increasing the number of samples required by the attacker. Therefore, it would be desirable to provide a method and apparatus for storing statistics related to a series of computations using a cryptographic key rather than storing the cryptographic key.