Applications relying on cryptography generally require the storage of a key in an apparatus, for example an electronic apparatus. However, although it is recommended that cryptographic keys of increasingly large size be employed to guarantee the security of information systems, certain apparatuses have only a small memory space for recording a key. By way of illustration, a cryptographic key of at least 2048 bits is commonly recommended for the public-key RSA ("Rivest Shamir Adleman") algorithm, widely used today in electronic commerce. Moreover, when the key to be used is secret, its storage can constitute a security flaw if an attacker gains access to the apparatus.
Hence, to avoid storing a cryptographic key in an electronic apparatus, it has been proposed that a key be generated with the aid of biometric data such as, for example, a fingerprint. More recently, it has been proposed in the article "Efficient Helper Data Key Extractor on FPGAs" by C. Bösch, J. Guajardo, A.-R. Sadeghi, J. Shokrollahi and P. Tuyls, CHES 2008, LNCS 5154, pp. 184-197, that a cryptographic key be generated on the basis of a function tied to the physical characteristics specific to an electronic component, this type of function commonly being designated by the acronym PUF, for "Physical Unclonable Function", signifying a physically non-reproducible function. However, whether the data arise from biometric measurements or from a PUF of an electronic component, these data are noisy. Indeed, for several executions of one and the same processing systematically taking the same reference datum as input, the physical measurements output by this processing are not strictly identical, although closely related. A practical illustration of this principle is a fingerprint sensor which, for a given finger, does not systematically output the same characterization of this finger, on account of variations in the positioning of the finger on the sensor, in its moisture, in its temperature and in other uncontrolled physical parameters. Consequently, these physical measurements cannot be used directly as a cryptographic key, since information encrypted with a first measurement datum arising from a first occurrence of a processing of the reference datum could not be decrypted with a second measurement datum arising from a second occurrence of this same processing.
Moreover, the measured data are not, a priori, uniformly distributed. Stated otherwise, even in the absence of noise, the measurement could not by itself constitute a secure cryptographic key.
These two problems, namely the non-uniform character of the measured data and the presence of noise (and therefore the instability of these data), have been dealt with by Dodis et al. in the article entitled "Fuzzy Extractors: How to Generate Strong Keys from Biometrics and Other Noisy Data", presented at EUROCRYPT 2004, LNCS, vol. 3027, pp. 523-540, Springer (latest version in SIAM J. Comput., vol. 38, issue 1, pp. 97-139, 2008). To render the data stable despite the presence of noise, the authors of this article propose the use of a fuzzy extractor. One possibility for constructing a fuzzy extractor is the use of two primitives. A first module, designated by the expression "secure sketch", allows the conciliation of the information, that is to say the restoration of a systematically identical output value for one and the same input datum, and a second module renders the output of the fuzzy extractor uniform by applying a randomness extraction function to the said output, previously stabilized. The fuzzy extractor, and likewise the "secure sketch" module, operates in two phases: enrolment and correction. The enrolment phase is executed only once; on the basis of a reference datum, denoted w, arising from a measurement of a confidential datum denoted W provided as input, it produces a public datum, denoted s and sometimes dubbed a "sketch". Conventionally, the reference datum w may be obtained via a first measurement arising from a processing of the confidential datum W received by a sensor or an electronic component. By way of illustration, the confidential datum W is a fingerprint and the reference datum w is the characteristic datum obtained by a first measurement of this print by a sensor. Only the public datum s is recorded, the reference datum w remaining confidential. Subsequently, the correction phase is executed each time one wishes to retrieve the reference datum w.
To this end, a noisy datum w′ originating from a measurement arising from a processing of the confidential datum W (for example, a second measurement of the same fingerprint) is combined with the public datum s. The public datum s therefore serves to reconstruct the confidential reference measurement w on the basis of a noisy datum w′. If the noisy datum w′ is too far from the reference datum w, the reference datum w cannot be reconstructed. Such is the case, for example, when the noisy datum w′ is obtained by a measurement of a processing of a datum X different from the confidential datum W.
More precisely, the reconstruction of the reference datum w involves the use of an error correcting code. During the enrolment phase, a correcting code is selected, and then a word of this code, denoted c, is chosen randomly. The reference datum w is thereafter combined with the word c to produce the public datum s. Later, during the correction phase, this public datum s is combined with the noisy datum w′ to produce a word denoted c′, which does not necessarily belong to the selected code. The word c′ is subjected to the decoding function of the correcting code, which restores the code word c on condition that the datum w′ is sufficiently close to the reference datum w. To reconstruct the reference datum w, it thereafter suffices to combine the code word c thus retrieved with the public datum s. However, this solution has limits, as explained later.
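The code-offset mechanism just described can be sketched in a few lines of Python. This is only an illustration, not the construction of any particular system: the combination operator is assumed to be the bitwise XOR, and a deliberately simple 3x repetition code stands in for the correcting code (a practical system would use a stronger code, such as a BCH code). All function and variable names are illustrative.

```python
# Illustrative sketch of the code-offset secure sketch: s = w XOR c at
# enrolment; c' = s XOR w', decode, then w = s XOR c at correction.

def encode(bits):
    """Map a k-bit message to an n-bit code word (3x repetition)."""
    return [b for b in bits for _ in range(3)]

def decode(word):
    """Majority-vote decoding; corrects up to 1 error per 3-bit block."""
    return [1 if sum(word[i:i + 3]) >= 2 else 0 for i in range(0, len(word), 3)]

def enrol(w, c):
    """Enrolment: combine the reference datum w with a random code word c."""
    return [wi ^ ci for wi, ci in zip(w, c)]            # public datum s

def correct(s, w_noisy):
    """Correction: recover w from the public datum s and a noisy reading w'."""
    c_noisy = [si ^ wi for si, wi in zip(s, w_noisy)]   # c' = s XOR w'
    c = encode(decode(c_noisy))                         # decode back to a code word
    return [si ^ ci for si, ci in zip(s, c)]            # w = s XOR c

w = [1, 0, 1, 1, 0, 0]         # reference datum (first measurement)
c = encode([1, 0])             # random code word, k = 2, n = 6
s = enrol(w, c)                # only s is stored

w_noisy = [1, 0, 1, 1, 1, 0]   # second measurement, one bit flipped
assert correct(s, w_noisy) == w
```

Note that a reading with two errors in the same 3-bit block exceeds the correction capacity of the repetition code, so the reference datum is not recovered, illustrating the behaviour described above for a datum w′ too far from w.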
Recall that an error correcting code may be characterized by three parameters, namely the length "n" of the code, the dimension "k" of the code, and the minimum distance "d" between two words of the code, which distance determines the correction capacity t of the code, that is to say the number of errors that the code is capable of correcting in a word. The noise level undergone by the datum w′ therefore imposes the value of the minimum distance "d", the distance used generally being the Hamming distance (although for fingerprints the Hamming distance cannot be used).
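The relation between the minimum distance d and the correction capacity t can be made concrete with the Hamming distance; for a code with minimum distance d, up to t = (d − 1) // 2 errors can be corrected. A minimal check in Python, using the 3x repetition code (n = 3, k = 1, d = 3) as an example:

```python
# Hamming distance between two equal-length words, and the correction
# capacity t implied by the minimum distance d of a code.

def hamming(a, b):
    """Number of positions at which the two words differ."""
    return sum(x != y for x, y in zip(a, b))

d = 3                    # minimum distance of the 3x repetition code
t = (d - 1) // 2         # correction capacity: 1 error per code word
assert t == 1
assert hamming([1, 1, 1], [1, 0, 1]) == 1   # within capacity: correctable
assert hamming([1, 1, 1], [0, 0, 1]) == 2   # beyond capacity: not correctable
```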
The security level of the system increases proportionately to b^k, b being the number base in which the words of the datum are expressed. Indeed, to discover the input datum w by brute force, an attacker would have to find, from among the b^k possible code words of length n, the one corresponding to the word of the code chosen to code the reference datum w. Hence, it is preferable to select a correcting code whose parameter k is as high as possible.
In parallel, the length n of the code chosen to code the datum is bounded by the length of the reference datum w. Starting from the known upper bound relation n − k ≥ d − 1 (the Singleton bound), it follows that the dimension k of the code is bounded above by n − d + 1, and therefore by length(w) − d + 1.
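The arithmetic of this constraint is straightforward; the sketch below, with purely illustrative parameter values, shows how a required minimum distance d caps the attainable dimension k once n is fixed by the length of w:

```python
# Upper bound on the dimension k of an (n, k, d) code, from the
# Singleton bound n - k >= d - 1, i.e. k <= n - d + 1.

def max_dimension(n, d):
    """Largest k permitted by the Singleton bound for given n and d."""
    return n - d + 1

len_w = 255   # illustrative length of the reference datum w (fixes n)
d = 61        # illustrative minimum distance imposed by the noise level
print(max_dimension(len_w, d))   # prints 195: the ceiling on k
```

The higher the noise level (and hence d), the lower this ceiling on k, which is precisely the conflict described above between correction capability and security level.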
Having regard to these two conflicting constraints on the parameter k, it becomes difficult, or indeed impossible, to find a correcting code which allows the input datum to be reconstructed on the basis of a noisy datum while guaranteeing a good level of protection of this input datum, a fortiori when the value of d must be high on account of a high noise level. Moreover, even when a correcting code meets all these requirements, its execution may be impracticable on an electronic component because it is too complex.