Conventional Password Based Security Systems
Conventional password based security systems typically include two phases. Specifically, during an enrollment phase, users select passwords, which are stored on an authentication device, such as a server. To gain access to resources or data during an authentication phase, the users enter their passwords, which are verified against the stored versions of the passwords. If the passwords are stored as plain text, then an adversary who gains access to the system could obtain every password. Thus, even a single successful attack can compromise the security of the entire system.
As shown in FIG. 1, a conventional password based security system 100 stores 115 encrypted 110 passwords 101 in a password database 120 during an enrollment phase 10. Specifically, if X is the password 101 to be stored 115, the system 100 actually stores ƒ(X), where ƒ(.) is some encryption or hash function 110. During an authentication phase 20, a user enters a candidate password Y 102, the system determines 130 ƒ(Y), and grants access 150 to the system only when ƒ(Y) matches 140 the stored password ƒ(X). Otherwise, access is denied 160.
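The two phases of FIG. 1 can be sketched as follows. This is a minimal illustration only: SHA-256 stands in for ƒ(.), and the names `password_database`, `enroll`, and `authenticate` are hypothetical. Practical systems normally use a salted, deliberately slow password hash instead of a bare digest.

```python
import hashlib
import hmac

# Hypothetical stand-in for the password database 120: user -> f(X).
password_database = {}

def f(password: str) -> str:
    """One-way function f(.); SHA-256 is an assumption made for illustration."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def enroll(user: str, x: str) -> None:
    """Enrollment phase: store f(X), never X itself."""
    password_database[user] = f(x)

def authenticate(user: str, y: str) -> bool:
    """Authentication phase: grant access only when f(Y) matches the stored f(X)."""
    stored = password_database.get(user)
    return stored is not None and hmac.compare_digest(stored, f(y))

enroll("alice", "s3cret")
print(authenticate("alice", "s3cret"))   # matching candidate password
print(authenticate("alice", "s3cret!"))  # non-matching candidate password
```

The database holds only the digests, so reading it does not directly reveal any password.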
As an advantage, encrypted passwords are useless to an adversary who cannot invert the encryption function, and such functions are usually very difficult to invert.
Conventional Biometric Based Security Systems
A biometric security system measures physical biometric features to obtain biometric parameters, sometimes called observations. A conventional biometric security system has the same vulnerability as a password based system that stores unencrypted passwords. Specifically, if the database stores unencrypted biometric parameters, then the parameters are subject to attack and misuse.
For example, in a security system using face recognition or voice recognition, an adversary could search for biometric parameters similar to those of the adversary. After suitable biometric parameters are located, the adversary could modify the parameters to match the appearance or voice of the adversary to gain unauthorized access. Similarly, in a security system using fingerprint or iris recognition, the adversary could construct a device that imitates a matching fingerprint or iris, e.g., a fake finger or fake eye, to gain unauthorized access.
It is not always possible to encrypt biometric parameters, due not only to the possible variability of the underlying biometric features, but also to variability in the way the features are measured. This difference can be termed “noise.”
Specifically, biometric parameters X are entered during the enrollment phase. Suppose that the parameters X are encrypted using an encryption or hashing function ƒ(X), and stored. During the authentication phase, the biometric parameters obtained from the same user can be different. For example, in a security system using face recognition, the cameras used for enrollment and authentication can have different orientations, sensitivities, and resolutions. The lighting is usually quite different. Skin tone, hairstyle and other facial features are easy to change. Thus, during authentication, if the newly observed parameters Y are passed through the same encryption function ƒ, the result ƒ(Y) will not match ƒ(X), causing rejection. Similar problems exist with other biometrically based user authentication, such as iris and fingerprint patterns.
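The mismatch can be illustrated with a short sketch. The bit vectors below are hypothetical biometric parameters, and SHA-256 is assumed as ƒ only for illustration. A single differing bit between enrollment and authentication is enough to make the stored and recomputed values disagree.

```python
import hashlib

def f(bits):
    """Hash a list of 0/1 values; SHA-256 is an illustrative assumption."""
    return hashlib.sha256(bytes(bits)).hexdigest()

# Hypothetical enrolled parameters X.
x = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1]

# Parameters Y measured later from the same user: one bit of "noise".
y = list(x)
y[5] ^= 1

hamming = sum(a != b for a, b in zip(x, y))
print(hamming)       # number of differing bits
print(f(x) == f(y))  # whether the hashed values still match
```

Here the two measurements differ in one position out of sixteen, yet ƒ(Y) does not equal ƒ(X), so an exact-match comparison rejects the legitimate user.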
Error Correcting Codes
An (N, K) error correcting code (ECC) C, over an alphabet Q, includes Q^K vectors of length N. A linear (N, K) ECC can be described either by using a generator matrix G, with K rows and N columns, or by using a parity check matrix H, with N-K rows and N columns. The name ‘generator matrix’ is based on the fact that a codeword, expressed as a vector w, can be generated from any length K input row vector v by right multiplying the vector v by the matrix G according to w=vG. Similarly, to check whether the vector w is a codeword, one can check whether Hw^T=0, where the column vector w^T is the transpose of the row vector w.
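As a concrete illustration, the sketch below uses a systematic binary (7, 4) Hamming code. The particular matrices G and H and the helper names are choices made for this example, not taken from the text. All arithmetic is over GF(2), i.e., modulo 2.

```python
# Generator matrix G (K=4 rows, N=7 columns), systematic form [I | P].
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

# Parity check matrix H (N-K=3 rows, N=7 columns), form [P^T | I].
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def encode(v):
    """Compute the codeword w = vG over GF(2) for a length-K row vector v."""
    return [sum(v[k] * G[k][n] for k in range(len(G))) % 2
            for n in range(len(G[0]))]

def parity_check(w):
    """Compute H w^T over GF(2); the all-zero result means w is a codeword."""
    return [sum(h * b for h, b in zip(row, w)) % 2 for row in H]

def is_codeword(w):
    return not any(parity_check(w))

w = encode([1, 0, 1, 1])
print(w)               # [1, 0, 1, 1, 0, 1, 0]
print(is_codeword(w))  # True

corrupted = list(w)
corrupted[2] ^= 1
print(is_codeword(corrupted))  # False
```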
In the standard use of error correcting codes, an input vector v is encoded into the vector w, and either stored or transmitted. If a corrupted version of the vector w is received, a decoder uses redundancy in the code to correct for errors. Intuitively, the error correcting capability of the code depends on the amount of redundancy in the code.
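A minimal sketch of such a decoder follows, assuming a binary (7, 4) Hamming code whose parity check matrix H is chosen for this example. Its three redundant bits allow any single bit error to be located, because the result of the parity check equals the column of H at the corrupted position.

```python
# Parity check matrix of a (7, 4) Hamming code (an illustrative choice).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
COLUMNS = [[row[n] for row in H] for n in range(len(H[0]))]

def parity_check(r):
    """Compute H r^T over GF(2)."""
    return [sum(h * b for h, b in zip(row, r)) % 2 for row in H]

def correct(r):
    """Correct at most one flipped bit in the received vector r."""
    s = parity_check(r)
    fixed = list(r)
    if any(s):
        # A nonzero result equals the column of H at the error position.
        fixed[COLUMNS.index(s)] ^= 1
    return fixed

w = [1, 0, 1, 1, 0, 1, 0]   # a codeword of this code
received = list(w)
received[2] ^= 1            # one bit corrupted in storage or transmission
print(correct(received) == w)  # True
```

With two or more flipped bits this decoder returns a wrong codeword, which reflects the limited redundancy of the code.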
Slepian-Wolf, Wyner-Ziv, and Syndrome Codes
In some sense, a Slepian-Wolf (SW) code is the opposite of an error correcting code. While an error correcting code adds redundancy and expands the data, the SW code removes redundancy and compresses the data. Specifically, vectors x and y represent the correlated data. If an encoder desires to communicate the vector x to a decoder that already has the vector y, then the encoder can compress the data to take into account the fact that the decoder has the vector y.
For an extreme example, if the vectors x and y are different by only one bit, then the encoder can achieve compression by describing only the position at which the vector x differs from the vector y, instead of describing the entire vector x. Of course, more sophisticated codes are required for more realistic correlation models.
The basic theory of SW coding, as well as of the related Wyner-Ziv (WZ) coding, is described by Slepian and Wolf in “Noiseless coding of correlated information sources,” IEEE Transactions on Information Theory, Vol. 19, pp. 471-480, July 1973, and by Wyner and Ziv in “The rate-distortion function for source coding with side information at the decoder,” IEEE Transactions on Information Theory, Vol. 22, pp. 1-10, January 1976. More recently, Pradhan and Ramchandran described a practical implementation of such codes in “Distributed Source Coding Using Syndromes (DISCUS): Design and Construction,” IEEE Transactions on Information Theory, Vol. 49, pp. 626-643, March 2003.
Essentially, the syndrome codes work by using a parity check matrix H with N-K rows and N columns. To compress a binary vector x of length N to a syndrome vector of length N-K, determine S=Hx, where x is treated as a column vector. Decoding often depends on details of the particular syndrome code used. For example, if the syndrome code is trellis based, then various dynamic programming based search algorithms, such as the well known Viterbi algorithm, can be used to find the most likely source sequence x corresponding to the syndrome vector S and a sequence of side information, as described by Pradhan et al.
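A toy instance of syndrome coding can be sketched as follows. It assumes the parity check matrix of a (7, 4) Hamming code and a correlation model in which the side information y differs from x in at most one bit; both are assumptions for illustration, and the brute-force column lookup merely stands in for the trellis or belief propagation decoders mentioned in the text. The encoder sends only 3 syndrome bits for a 7-bit vector and never needs to see y.

```python
# Parity check matrix with N-K=3 rows and N=7 columns (illustrative choice).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
COLUMNS = [[row[n] for row in H] for n in range(len(H[0]))]

def syndrome(x):
    """Compress x to S = Hx over GF(2)."""
    return [sum(h * b for h, b in zip(row, x)) % 2 for row in H]

def decode(s, y):
    """Recover x from its syndrome s and side information y,
    assuming x and y differ in at most one position."""
    # H(x XOR y) = Hx XOR Hy, which is zero or a single column of H.
    diff = [a ^ b for a, b in zip(s, syndrome(y))]
    x_hat = list(y)
    if any(diff):
        x_hat[COLUMNS.index(diff)] ^= 1
    return x_hat

x = [0, 1, 1, 0, 1, 0, 1]   # source vector (need not be a codeword)
y = list(x)
y[4] ^= 1                   # correlated side information at the decoder

s = syndrome(x)
print(len(x), len(s))       # 7 3
print(decode(s, y) == x)    # True
```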
Alternatively, if low density parity check syndrome codes are used, then belief propagation decoding can be applied as described in “On some new approaches to practical Slepian-Wolf compression inspired by channel coding” by Coleman et al., in Proceedings of the Data Compression Conference, March, 2004, pages 282-291.
Factor Graphs
In the prior art, codes as described above are often represented by a bipartite graph that is called a “factor graph,” see F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor Graphs and the Sum-Product Algorithm,” IEEE Transactions on Information Theory, vol. 47, pp. 498-519, February 2001, G. D. Forney, Jr., “Codes on Graphs: Normal Realizations,” IEEE Transactions on Information Theory, vol. 47, pp. 520-549, February 2001, and R. M. Tanner, “A Recursive Approach to Low-Complexity Codes,” IEEE Transactions on Information Theory, vol. 27, pp. 533-547, September, 1981, all incorporated herein by reference.
Generally, a factor graph is a bipartite graph containing two types of nodes, called “variable nodes” and “factor nodes.” Variable nodes are only connected to factor nodes and vice-versa. Factor nodes are conventionally drawn using squares, variable nodes are conventionally drawn using circles, and connections between variable and factor nodes are denoted by lines connecting the corresponding circles and squares. Sometimes a symbol, e.g., ‘+’, is drawn inside a factor node to represent the kind of constraint that it enforces.
The variable nodes represent the symbols that are used in the code, and the factor nodes represent the constraints on the symbols. A variable node is only connected to a factor node if it is subject to the corresponding constraint.
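For a linear code, one common way to obtain such a graph is to read it off the parity check matrix: each column becomes a variable node, each row becomes a ‘+’ factor node, and a one in the matrix becomes an edge. The sketch below assumes the parity check matrix of a (7, 4) Hamming code, and the node labels `x0`, `f0`, etc., are hypothetical names chosen for this example.

```python
# Parity check matrix of a (7, 4) Hamming code (illustrative choice).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def factor_graph(parity_check_matrix):
    """Return the bipartite adjacency lists (factor -> variables, variable -> factors)."""
    num_vars = len(parity_check_matrix[0])
    factor_to_vars = {
        f"f{j}": [f"x{n}" for n, h in enumerate(row) if h]
        for j, row in enumerate(parity_check_matrix)
    }
    var_to_factors = {f"x{n}": [] for n in range(num_vars)}
    for factor, variables in factor_to_vars.items():
        for var in variables:
            var_to_factors[var].append(factor)
    return factor_to_vars, var_to_factors

factor_to_vars, var_to_factors = factor_graph(H)
print(factor_to_vars["f0"])  # ['x0', 'x1', 'x3', 'x4']
print(var_to_factors["x3"])  # ['f0', 'f1', 'f2']
print(var_to_factors["x6"])  # ['f2']
```

Every edge joins a variable node to a factor node, so the graph is bipartite by construction.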
Biometric Parameter Coding Prior Art
Prior art related to the current invention falls into three categories. First, there is a great deal of prior art describing feature extraction, recording, and use of biometric parameters unrelated to the secure storage of such biometric parameters. Because our invention is concerned with secure storage, and largely independent of the details of how the biometric parameters are acquired, details of this category of prior art are omitted.
The second class of prior art, which is relevant to the invention, includes the following systems designed for secure storage and authentication of biometrics: “Method and system for normalizing biometric variations to authenticate users from a public database and that ensures individual biometric data privacy,” U.S. Pat. No. 6,038,315; “On enabling secure applications through off-line biometric identification,” by Davida, G. I., Frankel, Y., Matt, B. J. in Proceedings of the IEEE Symposium on Security and Privacy, May 1998; “A Fuzzy Vault Scheme,” by Juels, A., Sudan, M., in Proceedings of the 2002 IEEE International Symposium on Information Theory, June 2002; U.S. patent application Ser. No. 09/994,476, “Order invariant fuzzy commitment system,” filed Nov. 26, 2001; Juels and Wattenberg, “A fuzzy commitment scheme,” in Proc. 5th ACM Conf. on Comp. and Commun. Security, New York, N.Y., pgs. 28-36, 1999; S. Yang and I. M. Verbauwhede, “Secure fuzzy vault based fingerprint verification system,” in Asilomar Conf. on Signals, Systems, and Comp., vol. 1, pp. 577-581, November 2004; and U. Uludag and A. Jain, “Fuzzy fingerprint vault,” in Proc. Workshop: Biometrics: Challenges arising from theory to practice, pp. 13-16, August 2004.
FIG. 2 shows some of the details of the basic method described in U.S. Pat. No. 6,038,315. In the enrollment phase 210, biometric parameters are acquired in the form of a sequence of bits denoted E 201. Next, a random codeword W 202 is selected from a binary error correcting code and additively combined with the parameters E using an exclusive OR (XOR) function 220 to produce a reference R 221. Optionally, the reference R can be further encoded 230. In any case, the reference R is stored in a password database 240.
In the authentication phase 220, biometric parameters E′ 205 are presented for authentication. The method determines 250 the XOR of R with E′ to essentially subtract the two to obtain Z=R−E′=W+E−E′ 251. This result is then decoded 260 with the error correcting code to produce W′ 261. In step 270, if W′ matches W, then access is granted 271, and otherwise, access is denied 272.
That method essentially measures the Hamming distance, i.e., the number of bits that are different, between the enrolled biometric E 201 and the authentication biometric E′ 205. If the difference is less than some predetermined threshold, then access is granted. Because the method stores only the reference R, and not the actual biometric parameters E, the method is secure.
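The enrollment and authentication steps of FIG. 2 can be sketched as follows. Several details are assumptions made for illustration: a 3-fold repetition code stands in for the binary error correcting code, the bit vectors are hypothetical, and a SHA-256 digest of W is stored so that W′ can be compared with W, since the description above does not say how W is made available for the comparison.

```python
import hashlib
import secrets

REP = 3  # toy repetition code: each message bit is repeated 3 times (assumption)

def ecc_encode(msg):
    return [b for b in msg for _ in range(REP)]

def ecc_decode(z):
    """Majority-vote each block of REP bits and return the nearest codeword."""
    msg = [int(2 * sum(z[i:i + REP]) > REP) for i in range(0, len(z), REP)]
    return ecc_encode(msg)

def xor(a, b):
    return [p ^ q for p, q in zip(a, b)]

def digest(bits):
    return hashlib.sha256(bytes(bits)).hexdigest()

def enroll(e):
    """Pick a random codeword W and return the reference R = W XOR E and a digest of W."""
    w = ecc_encode([secrets.randbelow(2) for _ in range(len(e) // REP)])
    return xor(w, e), digest(w)

def authenticate(r, w_digest, e_prime):
    """Compute Z = R XOR E', decode Z to W', and compare W' with W via the digest."""
    w_prime = ecc_decode(xor(r, e_prime))
    return digest(w_prime) == w_digest

e = [1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0]   # hypothetical enrolled biometric E
r, w_digest = enroll(e)

e_close = list(e)
e_close[4] ^= 1                             # one differing bit
e_far = list(e)
e_far[0] ^= 1
e_far[1] ^= 1                               # two differing bits in one block

print(authenticate(r, w_digest, e_close))   # True
print(authenticate(r, w_digest, e_far))     # False
```

With this toy code the acceptance rule is at most one differing bit per block of three, instead of a single global Hamming distance threshold. A stronger code changes the threshold but not the structure of the method.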
Davida et al. and Juels et al. describe variations of the method shown in FIG. 2. Specifically, both encode the biometric data with an error correcting code during the enrollment phase followed by an operation to secure the resulting codeword. Davida et al. hide the codeword by only sending the check bits, while Juels et al. add some amount of noise referred to as ‘chaff’.
U.S. Pat. No. 6,363,485, “Multi-factor biometric authenticating device and method,” describes a method for combining biometric data with an error correcting code and some secret information, such as a password or personal identification number (PIN), to generate a secret key. Error correcting codes, such as Goppa codes or BCH codes, are employed with various XOR operations.
In addition to fixed database access control systems illustrated in FIG. 2, a third class of prior art includes using biometrics for data protection, specifically data protection for mobile devices that include memory, such as laptops, PDAs, cellular telephones, and digital cameras. Because mobile devices are easily lost or stolen, it becomes necessary to protect data stored in mobile devices.
Problems with the Prior Art
FIG. 4 illustrates the problem with existing approaches for storing data D 401. In an encoding process 410, biometric parameters P 402 are obtained from a user and used as a key to encrypt 440 data D to produce the ciphertext C 441. Both P and C are saved in storage 450. When a user wishes to decrypt 420 the data, biometric parameters P′ 460 are obtained from the user and compared to the stored biometric P 402. If P′ matches P, 470, then the system allows access and uses P to decrypt the stored ciphertext C to produce the data D 401, otherwise the data are not decrypted 471.
Such a prior art system is only effective as long as the storage medium is not compromised. If an adversary can access such media, then the adversary obtains P and decrypts the data.
First, the bit-based prior art method provides dubious security. In addition, biometric parameters are often real-valued or integer-valued, instead of binary valued. The prior art assumes generally that biometric parameters are composed of uniformly distributed random bits, and that it is difficult to determine these bits exactly from the stored biometric. In practice, biometric parameters are often biased, which negatively affects security. Also, an attack can cause significant harm, even if the adversary recovers only an approximate version of the stored biometric. Prior art methods are not designed to prevent the adversary from estimating the actual biometric from the encoded version.
For example, U.S. Pat. No. 6,038,315 relies on the fact that the reference value R=W+E effectively encrypts the biometric E by adding the random codeword W. However, that method achieves poor security. There are a number of ways to recover E from R. For example, if the vector E has only a few bits equal to one, then the Hamming distance between R and W is small. Thus, an error correction decoder could easily recover W from R, and hence also recover E. Alternatively, if the distribution of codewords is poor, e.g., if the weight spectrum of the code is small and many codewords are clustered around the all zero vector, then an adversary could obtain a good approximation of E from R.
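The first attack can be sketched in a few lines. The 3-fold repetition code and the bit vectors are assumptions chosen for illustration. Because the biased biometric E has few ones, the stored reference R is close to the codeword W, and the public decoder alone recovers both W and E.

```python
REP = 3  # toy repetition code standing in for the binary ECC (assumption)

def ecc_encode(msg):
    return [b for b in msg for _ in range(REP)]

def ecc_decode(z):
    """Majority-vote each block of REP bits and return the nearest codeword."""
    msg = [int(2 * sum(z[i:i + REP]) > REP) for i in range(0, len(z), REP)]
    return ecc_encode(msg)

def xor(a, b):
    return [p ^ q for p, q in zip(a, b)]

def attack(r):
    """Adversary's view: only R is known. Decode R to W, then E = R XOR W."""
    w_guess = ecc_decode(r)
    return xor(r, w_guess)

w = ecc_encode([1, 0, 1, 1])                    # secret random codeword W
e = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]        # biased biometric E with few ones
r = xor(w, e)                                   # stored reference R = W + E

print(attack(r) == e)  # True
```

The adversary needs no biometric measurement at all; knowledge of the code and of R suffices when E is sufficiently sparse.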
Second, in addition to dubious security, prior art methods have the practical disadvantage of increasing the amount of data stored. Because biometric databases often store data for many individual users, the additional storage significantly increases the cost and complexity of the system.
Third, many prior art methods require error correction codes or algorithms with a high computational complexity. For example, the Reed-Solomon and Reed-Muller decoding algorithms of the prior art generally have a computational complexity that is at least quadratic, and often of higher order, in the length of the encoded biometric.
Fourth, there are fundamental problems with the basic architecture for the mobile security systems known in the prior art. Mobile security systems such as the one shown in FIG. 4 can only be effective if the mobile security system itself is not compromised. Returning to the example of a mobile security system on a laptop, the security can only be effective if an adversary cannot physically access the media where P and C are stored. If an adversary can access such media, e.g., by removing the hard disk from the laptop, then the adversary immediately obtains P, which was the encryption key used to generate C, and can therefore decrypt C.
The main difficulty with prior mobile security systems is that the encryption key corresponding to the user's biometric parameters is stored in the device. Thus, if the device is stolen, then the data can be decrypted using the stored parameters.
Fifth, good methods for error correcting coding or syndrome code decoding that account for the noise structure particular to biometrics are lacking, and little thought has gone into modeling that noise structure. As a result, most prior art on secure biometric systems uses a memoryless noise model, or other models that oversimplify the nature of the noise and do not reflect actual operational conditions. That is, the prior art models do not accurately represent the time varying dynamics of biometric features and of the acquisition and measurement processes. Instead, those models assume that the noise is memoryless and has no spatial or temporal structure.
Often, biometric features vary from one measurement to another. For example, in fingerprint biometrics, “minutiae” points are often used as the feature set. The relative positions and orientations of minutiae can be quite different during enrollment and authentication. This makes the authentication process difficult. Most straightforward attempts to solve this problem use models that are extremely high-dimensional and therefore impractical to implement.
Therefore, it is desired to provide a model for biometric data including structured noise. In addition, it is desired to pre-process the biometric parameters so that the pre-processed parameters have a form that is best suited for encoding and decoding using channel codes.