The application of biometrics has become a popular solution for authentication or identification, often because of its convenience. However, storage of biometric features introduces both security and privacy risks for the user, since stored features are easier for an adversary to misuse. Security risks involved with storing biometric features include the reproduction of fake biometrics from the features, e.g. rubber fingers. Fake biometrics can be used to leave fake evidence at crime scenes or to obtain access to private information or services. Moreover, there are privacy concerns involved with storing biometric features. Some biometrics are known to reveal diseases and disorders of the user, and unprotected storage allows for cross-matching between databases.
These problems cannot be solved with a simple encryption/decryption scheme, since the verifier, which would have to hold the decryption key, cannot be trusted. In many cases design specifications do not allow trusting a verifier. For example, trusting a malicious verifier might result in the identities of users visiting that verifier being stolen and sold. Once biometric data has been compromised it is public forever and cannot be used in a security application anymore. Biometric data is inherently part of a user, e.g. one cannot change a user's fingerprints. It is therefore desirable that the original features cannot be derived from whatever information is stored about the biometric.
Hashes are attractive because it is computationally infeasible to recover the input of a cryptographic hash function from its output value. Nevertheless, hash functions cannot be applied directly: the diffusion of the hash function makes it difficult to apply to (noisy) biometric data. The biometric features will be slightly different during each observation, so the outputs of two hash operations on two observations of the same biometric will be unrelated. Applying a hash function to the measurements therefore makes it impossible to do the verification based on similarity in the hashed domain.
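The effect of diffusion can be illustrated with a minimal sketch. It assumes, purely for illustration, that the biometric features have already been quantized into a short bit string and that SHA-256 is the hash function; neither choice is prescribed by this document, and the names `hash_features` and `bit_distance` are hypothetical.

```python
import hashlib


def hash_features(features: bytes) -> str:
    """Hash a quantized feature vector (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(features).hexdigest()


def bit_distance(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Hypothetical enrolled feature vector (32 bits).
enrolled = bytes([0b10110010, 0b01101100, 0b11100001, 0b00011011])

# A second observation of the same biometric: a single bit differs.
noisy = bytearray(enrolled)
noisy[2] ^= 0b00000100
observed = bytes(noisy)

h_enrolled = hash_features(enrolled)
h_observed = hash_features(observed)

print("input bits that differ: ", bit_distance(enrolled, observed))
print("hash of enrolment:      ", h_enrolled)
print("hash of observation:    ", h_observed)
print("hashes equal:           ", h_enrolled == h_observed)
```

Although the two inputs differ in only one bit, the two digests do not match and carry no usable indication of how close the inputs were, so a similarity test cannot be performed on the stored hash alone.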
One approach to correcting the differences between observations uses so-called helper data, i.e. auxiliary data to handle the errors between subsequent observations. Many constructions have been proposed, all of which can be considered helper data schemes since they store user-dependent data.
There are so-called digital and analog variants of helper data, referring to applications of helper data in error correcting codes and quantization phases respectively; analog helper data may well be digitally represented. This document also refers to digital and analog helper data as error-correcting helper data and quantization helper data, respectively.
1. Analog, such as Biased Quantization. This may be interpreted as a user-dependent bias in the quantization of biometrics in the analog domain, i.e., before quantization. An example is given by the published international application WO/2004/104899, titled “Method And System For Authentication Of A Physical Object”, by the same applicant and included herein by reference, which discloses a system for authenticating a physical object including an enrolment device and an authentication device.
2. Digital, such as Code Shifting. During the verification, the quantized biometrics are “shifted” towards a valid code word. This shifting can be an exclusive-or (XOR) operation with a vector of low Hamming weight that shifts the extracted biometric to the nearest code word. This operation occurs fully in the digital domain. After this shift an error correcting code (ECC) can be applied. This can correct a certain, but limited, number of errors. Typically codes can be effective if the bit error rate lies below, say, 10%. An example is given in “A fuzzy commitment scheme,” by A. Juels and M. Wattenberg.
The above two methods may be combined in a two-stage helper data system.
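The digital, code-shifting variant can be sketched as follows. This is a toy illustration in the spirit of the fuzzy commitment construction cited above, not the implementation of any particular system. It assumes a simple repetition code as the ECC, SHA-256 as the hash, and a 20-bit biometric; the names `REP`, `enrol` and `verify` are hypothetical, and realistic systems would use far longer bit strings and stronger codes.

```python
import hashlib
import secrets

REP = 5  # assumed repetition factor: corrects up to 2 bit errors per block


def encode(key_bits):
    """Repetition-code encoder: repeat every key bit REP times."""
    return [b for b in key_bits for _ in range(REP)]


def decode(code_bits):
    """Majority-vote decoder for the repetition code."""
    return [int(sum(code_bits[i:i + REP]) > REP // 2)
            for i in range(0, len(code_bits), REP)]


def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


def digest(bits):
    return hashlib.sha256(bytes(bits)).hexdigest()


def enrol(biometric_bits):
    """Return (helper data, hash of a random key); the biometric is not stored."""
    key = [secrets.randbelow(2) for _ in range(len(biometric_bits) // REP)]
    helper = xor(biometric_bits, encode(key))  # offset to a valid code word
    return helper, digest(key)


def verify(observed_bits, helper, stored_digest):
    """Shift the observation with the helper data, decode, compare hashes."""
    shifted = xor(observed_bits, helper)  # code word XOR observation noise
    return digest(decode(shifted)) == stored_digest


# Hypothetical quantized biometric (length must be a multiple of REP).
enrolled = [1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
helper, stored = enrol(enrolled)

noisy = list(enrolled)
noisy[3] ^= 1   # one bit error in the first block
noisy[12] ^= 1  # one bit error in the third block

impostor = [1 - b for b in enrolled]  # every bit differs

print("genuine, noisy observation accepted:", verify(noisy, helper, stored))
print("impostor accepted:", verify(impostor, helper, stored))
```

In this sketch only the helper data and the hash of the key are stored. The XOR with the helper data moves a noisy observation close to the enrolled code word, the decoder removes the remaining bit errors, and the hash comparison is then exact, which is what a hash applied directly to the noisy measurement could not provide.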