The exemplary embodiment relates to data matching when the data to be matched is imperfect, e.g., contains errors, and finds particular application in connection with matching of encrypted data with a fully homomorphic encryption scheme.
Fuzzy Private Matching (FPM) is a useful method for maintaining privacy when a client wishes to query a server's database to find elements that are approximately equal to elements of his own data set. The exchange protocol is such that, during all the communication and processing steps, the query remains private (i.e., unknown to the server) and the content of the server's database, apart from the matching elements, remains hidden from the client. Such techniques find application in biometrics, where private, personal characteristics, such as fingerprints, DNA, or iris patterns, are commonly used to provide authentication and user access control (UAC). Here, exact values are often difficult to measure and thus fuzzy matching is desired, while maintaining the privacy of private data. Similarly, in matching license plate numbers, an optical character recognition (OCR) component may be used to identify a sequence of characters from an image of the license plate, but may make some recognition errors. Fuzzy matching of the OCR-recognized character sequence with a license plate number stored in a database may be desired, while keeping the private information as secure as possible.
Homomorphic encryption schemes allow a specified mathematical operation to be performed on encrypted data. In contrast to classical encryption schemes, the resulting ciphertext, when decrypted, provides a value that is equal to the result of performing the same operation on the original plaintexts. For an encryption scheme ε, values a and b, and an operator op, the homomorphic encryption property can be expressed as follows:
ε(a) op ε(b) = ε(a op b)
The operator can be a standard mathematical operator, such as multiplication or addition. In some cases, the operator can be different on each side of the equation, e.g., the multiplication of encrypted data can correspond to the addition of the plaintext.
An encryption scheme is considered partially homomorphic if only one arithmetic operation is possible (e.g., only addition or only multiplication). Early protocols all make use of partially homomorphic systems. See, for example, Michael J. Freedman, et al., “Efficient private matching and set intersection,” EUROCRYPT 2004, pp. 1-19 (2004), hereinafter, “Freedman 2004”; Lukasz Chmielewski, et al., “Fuzzy private matching,” ARES 08, pp. 327-334 (2008), hereinafter, “Chmielewski 2008”; and Qingsong Ye, et al., “Efficient fuzzy matching and intersection on private datasets,” ICISC 2009, pp. 211-228 (2010), hereinafter, “Ye 2010”. These references employ a semantically secure, additively homomorphic public-key cryptosystem, such as the Paillier cryptosystem. See, Pascal Paillier, “Public-key cryptosystems based on composite degree residuosity classes,” EUROCRYPT '99, pp. 223-238 (1999). These systems provide addition of ciphertexts and multiplication of a ciphertext by a scalar only, but not multiplication between ciphertexts.
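The additive behavior described above can be illustrated with a minimal sketch of a Paillier-style scheme. This is a toy under stated assumptions, not the implementation used in any of the cited references: the primes (293 and 433), the choice g = n + 1, and all function names are hypothetical and chosen only for readability, and the key size is far too small to be secure.

```python
import math
import random

def keygen(p=293, q=433):
    """Toy key generation. The tiny primes are illustrative assumptions only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                    # common simplifying choice of generator
    mu = pow(lam, -1, n)         # with g = n + 1, L(g^lam mod n^2) = lam mod n
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    while True:                  # pick a random r that is invertible mod n
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add(pub, c1, c2):
    """Multiplying ciphertexts adds the plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def scalar_mul(pub, c, k):
    """Raising a ciphertext to a plaintext scalar k multiplies the plaintext by k."""
    n, _ = pub
    return pow(c, k, n * n)

pub, priv = keygen()
c_a, c_b = encrypt(pub, 17), encrypt(pub, 25)
sum_plain = decrypt(pub, priv, add(pub, c_a, c_b))          # 17 + 25
scaled_plain = decrypt(pub, priv, scalar_mul(pub, c_a, 3))  # 3 * 17
```

Note that the sketch offers no operation that takes two ciphertexts and yields an encryption of the product of their plaintexts, which is the limitation of additive-only schemes noted above.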
An encryption scheme is said to be fully homomorphic (FHE) if it provides a way to compute both addition and multiplication. Other homomorphic operations are possible, e.g., exclusive-or in the case of the Goldwasser-Micali encryption scheme, or vector rotation for the Brakerski-Gentry-Vaikuntanathan (BGV) encryption scheme. See, Zvika Brakerski, et al., “Fully homomorphic encryption without bootstrapping,” Cryptology ePrint Archive, Report 2011/277 (2011). Fully homomorphic encryption allows a server to receive encrypted data and perform arbitrarily complex, dynamically chosen computations on that data while it remains encrypted, despite not having access to the secret decryption key.
The first fully homomorphic encryption scheme to be identified was based on ideal lattices. See, Craig Gentry, “Fully homomorphic encryption using ideal lattices,” Proc. 41st Annual ACM Symposium on Theory of Computing, STOC '09, pp. 169-178 (2009), hereinafter, “Gentry 2009”. The security of this scheme is based on the Closest Vector Problem. Other FHE schemes were later developed, such as BGV. An implementation of BGV is described in Shai Halevi, et al., “Design and implementation of a homomorphic encryption library,” MIT Computer Science and Artificial Intelligence Laboratory manuscript (2013), hereinafter, “Halevi 2013”.
TABLE 1 provides example encryption schemes and the operations permitted.
TABLE 1
Examples of homomorphic schemes

Cryptosystem         Homomorphic operations           Notes
Paillier             ε(a) · ε(b) = ε(a + b mod m)     m being the modulus part of the public key
ElGamal              ε(a) · ε(b) = ε(a · b)           homomorphic multiplication
Goldwasser-Micali    ε(a) · ε(b) = ε(a ⊕ b)           ⊕ being the exclusive-or between a and b
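The ElGamal and Goldwasser-Micali rows of TABLE 1 can be checked with a small sketch. Everything below is an illustrative assumption rather than part of the cited schemes' reference implementations: the toy parameters (p = 2579 and g = 2 for ElGamal; primes 7 and 11 with non-residue 6 for Goldwasser-Micali) and the function names are hypothetical, and the sizes are far too small to be secure.

```python
import math
import random

# --- Toy ElGamal: multiplying ciphertexts multiplies plaintexts ---
def elgamal_keygen(p=2579, g=2):
    x = random.randrange(2, p - 1)           # secret key
    return (p, g, pow(g, x, p)), x           # public key (p, g, h), secret x

def elgamal_encrypt(pub, m):
    p, g, h = pub
    k = random.randrange(2, p - 1)           # fresh randomness per ciphertext
    return pow(g, k, p), (m * pow(h, k, p)) % p

def elgamal_decrypt(pub, x, c):
    p, _, _ = pub
    c1, c2 = c
    s = pow(c1, x, p)                        # shared mask h^k
    return (c2 * pow(s, p - 2, p)) % p       # divide by the mask (Fermat inverse)

def elgamal_mul(pub, ca, cb):
    p, _, _ = pub
    return (ca[0] * cb[0]) % p, (ca[1] * cb[1]) % p

# --- Toy Goldwasser-Micali: multiplying ciphertexts XORs plaintext bits ---
GM_P, GM_Q = 7, 11                           # secret primes (toy values)
GM_N = GM_P * GM_Q                           # public modulus
GM_X = 6                                     # quadratic non-residue mod both primes

def gm_encrypt(bit):
    while True:                              # random y invertible mod N
        y = random.randrange(2, GM_N)
        if math.gcd(y, GM_N) == 1:
            break
    return (y * y * pow(GM_X, bit, GM_N)) % GM_N

def gm_decrypt(c):
    # Euler's criterion: c is a square mod p exactly when the bit is 0.
    return 0 if pow(c, (GM_P - 1) // 2, GM_P) == 1 else 1

def gm_xor(c1, c2):
    return (c1 * c2) % GM_N

pub, sk = elgamal_keygen()
product = elgamal_decrypt(pub, sk, elgamal_mul(pub, elgamal_encrypt(pub, 6),
                                               elgamal_encrypt(pub, 7)))
xor_bit = gm_decrypt(gm_xor(gm_encrypt(1), gm_encrypt(1)))
```

In both cases the operation applied to the ciphertexts is modular multiplication, while the effect on the plaintexts differs from scheme to scheme.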
Some of the homomorphic operations allowed by the BGV cryptosystem over encrypted data include addition, multiplication, right shift, and right rotation.
Freedman 2004 addresses providing FPM in homomorphic protocols and suggests a 2-out-of-3 protocol, based on polynomial encoding, for solving the FPM problem (i.e., the fuzziness threshold t is fixed to 2 and the word size T is fixed to 3). Chmielewski 2008 shows, however, that the 2-out-of-3 protocol proposed by Freedman 2004 is not secure, in that the client is able to discover words in the server's set even if those words are not present in his own. Another problem with the Freedman protocol is how to provide an efficient FPM protocol that does not incur a factor of C(T, t), the binomial coefficient “T choose t”, in the communication complexity.
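To make the t-out-of-T notion concrete, the following sketch states the matching predicate on plaintext words, with no encryption or privacy protection at all. It assumes, as an interpretation of the threshold described above, that two words of equal length T match when they agree in at least t positions. The function names and sample words are hypothetical, and the `position_subsets` helper reflects an assumed reading of the communication factor as the number of ways to choose t of T positions.

```python
from math import comb

def fuzzy_match(x, y, t):
    """Return True if equal-length words x and y agree in at least t positions."""
    if len(x) != len(y):
        raise ValueError("words must have the same length T")
    return sum(a == b for a, b in zip(x, y)) >= t

def fuzzy_intersection(client_words, server_words, t):
    """Server words that fuzzily match at least one client word.

    This is only the target output of an FPM protocol; a real protocol
    computes it without revealing the query or the non-matching words.
    """
    return [y for y in server_words
            if any(fuzzy_match(x, y, t) for x in client_words)]

def position_subsets(T, t):
    """Number of ways to choose t of the T positions (assumed reading)."""
    return comb(T, t)

# 2-out-of-3 example: "abc" and "abd" agree in two of three positions.
two_of_three = fuzzy_match("abc", "abd", 2)

# License-plate style example with a 5-out-of-6 threshold.
plates = fuzzy_intersection(["ABC123"], ["ABC128", "XYZ999", "ABD123"], 5)
```

Here `plates` holds the two stored plates that differ from the query in a single character, which is the kind of result an OCR-tolerant lookup would aim for.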
Chmielewski 2008 proposes two other protocols for solving the FPM problem, FPM-CHM1 (a polynomial-encoding-based protocol) and FPM-CHM2 (based on linear secret sharing), as a correct solution for the general t-out-of-T case. Ye 2010 shows, however, that FPM-CHM2 is insecure, and proposes another solution, referred to as FPM-YE, which is based on polynomial encoding and a share-hiding random error-correcting threshold secret sharing scheme built on interleaved Reed-Solomon codes.
Common to all these FPM protocols is that a partially homomorphic encryption scheme is used to provide the capability of computing on encrypted data. All of them make reference to the Paillier cryptosystem, which is an additive-only scheme (with multiplication allowed only between an encrypted value and a scalar). Additionally, the performance of the protocols that are still considered secure, in terms of communication and computation time, may be prohibitive for some applications.
There remains a need for an encryption scheme which is secure and which provides acceptable performance for practical applications.