The Internet provides opportunities for cooperative computing. With cooperative computing, users and providers can exchange goods, services, and information. The Internet can also provide access to a classifier that can be used to classify data or signals.
Data classification in general is well known in the art. Of particular interest are binary classifiers. Such classifiers simply give a ‘yes’ or ‘no’ answer to indicate whether particular data belongs to a particular class or not.
Specifically, binary classification is the task of classifying objects into two groups on the basis of whether they have some predetermined property or not. Typical binary classification tasks include face recognition in images, medical testing of clinical data, and quality control of products. Generally, computer implemented classifiers automatically ‘learn’ a classification system. Well known methods suitable for learning binary classifiers include decision trees, Bayesian networks, support vector machines (SVM), and neural networks.
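As a minimal sketch of a binary classifier, the following assumes a linear decision rule: the classifier holds a weight vector and a bias, and answers 'yes' when the weighted sum of the input features is positive. The function name `classify` and the example weights are hypothetical and serve only as an illustration; they are not taken from any of the learning methods named above.

```python
def classify(weights, bias, features):
    """Return True ('yes') if the features fall in the class, else False ('no').

    Assumed linear rule: the sign of (weights . features + bias).
    """
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return score > 0
```

Note that the core of such a rule is a dot product between the classifier's weights and the data to be classified.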
Private information classification (PIC) enables two parties, for example, Alice and Bob, to engage in a protocol that allows Alice to classify data using Bob's classifier without revealing anything to Bob (not even the classification result) and without learning anything about Bob's classifier, other than an answer to a classification request. PIC brings together the fields of machine learning and cooperative, multi-party computing, which is a sub-field of cryptography.
Secure multi-party computation is described by Yao, “How to generate and exchange secrets,” 27th FOCS, pp. 162-167, 1986. That work gave a solution to the general two-party problem. As a concrete example, consider the well known ‘millionaire problem’. Two parties want to determine which of them holds the larger number without revealing anything else about the numbers themselves.
Goldreich et al. extended the solution to n>2 parties, some of whom might be cheating; see O. Goldreich, S. Micali and A. Wigderson, “How to play any mental game, or a completeness theorem for protocols with honest majority,” 19th ACM Symposium on the Theory of Computing, pp. 218-229, 1987.
However, the original theoretical construct was too demanding to be of practical use. An introduction to cryptography is given by B. Schneier, in Applied Cryptography, 1996, and a more advanced and theoretical treatment is given by O. Goldreich, in Foundations of Cryptography, 2004.
Since then, many secure protocols have been described for various applications. Relevant to the present invention are secure dot-products and oblivious polynomial evaluation, learning decision trees, and private information retrieval (PIR). See Y. C. Chang and C. J. Lu, “Oblivious polynomial evaluation and oblivious neural learning,” AsiaCrypt: Advances in Cryptology, LNCS, Springer-Verlag, 2001; B. Chor, O. Goldreich, E. Kushilevitz and M. Sudan, “Private Information Retrieval,” FOCS, 1995; Y. Lindell and B. Pinkas, “Privacy preserving data mining,” Advances in Cryptology - Crypto 2000, LNCS 1880, 2000; and M. Naor and B. Pinkas, “Oblivious Polynomial Evaluation,” Proc. of the 31st Symp. on Theory of Computing (STOC), pp. 245-254, May 1-4, 1999.
In a secure dot product, Alice and Bob jointly determine the dot-product of their private data vectors without revealing anything other than the result to each other. In some variants of the dot-product protocol, Alice obtains the sum of the dot-product and a random number that is known only to Bob, while Bob learns nothing. This variant serves as a building block for more complex protocols.
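The masked variant can be illustrated with an additively homomorphic encryption scheme. The sketch below assumes a toy Paillier cryptosystem and semi-honest parties; it is one common way to build such a protocol and is not necessarily the construction used in the cited references. The primes are far too small for real security, the vectors are assumed to hold small non-negative integers, and all function names are hypothetical.

```python
import secrets
from math import gcd

# Toy Paillier parameters (two Mersenne primes); illustrative only, NOT secure.
P, Q = 2**31 - 1, 2**61 - 1

def keygen(p=P, q=Q):
    """Alice: return public modulus n and private key (phi, mu)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    return n, (phi, pow(phi, -1, n))

def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n, with generator n + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    """Alice: recover the plaintext from ciphertext c."""
    phi, mu = priv
    u = pow(c, phi, n * n)
    return (u - 1) // n * mu % n

def bob_respond(n, enc_x, y, r):
    """Bob: compute Enc(x . y + r) from Alice's encrypted vector.

    Multiplying ciphertexts adds plaintexts; raising a ciphertext to the
    power y_i multiplies its plaintext by y_i. Bob never sees x.
    """
    n2 = n * n
    acc = encrypt(n, r)  # start from an encryption of Bob's random mask
    for c, yi in zip(enc_x, y):
        acc = acc * pow(c, yi, n2) % n2
    return acc
```

In use, Alice calls `keygen`, sends `[encrypt(n, xi) for xi in x]` to Bob, and decrypts the single ciphertext returned by `bob_respond`. She obtains the dot-product plus r, which reveals nothing about the dot-product itself as long as r stays with Bob.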
In oblivious polynomial evaluation (OPE), Bob has a polynomial P(x) and Alice has a particular value x. Alice learns the value of the polynomial at x without letting Bob know x, and Bob enables the evaluation without revealing the polynomial itself to Alice.
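One simple way to realize OPE for semi-honest parties, again assuming a toy Paillier cryptosystem, is for Alice to send encryptions of the powers of x and for Bob to combine them homomorphically with his coefficients. This is a sketch under those assumptions and is not the protocol of Naor and Pinkas; a cheating Alice could send values that are not powers of x and learn more than P(x). All names are hypothetical, the primes are insecurely small, and coefficients and x are assumed to be small non-negative integers.

```python
import secrets
from math import gcd

# Toy Paillier parameters (two Mersenne primes); illustrative only, NOT secure.
P, Q = 2**31 - 1, 2**61 - 1

def keygen(p=P, q=Q):
    """Alice: return public modulus n and private key (phi, mu)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    return n, (phi, pow(phi, -1, n))

def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n, with generator n + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    """Alice: recover the plaintext from ciphertext c."""
    phi, mu = priv
    u = pow(c, phi, n * n)
    return (u - 1) // n * mu % n

def alice_query(n, x, degree):
    """Alice: encrypt x^1 .. x^degree so that Bob never sees x."""
    return [encrypt(n, pow(x, k, n)) for k in range(1, degree + 1)]

def bob_evaluate(n, enc_powers, coeffs):
    """Bob: compute Enc(P(x)) for P(x) = coeffs[0] + coeffs[1]*x + ...

    The fresh encryption of coeffs[0] also re-randomizes the result.
    """
    n2 = n * n
    acc = encrypt(n, coeffs[0])
    for c, a in zip(enc_powers, coeffs[1:]):
        acc = acc * pow(c, a, n2) % n2
    return acc
```

Alice decrypts the single ciphertext returned by `bob_evaluate` to obtain P(x); the result is correct as long as P(x) is smaller than the modulus n.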
OPE has also been used for learning a decision tree where the training data are held by two parties. The parties want to jointly learn a decision tree without revealing their private data to each other. In the end, each party learns the decision tree that was trained using the combined data, but the private data of one party is not revealed to the other party.
PIC is an extension of private information retrieval (PIR). In PIR, Alice is interested in retrieving a data item from Bob's database without letting Bob know which element Alice selected. For example, Bob has a database of stock quotes and Alice would like to obtain the quote of a particular stock without letting Bob know which stock Alice selected. Bob is willing to let her do so. However, Bob wants to ensure that Alice can access one, and only one, stock quote.
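The retrieval step can be sketched with the classic two-server XOR scheme, under the assumption that the database is replicated on two servers that do not collude; this differs from the single-database setting in the example above. Each server sees only a uniformly random selection mask, so neither learns the index i. The sketch does not enforce the requirement that Alice obtain only one item, and all names and the sample values are hypothetical.

```python
import secrets

def make_queries(n, i):
    """Alice: build two selection masks that differ only at position i."""
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = q1.copy()
    q2[i] ^= 1
    return q1, q2

def answer(db, q):
    """Server: XOR together the database entries selected by the mask."""
    acc = 0
    for item, selected in zip(db, q):
        if selected:
            acc ^= item
    return acc

def reconstruct(a1, a2):
    """Alice: every entry except db[i] appears in both answers and cancels."""
    return a1 ^ a2
```

For a database of integer quotes, Alice sends `q1` to the first server and `q2` to the second, then XORs the two answers to recover the entry at index i. The communication cost of this simple scheme is linear in the database size.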
A number of ways are known for reducing the communication and computation resources required by PIR, A. Beimel, Y. Ishai, E. Kushilevitz, and J.-F. Raymond, “Breaking the O(n^{1/(2k-1)}) Barrier for Information-Theoretic Private Information Retrieval,” FOCS, 2002 and E. Kushilevitz and R. Ostrovsky, “Replication Is Not Needed: Single Database, Computationally-Private Information Retrieval,” FOCS 1997.