Various parties (e.g., corporations, governmental agencies and natural persons) face a common dilemma: how can parties share specific information (e.g., health care data, customer prospect lists, an adversary watch list, a black list or a list of actual or potential problematic entities) that can assist the parties via business optimization, improved analysis, or detection of potential adversaries or other problematic parties, while maintaining the security and confidentiality of such information?
Hesitation to contribute or otherwise disclose such information, as well as laws governing the use and disclosure of certain information, are predicated upon a concern that the information may be subjected to unintended disclosure or used in a manner that may violate privacy policies or otherwise cause damage to the party. Such damage may include identity theft, unauthorized direct marketing activities, unauthorized or intrusive governmental activities, anti-competitive practices, defamation, credit damage, or economic damage.
Conventional systems use various means to transfer data in a relatively confidential manner within or between parties. Although this technology has proven to be useful, it would be desirable to present additional improvements. For example, some conventional systems use a reversible encryption method, which modifies the data to engender some level of confidentiality. The encrypted data is transmitted to a recipient, who uses a comparable decryption method to return the encrypted data to its original format. However, once the data is decrypted, such data is subject to potential loss or use in an unapproved or illegal manner that may cause the very damage that the encryption process was intended to prevent.
Other conventional systems use irreversible cryptographic algorithms, or one-way functions, such as MD5 (also referred to as message digest 5), SHA-1 or SHA-256, to obfuscate sensitive or confidential data. Existing irreversible cryptographic algorithms render data undecipherable and irreversible to protect the confidentiality and security of the data. The irreversible one-way function, when applied to data, results in the same value for the same data regardless of the data source. Therefore, irreversible cryptographic algorithms are often used as a document signature, to make unauthorized document alteration detectable when the document is being shared across parties. For example, suppose a phone number in an original document is altered (for example, by changing the formatting), and irreversibly encrypted. If the original, unaltered data is also irreversibly encrypted, the two encrypted values are different, indicating that one of the electronic documents has been altered.
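The behavior described above can be illustrated with a minimal sketch, assuming SHA-256 from the Python standard library as the one-way function. The helper name digest and the phone numbers are hypothetical and chosen only for illustration.

```python
import hashlib


def digest(value: str) -> str:
    """Return the SHA-256 digest of a string as hexadecimal text."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical values: the same phone number held by two sources,
# and a copy whose formatting has been altered.
original = "555-0100"
same_from_other_source = "555-0100"
reformatted = "(555) 0100"

# Identical input yields an identical digest regardless of the source.
matches = digest(original) == digest(same_from_other_source)

# Any alteration, even a formatting change, yields a different digest.
altered_detected = digest(original) != digest(reformatted)

print(matches, altered_detected)
```

In this sketch the comparison of digests reveals that the reformatted value differs from the original without either value being disclosed in clear text.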
However, schemes with irreversible cryptographic algorithms have an inherent vulnerability to phonebook attacks. Such phonebook attacks are anything but theoretical and allow for disclosure of the private data with limited effort. For example, suppose a party Pi and a party Pj share their customer databases containing Personally Identifiable Information (PII) through a conventional hashing scheme, in which each customer record consists of a unique identifier ID and a corresponding set of hashes of the PII. If party Pi is not playing fair, it might compute a set of hashes on a phonebook or another large data set and match it with the hashed data set obtained from Pj. Because the same input always produces the same hash, every match discloses the underlying clear-text value. This attack, which is referred to as a phonebook attack, allows party Pi to reveal nearly all PII contained in party Pj's dataset and defeats the purpose of the hashing.
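A phonebook attack of this kind can be sketched as follows. This is a minimal illustration only, assuming an unsalted SHA-256 hash over phone numbers; the record identifiers, the numbers, and the names h, pj_shared, phonebook and recovered are hypothetical.

```python
import hashlib


def h(value: str) -> str:
    """Unsalted one-way hash, as in a conventional hashing scheme."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical data set shared by party Pj: record ID -> hash of a phone number.
pj_shared = {
    "id-001": h("555-0101"),
    "id-002": h("555-0142"),
    "id-003": h("555-0999"),
}

# Party Pi enumerates a "phonebook" of candidate values (here 555-0100 to 555-0199)
# and precomputes the hash of each candidate.
phonebook = [f"555-{n:04d}" for n in range(100, 200)]
lookup = {h(number): number for number in phonebook}

# Matching the precomputed hashes against Pj's hashes reveals the clear text
# of every record whose value appears in the phonebook.
recovered = {rid: lookup[d] for rid, d in pj_shared.items() if d in lookup}

print(recovered)
```

Only the record whose value lies outside the enumerated range stays hidden in this sketch. For small input spaces such as phone numbers, the attacking party can enumerate every candidate, so in practice few records would remain protected.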
Accordingly, it would be desirable to be able to provide improved solutions for comparing data sets in a privacy-enhanced and privacy-preserving manner.