Unless otherwise indicated herein, the materials described in this section are not prior art to the claims in this disclosure and are not admitted to the prior art by inclusion in this section.
Many complex networked computing systems collect large amounts of data in monitoring processes to track statistical information that is recorded by, for example, distributed sensor networks, automobiles, home appliances, medical information systems, and the like. Of course, one well-known drawback to such large scale data collection is that this data directly or indirectly reveals sensitive information about humans who, for example, are monitored by the sensor networks, drive automobiles, live in homes, and receive medical care.
Many prior-art systems rely on a trusted aggregator, which is a computing system that receives encrypted data from the individual client computing devices, decrypts the data, and produces an aggregated output that includes general statistics for a large number of client computing devices that do not enable outside observers to identify the particular input data from each client computing device that could then be associated with the activities of particular humans. For example, while individual records from medical monitoring devices may identify that an individual patient visited a hospital, the aggregator collects a large number of records from the monitoring devices to produce an output detailing the total number of patients that visited the hospital, which cannot be used to identify an individual patient.
If operated properly, the aggregator provides “differential privacy”, which is to say that the output of the aggregator does not enable an observer to determine the individual inputs from clients and the corresponding humans who provided individual contributions to the final output. As noted above, most prior-art systems rely on “trusted” aggregators. While the word “trusted” has positive connotations in common usage, in the field of information security the requirement to have a “trusted” aggregator is actually a disadvantage because the aggregator must be trusted to maintain the privacy of data from the individual client devices and corresponding human users. When the aggregator is operated by a third party, such as a corporate or governmental entity, the individuals who operate the client computing devices that transmit information to the aggregator must not only trust the operator of the aggregator, but must further trust that the aggregator cannot be compromised by attackers who seek to collect private information from the decrypted data that the trusted aggregator processes during operation.
To reduce the privacy concerns described above, private stream aggregation (PSA) systems are known to the art. In a PSA system each client transmits encrypted data (a “stream”) to an untrusted aggregator that cannot decrypt the data from each client. The aggregator does not need to be trusted as far as privacy of data from individual clients is concerned. In the PSA system, the client not only transmits the encrypted data, but includes a client-specific secret and random noise in the actual data prior to encryption using a public key that is associated with the aggregator. The untrusted aggregator is not capable of recovering the original plain text data from the encrypted data received from the client. Instead, the aggregator combines multiple encrypted sets of data from different clients together using a homomorphic operation and is only capable of decrypting a combination of all the inputs to produce an aggregate value that is referred to as a “noisy sum” of all the input stream data from the individual clients. The noisy sum is an aggregate piece of information about all of the inputs, such as the total number of hospital visits described above, but the untrusted aggregator never decrypts plaintext data from individual clients and cannot determine the specific contribution of each encrypted client stream to the final output, such as determining that a particular person actually visited the hospital. Thus, the PSA system also provides differential privacy as described above with the added advantage that the aggregator system does not have to be trusted in order to provide differential privacy. An example of a prior-art PSA system is described in more detail in a paper by Elaine Shi et al., Privacy-Preserving Aggregation of Time-Series Data, Network and Distributed System Security Symposium (NDSS), 2011.
The prior-art PSA systems, however, have drawbacks related both to practical performance and to future security. The first drawback is related to performance. Each message that a client transmits to the aggregation server can only efficiently contain a single bit (e.g., logical “0” or “1”) of information. This limitation arises from the use of the discrete logarithm computation as the basis for the homomorphic encryption process, and it limits the practical performance of the prior-art PSA system. Thus, the client computing devices cannot communicate large amounts of data to the aggregation server in an efficient manner while also maintaining the security properties of the PSA system.
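The effect of message size on this limitation can be made concrete with a small calculation. The sketch below assumes that the aggregator recovers the sum by solving a discrete logarithm over every value the sum could take. The function names are hypothetical. The second function gives only a rough order of magnitude for a generic square-root method such as baby-step giant-step.

```python
import math


def sum_search_space(num_clients: int, bits_per_value: int) -> int:
    """Number of distinct sums possible when each client sends a b-bit value."""
    max_value = 2 ** bits_per_value - 1
    return num_clients * max_value + 1


def sqrt_method_steps(num_clients: int, bits_per_value: int) -> int:
    """Rough step count for a generic square-root discrete-log method."""
    return math.isqrt(sum_search_space(num_clients, bits_per_value)) + 1


# 1,000 clients sending one bit each: the sum lies in 0..1000.
small = sum_search_space(1000, 1)

# 1,000 clients sending 32-bit values: the range grows by a factor of about 2^32.
large = sum_search_space(1000, 32)
```

Under these assumptions the search space grows exponentially with the number of bits per message, which is consistent with single-bit messages being the efficient case.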
The second drawback to prior-art PSA systems, and to many asymmetric public/private key cryptographic systems generally, is related to the development of quantum computers. A quantum computer relies on the properties of quantum physics related to the entanglement and superposition of particles to enable the efficient solution of certain classes of mathematical problems that cannot be solved “quickly” (in polynomial time) using even the most powerful “classical” computers (i.e., existing commercially available computers). More particularly, one common type of quantum computer implements quantum gates that perform operations on data stored in multiple quantum bits (“qubits”). Unlike traditional memory registers in a classical computer, when operating properly the qubits simultaneously store 2^N possible states, where N is the number of qubits that are entangled and operate together in the quantum computer via the superposition property of quantum physics. For example, a 16-bit memory register in a standard classical computer stores a single state formed from 16 individual binary values out of a possible 2^16 states. A quantum computer, however, can perform operations on 16 entangled qubits that, at least theoretically, store all 2^16 states simultaneously and enable the quantum gates to perform simultaneous calculations on all 2^16 states. Other forms of quantum computers, including those that rely upon quantum annealing and adiabatic quantum computation, are also known to the art, although the underlying physical operating principles of these quantum computers may be less effective in attacking existing asymmetric cryptographic systems.
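The state-count comparison above (2 raised to the power N for N qubits) can be checked with a short classical calculation. The sketch below uses hypothetical function names. It only counts basis states and builds the list of equal amplitudes for a uniform superposition, and it does not simulate a quantum computer.

```python
import math


def num_basis_states(num_qubits: int) -> int:
    """Number of basis states spanned by N qubits, i.e. 2^N."""
    return 2 ** num_qubits


def uniform_superposition(num_qubits: int) -> list:
    """Amplitudes of an equal superposition over all 2^N basis states."""
    n = num_basis_states(num_qubits)
    amplitude = 1 / math.sqrt(n)
    return [amplitude] * n


# A 16-bit classical register holds exactly one of these states at a time,
# while 16 qubits are described by one amplitude per state.
states_16 = num_basis_states(16)
amplitudes = uniform_superposition(16)
total_probability = sum(a * a for a in amplitudes)  # approximately 1.0
```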
Rudimentary quantum computers are known to the art and these quantum computers, in some instances operating in conjunction with classical computers as used in Shor's algorithm or other algorithms, can provide solutions to simple discrete logarithm problems, prime factorization problems, or other mathematical problems that form the basis for existing asymmetric public/private key cryptography. Existing quantum computers, or at least existing publicly known quantum computers, can only be manufactured with far too few quantum gates and entangled qubits, typically fewer than 100 qubits, to solve the mathematical problems that could enable an attacker to identify a private key that corresponds to a given public key in existing cryptographic systems. For example, those of skill in the art estimate that a quantum computer with approximately 4,000 qubits and 100 million quantum gates could break 2048-bit RSA or equivalent ElGamal keys in a practical amount of time using Shor's algorithm, which employs a combination of the quantum computer with existing classical computers to break the keys. In the 2048-bit key example, a practical quantum computer requires more than the 2048 qubits that correspond to the 2048-bit key because additional qubits are required for error correction, and the cited numbers of qubits and quantum gates are only an estimate. While existing quantum computers are not a direct threat to present cryptographic systems, there is a reasonable likelihood that future quantum computers will be manufactured with sufficient complexity to enable practical attacks on existing asymmetric cryptographic systems, including those that rely on the prime factorization and discrete logarithm mathematical problems. Those of skill in the art are aware of the potential threat posed by quantum computers that could be used to recover the private keys in existing asymmetric cryptographic systems that are otherwise resistant to even the most powerful classical computers.
As described above, existing PSA systems have drawbacks related both to performance and to potential security issues arising from future advances in quantum computers. Consequently, improvements to PSA systems that provide differential privacy to clients while reducing or eliminating these problems with existing systems would be beneficial.