A critical privacy protection that users want is to prevent information they consider sensitive from being inadvertently leaked as they query or access Internet services. In other words, users regard preserving the privacy of their access to online services as an important concern that must be addressed. A cryptographically sound approach to protecting access privacy is to use the technique of private information retrieval (PIR). PIR schemes, as are known in the art, allow a user to access data from service providers without the service providers being able to learn any information about which particular data item was accessed or retrieved.
One such PIR scheme requires a database to be replicated to two or more servers that are assumed not to collude. A query from a user is separated into different parts, and each part is sent to a different server. Each server computes a result based on the portion of the query it received and returns that result to the client, where the results are combined to provide a complete response to the full query. However, concerns remain about the practicality of having an organization replicate its database to the servers of multiple different cloud services that are assumed not to collude. Replicating a database to multiple independent cloud servers increases the chances of the data being broken into, used without consent, or used for illegitimate purposes. In short, it is inconceivable that an organization would ever want to give out a copy of its database, especially as the database may represent its intellectual property, trade secret, or asset.
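The query-splitting idea can be illustrated with a minimal sketch. The text above does not say which replicated-database scheme is meant, so the example below assumes the classic XOR-based two-server construction, with records modeled as integers; the function names (make_queries, answer, retrieve) are hypothetical and chosen only for illustration.

```python
import secrets

def make_queries(n, index):
    """Client side: split the request for `index` into two shares.

    Each share on its own is a uniformly random bit vector, so a single
    non-colluding server learns nothing about `index`.
    """
    share_a = [secrets.randbelow(2) for _ in range(n)]
    share_b = list(share_a)
    share_b[index] ^= 1  # the two shares differ only at the wanted position
    return share_a, share_b

def answer(database, share):
    """Server side: XOR together the records selected by the share."""
    result = 0
    for record, bit in zip(database, share):
        if bit:
            result ^= record
    return result

def retrieve(database_copy_a, database_copy_b, index):
    """Client side: query both replicas and combine the two answers."""
    q_a, q_b = make_queries(len(database_copy_a), index)
    # Records selected by both shares cancel out, leaving only record `index`.
    return answer(database_copy_a, q_a) ^ answer(database_copy_b, q_b)
```

Calling retrieve(db, db, i) on two identical copies of a list of integers returns db[i]. The sketch also makes the drawback discussed above concrete: each server must hold a full copy of the database.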
To address the database replication problem, a random server model of PIR was introduced by Gertner et al. (Yael Gertner, Shafi Goldwasser, and Tal Malkin. A Random Server Model for Private Information Retrieval or how to Achieve Information Theoretic PIR Avoiding Database Replication. In RANDOM '98, pages 200-217, 1998). This model attempts to separate the task of providing query privacy from that of information retrieval by using auxiliary random servers running databases containing random data. The database server uses the service of two or more random servers to generate an encrypted and permuted version of its database and to help keep user queries private. Of particular interest in this solution are universal random servers, which are a type of auxiliary server holding random data that is completely independent of the content of the database. Gertner et al. proposed a scheme that achieves total independence, i.e., all random servers are of the universal type and contain no information derived from the content of the database, thereby addressing the database replication problem. In other words, the scheme provides user privacy to the extent guaranteed by the underlying PIR scheme, together with database privacy (no single server or coalition of servers can learn any information about the content of the database).
Gertner et al.'s secure multi-party computation (SMC) protocol enables the server holding a database x and two auxiliary random servers, each holding a random database α and a pseudorandom permutator π, to compute an initial oblivious database y=π(x⊕α). However, their protocol must be rerun to recompute y after a large (e.g., sublinear) number of queries have been run or whenever the database x is updated. Naively updating the oblivious database y with updates from x leaks information about π: an update to some record x_i requires an update to some oblivious database block y_j, and the server maintaining the database is then able to learn that j=π⁻¹(i). Hence, the prescribed way to update the oblivious database y is to find a periodic downtime in which to rerun the SMC protocol. As a result, users would have to suspend making query requests during the rerun because the random servers would be preoccupied. Such a wait is undesirable in environments where database changes are frequent and query downtimes are unacceptable. The second problem with this scheme is that it expects the same random database to be used to mask multiple databases belonging to different organizations, which can lead to significant attacks in practice (i.e., the attacker learns the shared random database α by running several queries across databases and uses this knowledge to learn the blocks of a target y much faster).
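The update leak can be made concrete with a small sketch. It is an illustration under simplifying assumptions, not the SMC protocol itself: y is computed here in the clear by a single party, whereas the protocol is designed so that no single party sees x, α, and π together. Records are modeled as 16-bit integers, and the indexing convention y_j = (x⊕α) at position π(j) is assumed because it matches the relation j=π⁻¹(i) stated above. All function names are hypothetical.

```python
import secrets

def random_permutation(n):
    """Fisher-Yates shuffle driven by a cryptographic random source."""
    perm = list(range(n))
    for k in range(n - 1, 0, -1):
        r = secrets.randbelow(k + 1)
        perm[k], perm[r] = perm[r], perm[k]
    return perm

def build_oblivious(x, alpha, pi):
    """Compute y = pi(x XOR alpha), with y[j] = x[pi[j]] XOR alpha[pi[j]]."""
    return [x[pi[j]] ^ alpha[pi[j]] for j in range(len(x))]

def changed_positions(y_old, y_new):
    """Positions an observer of y sees change between two versions."""
    return [j for j, (a, b) in enumerate(zip(y_old, y_new)) if a != b]

def demo_leak(n=8, i=3):
    """Update record i naively and report which block of y moved."""
    x = [secrets.randbits(16) for _ in range(n)]
    alpha = [secrets.randbits(16) for _ in range(n)]
    pi = random_permutation(n)
    y_old = build_oblivious(x, alpha, pi)

    x_new = list(x)
    x_new[i] ^= 1  # flip one bit so record i is guaranteed to change
    y_new = build_oblivious(x_new, alpha, pi)

    observed = changed_positions(y_old, y_new)
    true_position = pi.index(i)  # j such that pi[j] == i, i.e. the inverse of pi at i
    return observed, true_position
```

In this sketch, demo_leak() returns a single observed position that equals the true position of record i in y. Anyone who knows which record was updated and can see which block of y changed therefore learns one entry of the inverse permutation per update, which is why a naive update is unsafe and a rerun of the protocol is prescribed instead.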