Many users store various types of documents in a remote repository (commonly known as “cloud storage”), administered by an external entity. As the term is generally used herein, a document can correspond to any unit of information, such as a text-bearing document, a music file, a picture, a financial record, and so on. A user may opt to store documents in the remote repository for various reasons, e.g., based on factors pertaining to convenience, accessibility, storage capacity, reliability, etc.
Contractual obligations may require the entity which administers the remote repository to minimize the risk of unauthorized access to a user's documents. However, from a technical perspective, there may be little which prevents the entity itself from accessing and examining a user's personal documents. This may understandably unsettle a user. For instance, the user's documents may contain sensitive information that the user does not wish to divulge to any person, including the entity which administers the remote repository.
A user may address this concern by encrypting the documents and storing the documents in encrypted form at the remote repository. This approach effectively prevents the entity which administers the remote repository (or anyone else) from examining the documents. However, this approach also prevents the user from performing any meaningful operations on the documents that are stored in the remote repository. For instance, the encryption of the documents precludes the user from performing an on-line search of the documents. The user may address this situation by downloading all the documents and decrypting them. But this solution runs counter to the user's initial motivation for storing the documents in the remote repository.
To address this situation, the cryptographic community has developed a technique that is commonly referred to as Searchable Symmetric Encryption (SSE). One such SSE technique is described in Curtmola, et al., “Searchable Symmetric Encryption: Improved Definitions and Efficient Constructions,” Proceedings of the 13th ACM Conference on Computer and Communications Security, 2006, pp. 79-88. Another SSE technique is described in Sedghi, et al., “Adaptively Secure Computationally Efficient Searchable Symmetric Encryption,” Internal Report, Centre for Telematics and Information Technology, University of Twente, 2009. Curtmola's approach, for example, operates by storing an encrypted index together with the encrypted documents at a remote repository. The user then generates and submits a search token which is deterministically derived from a search term, but which conceals the search term. The remote repository then uses the encrypted index to identify and return a list of document identifiers that are associated with the search term. In this approach, the remote repository does not learn the identity of the search term associated with the search token. Nor does the remote repository learn the identity of the documents conveyed in the search results.
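By way of illustration only, the token-based lookup described above can be sketched as follows. This is a simplified stand-in and not the construction of Curtmola or Sedghi: it assumes that the search token is an HMAC-SHA256 of the search term under a key held only by the user, and that the index is a plain mapping from tokens to document identifiers. The names make_token, build_index, and server_search are hypothetical. The sketch does not encrypt the document identifiers or hide access patterns, which a real SSE scheme would need to address.

```python
import hashlib
import hmac
import os
from typing import Dict, List


def make_token(key: bytes, term: str) -> bytes:
    """Deterministically derive a token from a term; the term itself is concealed."""
    return hmac.new(key, term.lower().encode("utf-8"), hashlib.sha256).digest()


def build_index(key: bytes, docs: Dict[str, str]) -> Dict[bytes, List[str]]:
    """User side: map each term's token to the identifiers of documents containing it."""
    index: Dict[bytes, List[str]] = {}
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            index.setdefault(make_token(key, term), []).append(doc_id)
    return index


def server_search(index: Dict[bytes, List[str]], token: bytes) -> List[str]:
    """Repository side: look up a token without ever seeing the search term or the key."""
    return sorted(index.get(token, []))


# Hypothetical example data (assumption, not from the source).
key = os.urandom(32)  # secret key kept by the user
docs = {
    "d1": "quarterly tax record",
    "d2": "holiday photo album",
    "d3": "tax refund letter",
}
index = build_index(key, docs)  # uploaded alongside the encrypted documents
print(server_search(index, make_token(key, "tax")))  # ['d1', 'd3']
```

Because the token is derived deterministically, the same search term always yields the same token, which is what allows the repository to match it against the stored index without learning the term.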
However, there is room for improvement in existing SSE techniques. For example, existing SSE techniques do not provide suitably efficient mechanisms for updating a corpus of documents (and associated index information) stored in the remote repository. The user may make changes to a local copy of the index and then send a complete updated index to the remote repository. However, this solution is burdensome and bandwidth-intensive, and again runs counter to the initial motivation for managing documents at a remote location.
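The bandwidth cost of resending a complete index can be illustrated with a rough sketch. The following assumes a hypothetical index of 1,000 entries and uses plain JSON serialization as a stand-in for encrypting and packaging the index; the names serialize_index, full_upload, and change_only are illustrative assumptions and do not correspond to any particular SSE scheme.

```python
import json


def serialize_index(index: dict) -> bytes:
    """Stand-in for encrypting and packaging the index for upload (assumption)."""
    return json.dumps(index, sort_keys=True).encode("utf-8")


# Hypothetical index: 1,000 tokens, each pointing at one document identifier.
index = {f"token{i}": [f"doc{i}"] for i in range(1000)}

# The user adds a single new document associated with a single new token.
new_entry = {"token-new": ["doc-new"]}
index.update(new_entry)

full_upload = len(serialize_index(index))      # bytes sent when resending the whole index
change_only = len(serialize_index(new_entry))  # bytes describing just the change

print(f"full index: {full_upload} bytes, change alone: {change_only} bytes")
```

In this sketch the amount of data transmitted grows with the size of the entire index rather than with the size of the change, which is the inefficiency noted above.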