The present invention relates generally to the field of information technology, and, more particularly, to systems and techniques for deduplication.
Companies are increasingly turning to cloud storage for their data storage needs. Storing data in the cloud helps companies to lower expenses by, for example, reducing the need to maintain physical servers and other hardware resources. Similarly, cloud storage vendors are continuously seeking new ways to reduce their costs. Eliminating redundant data is one way a cloud storage vendor can reduce costs. Eliminating redundant data can significantly shrink storage requirements and improve bandwidth efficiency. Storage costs are lowered because fewer disks are needed, and less electricity is required to power and cool the disks or tape drives.
Deduplication is a process for removing redundant data. In particular, if two objects are duplicates of each other, then only one of the objects needs to be stored. Thus, the amount of data to be stored can be reduced. Eliminating redundant data in a cloud environment, however, is difficult because the data is often encrypted by the customer of the cloud storage vendor for security purposes. Thus, a vendor's cloud storage system may include many data objects across its customers that are redundant but that appear to be different because of the encryption. It can be desirable to reduce the amount of redundant data that is stored in order to reduce the computing costs for the vendor. Such cost savings may be passed to the customers of the cloud storage vendor.
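By way of illustration only, the difficulty described above can be sketched in Python. The sketch assumes a simple fingerprint-based store, in which objects with the same SHA-256 digest are stored once. The names `DedupStore` and `toy_encrypt` and the two customer keys are hypothetical and do not appear in the original text. The `toy_encrypt` function is an insecure stand-in for whatever cipher a customer might actually use. It serves only to show that the same object encrypted under two different keys yields two different byte sequences.

```python
import hashlib


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Hypothetical, insecure stand-in for customer-side encryption.

    XORs the plaintext with a keystream derived from the key, so the
    output depends on both the key and the data.
    """
    keystream = hashlib.shake_256(key).digest(len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, keystream))


class DedupStore:
    """Hypothetical store that keeps one copy per content fingerprint."""

    def __init__(self) -> None:
        self.objects = {}  # fingerprint (hex digest) -> stored bytes

    def put(self, data: bytes) -> str:
        fingerprint = hashlib.sha256(data).hexdigest()
        # A duplicate has the same fingerprint and is not stored again.
        self.objects.setdefault(fingerprint, data)
        return fingerprint


document = b"quarterly report, identical for both customers"

# Unencrypted duplicates collapse to a single stored object.
plain_store = DedupStore()
plain_store.put(document)
plain_store.put(document)

# The same object encrypted under two customer keys looks different,
# so the store keeps both copies.
encrypted_store = DedupStore()
encrypted_store.put(toy_encrypt(b"customer-a-key", document))
encrypted_store.put(toy_encrypt(b"customer-b-key", document))

print(len(plain_store.objects))      # 1
print(len(encrypted_store.objects))  # 2
```

In this sketch the store holding unencrypted objects keeps one copy, while the store holding encrypted objects keeps two, even though the underlying data is identical.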
Thus, there is a need to provide systems and techniques for facilitating the deduplication of encrypted data.