Companies are increasingly turning to cloud storage for their data. Storing data in the cloud helps companies lower expenses by, for example, reducing the need to maintain physical servers and other hardware resources. Cloud storage vendors, in turn, are continuously seeking new ways to reduce their own costs, and eliminating redundant data is one such way. Eliminating redundant data can significantly shrink storage requirements and improve bandwidth efficiency. Storage costs fall because fewer disks are needed, and less electricity is required to power and cool the disks or tape drives.
Deduplication is a process for removing redundant data. In particular, if two objects are duplicates of each other, then only one of the objects needs to be stored, which reduces the amount of data to be stored. Eliminating redundant data in a cloud environment, however, is difficult because the data is often encrypted by the customer of the cloud storage vendor for security purposes. As a result, a vendor's cloud storage system may hold many data objects that are redundant across its customers but that appear to be different because of the encryption. It is desirable to reduce the amount of redundant data that is stored in order to reduce the computing costs for the vendor. Such cost savings may be passed on to the customers of the cloud storage vendor.
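The effect described above can be illustrated with a minimal sketch. It assumes a content-addressed store that fingerprints each object with SHA-256, which is one common way to detect duplicates and is not necessarily the mechanism used by any particular vendor. The names `DedupStore` and `toy_encrypt` and the tenant keys are hypothetical, and `toy_encrypt` is an insecure stand-in for real customer-side encryption, included only to show that the same object encrypted under two different keys no longer deduplicates.

```python
import hashlib


class DedupStore:
    """Hypothetical content-addressed store: identical byte strings are kept once."""

    def __init__(self):
        self.blobs = {}          # fingerprint -> stored bytes
        self.logical_bytes = 0   # total bytes clients asked to store

    def put(self, data: bytes) -> str:
        # Fingerprint the object; a duplicate maps to an existing entry.
        fingerprint = hashlib.sha256(data).hexdigest()
        self.logical_bytes += len(data)
        self.blobs.setdefault(fingerprint, data)
        return fingerprint

    def physical_bytes(self) -> int:
        # Bytes actually held after deduplication.
        return sum(len(blob) for blob in self.blobs.values())


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher for illustration only (NOT secure).

    XORs the data with a keystream derived from SHA-256(key || counter).
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# The same 1024-byte object, held by two different tenants.
backup = b"quarterly-report" * 64

# Unencrypted: the second copy is recognized as a duplicate.
plain_store = DedupStore()
plain_store.put(backup)
plain_store.put(backup)

# Encrypted under per-tenant keys: the ciphertexts differ, so both are stored.
encrypted_store = DedupStore()
encrypted_store.put(toy_encrypt(b"tenant-a-key", backup))
encrypted_store.put(toy_encrypt(b"tenant-b-key", backup))

print(plain_store.physical_bytes(), encrypted_store.physical_bytes())
# prints "1024 2048"
```

In this sketch both stores receive 2048 logical bytes, but only the unencrypted store halves its physical footprint. The encrypted store cannot tell that the two ciphertexts carry the same underlying object.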
Storage systems such as EMC Data Domain rely on deduplication to achieve significant data compression. In a multi-tenant environment, backups from different tenants are stored in the same storage system. Tenants do not trust one another and do not want other tenants to read their data. What is needed is an encryption system that allows storage systems to process deduplicated data without compromising the security of tenant data, including security between the tenants who share the stored deduplicated data.