Field of the Invention
This invention relates to the field of computer data storage. More particularly, the invention relates to a system and method using de-duplication and encryption techniques to store files for a plurality of users in a file storage pool on a server computer system.
Description of the Related Art
Computer systems generally store information as files organized by a file system. Each file may be stored on a storage device, such as a disk drive, optical drive, or tape drive. It is often necessary to back up files by copying them to another storage device. For example, backup operations may be performed to guard against hardware failure or data loss, to provide a snapshot of files at a particular point in time, or to replicate files for other purposes.
In a networked computing environment, a plurality of client computer systems may each back up files to a backup server computer system. An identical file may be stored on multiple client computer systems, i.e., two or more client computer systems may each store a copy of a file in which the data is identical. For example, client computer systems that execute the same operating system or the same software applications often have many identical files.
De-duplication techniques can be utilized so that only a single copy of each file is stored on the backup server computer system. For example, for each client computer system that has a copy of a particular file, the backup server computer system may store respective file metadata representing that copy. The portions of file metadata associated with each respective copy of the file may all reference a single instance of the file data (the actual contents of the file). In this way, the backup system can avoid the need to store multiple copies of identical files on the backup server computer system. A storage system which uses de-duplication to store and reference a single instance of data in order to avoid storing multiple copies of identical data is generally referred to as a single instance storage system.
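The arrangement described above, in which per-client file metadata references a single instance of the file data, can be illustrated with a minimal sketch. The class name `SingleInstanceStore`, its methods, and the use of a SHA-256 fingerprint to detect identical data are assumptions made only for illustration; the description above does not state how identical files are detected.

```python
import hashlib


class SingleInstanceStore:
    """Toy in-memory model of a single instance storage pool (hypothetical)."""

    def __init__(self):
        # fingerprint -> file contents; each distinct content is stored once
        self.data_by_fingerprint = {}
        # (client_id, path) -> fingerprint; one metadata entry per client copy
        self.metadata = {}

    def backup(self, client_id, path, contents):
        # Assumption: identical data is detected by comparing SHA-256 digests.
        fingerprint = hashlib.sha256(contents).hexdigest()
        # Keep the contents only if no identical data is already in the pool.
        self.data_by_fingerprint.setdefault(fingerprint, contents)
        self.metadata[(client_id, path)] = fingerprint
        return fingerprint

    def restore(self, client_id, path):
        # Follow the metadata reference to the single stored instance.
        return self.data_by_fingerprint[self.metadata[(client_id, path)]]


store = SingleInstanceStore()
shared = b"identical operating system file"
store.backup("client-a", "/bin/tool", shared)
store.backup("client-b", "/bin/tool", shared)
store.backup("client-b", "/home/notes.txt", b"unique to client b")
```

In this sketch, three metadata entries exist after the three backups, but only two instances of file data are stored, because the two copies of `/bin/tool` share one instance.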
It is sometimes desirable to store the files on the backup server computer system in encrypted form, e.g., to prevent unauthorized use of the files. An encryption algorithm typically uses a key (e.g., information such as a series of bits) to transform the file data into an encoded form. Thus, for example, each client computer may have its own key, which it uses to encrypt its files before transmitting them to the backup server computer system. The files received from each client computer are then unreadable by any user or application that does not possess that client computer's particular key.
However, encrypting a file transforms the file data into an encoded form that depends upon the encryption key that is used. Because the encryption keys of the client computers differ from each other, copies of an identical file encrypted by different client computers yield different encrypted data. This is a problem for single instance storage systems: even though the original file data is identical, the encrypted data produced by the different client computers is not identical, so the backup server computer system cannot recognize the copies as duplicates of one another.
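The conflict between per-client encryption and single instance storage can be shown with a short sketch. The function `toy_encrypt` below is a hypothetical stand-in for a real encryption algorithm: it XORs the data with a keystream derived from the key using HMAC-SHA256. It is for illustration only and is not a secure cipher. The key values and the use of SHA-256 fingerprints to compare data are likewise assumptions.

```python
import hashlib
import hmac


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR data with an HMAC-SHA256 keystream. Illustration only, not secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        # Derive one 32-byte keystream block per 32-byte chunk of data.
        block_index = (offset // 32).to_bytes(8, "big")
        keystream = hmac.new(key, block_index, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, keystream))
    return bytes(out)


# The same file contents, present on two different client computers.
file_data = b"identical operating system file"

# Each client encrypts with its own (hypothetical) key before backup.
cipher_a = toy_encrypt(b"key-of-client-a", file_data)
cipher_b = toy_encrypt(b"key-of-client-b", file_data)

# Fingerprints a backup server might compute over what it receives.
fingerprint_a = hashlib.sha256(cipher_a).hexdigest()
fingerprint_b = hashlib.sha256(cipher_b).hexdigest()

print("encrypted copies identical:", cipher_a == cipher_b)
print("fingerprints identical:", fingerprint_a == fingerprint_b)
```

Under these assumptions, the two encrypted copies and their fingerprints differ even though the underlying file data is the same, so a server that compares only the received data would store both copies.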