1. Field of the Invention
This invention relates to computer systems and, more particularly, to protection of data within computing systems.
2. Description of the Related Art
It is common practice for individuals and enterprises to protect data that resides on a variety of computer hosts via some type of backup mechanism. For example, numerous client devices may be coupled to a network to which a backup server is also coupled. The backup server may be further coupled to one or more tape drives or other backup media. A backup agent on each host may convey data files to the backup server for storage on backup media according to a variety of schedules, policies, etc. To facilitate restoring backup files, the backup server may maintain a catalog of the files that have been stored on the backup media. When a client wishes to restore a file, the server may present a view of the catalog or a portion of the catalog from which the client may make a selection. Once the client has indicated which file is to be restored, the backup server may initiate a restoration process.
In order to minimize the size of storage pools required to store backup data, Single Instance Storage (SIS) techniques are sometimes employed at each backup location. In SIS techniques, data is stored in segments, with each segment having a fingerprint that may be used to unambiguously identify it. For example, a data file may be segmented, and a fingerprint calculated for each segment. Duplicate copies of data segments are replaced by a single instance of the segment and a set of references to the segment, one for each copy. In order to retrieve a backup file, a set of fingerprints is sent to a backup server, where it is compared to the fingerprints of data stored in a storage pool. For each matching fingerprint, a data segment is retrieved. The resulting segments are re-assembled to produce the desired file.
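By way of illustration only, the segment-and-fingerprint scheme described above may be sketched as follows. The sketch assumes fixed-size segments and SHA-256 fingerprints, neither of which is specified above. The names SingleInstanceStore, put_file, get_file, and SEGMENT_SIZE are hypothetical and are not drawn from any particular product.

```python
import hashlib

# Assumption: fixed-size segments; kept tiny here so the example is easy to follow.
SEGMENT_SIZE = 4


class SingleInstanceStore:
    """Hypothetical storage pool keeping one instance of each distinct segment."""

    def __init__(self):
        self.segments = {}    # fingerprint -> segment bytes (stored once)
        self.references = {}  # fingerprint -> number of references to the segment

    def put_file(self, data):
        """Segment the data, store unseen segments, and return the fingerprint list."""
        fingerprints = []
        for offset in range(0, len(data), SEGMENT_SIZE):
            segment = data[offset:offset + SEGMENT_SIZE]
            # Assumption: SHA-256 serves as the fingerprint function.
            fingerprint = hashlib.sha256(segment).hexdigest()
            if fingerprint not in self.segments:
                self.segments[fingerprint] = segment
            # A duplicate segment adds a reference rather than a second copy.
            self.references[fingerprint] = self.references.get(fingerprint, 0) + 1
            fingerprints.append(fingerprint)
        return fingerprints

    def get_file(self, fingerprints):
        """Look up each fingerprint and re-assemble the segments in order."""
        return b"".join(self.segments[fp] for fp in fingerprints)


store = SingleInstanceStore()
file_fingerprints = store.put_file(b"abcdabcdwxyz")
restored = store.get_file(file_fingerprints)
```

In this sketch the file contains three segments, two of which are identical, so the pool holds two stored segments and the duplicated segment carries two references. Restoring the file requires only the ordered list of fingerprints.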
Unfortunately, the restoration process may be slow and inefficient. For example, because many clients typically share a small number of backup servers, restoration may be delayed by network latencies. Restoration may be further delayed if the backup server is connected to its clients by a slow or busy WAN link. Also, for tape-based backup, once a file has been identified for restoration, administrator assistance may be required to mount the particular tape that contains the desired file, increasing expense and turnaround time. In addition, files that have not been backed up are not available for restoration.
An alternative approach to data protection is to distribute responsibility for backups to the hosts themselves, which are organized into a peer-to-peer network. Each peer may provide some amount of disk storage space for backup purposes. However, mobile hosts may connect to and disconnect from a network on a frequent basis, making them unavailable to participate in backup operations at various times. In addition, participating hosts are likely to have a variety of capabilities. Some hosts, such as mobile computers, may have limited storage capacity. Some hosts may have slow network connections. Other hosts may have limited ability to participate in backup operations due to requirements placed on them by other applications that they may run.
In view of the above, an effective system and method for distributing and housing backup images that accounts for these issues is desired.