1. Field of the Invention
This invention relates to computer systems and, more particularly, to backup management in distributed computer systems.
2. Description of the Related Art
Today's enterprise environments typically comprise a wide variety of computing devices with varying processing and storage resources, ranging from powerful clusters of multiprocessor servers to desktop systems, laptops, and relatively low-power personal digital assistants, intelligent mobile phones and the like. Most or all of these devices are often linked, at least from time to time, to one or more networks such as the Internet, corporate intranets, departmental or campus local area networks (LANs), home-based LANs, etc. Furthermore, most or all of these devices often store data, at least temporarily, that if lost or corrupted may lead to considerable rework and/or to lost business opportunities. While perhaps not as important from a business perspective, the loss or corruption of personal data such as photographs, financial documents, etc., from home computers and other devices outside corporate boundaries may also have unpleasant consequences. Backing up the data locally, e.g., to devices located at the same building or site as the source data, is typically not sufficient, especially in the case of catastrophic events such as hurricanes, tornados, floods, fires and the like. Furthermore, while local backups may be relatively fast, in aggregate they often result in multiple copies of the same files being backed up. For example, even though many of the operating system files in one backup client system may be identical to operating system files in another backup client system, local backups initiated from each of the clients typically store independent backup versions of the data from each client separately, including duplicate backed-up copies of the identical files.
In order to enable recovery from localized catastrophic events, various techniques for backup to remote sites have been developed over the years. Many traditional disaster recovery techniques are centrally controlled and expensive, however, and are therefore typically limited to protecting the most important, mission-critical subsets of business data. In recent years, in order to take advantage of the widening availability of Internet access and the mass availability of cheap storage, peer-to-peer (P2P) backup management techniques have been proposed. In such P2P backup management environments, for example, each participating device may be allowed to back up data objects such as files into a P2P network or “cloud” (a large distributed network, such as hundreds or thousands of hosts connected to the Internet). In the event of a failure at the source device (the device from which the data objects were uploaded), the backed-up data may be retrieved from the P2P cloud. P2P backup management software may be installed at the participating devices to enable discovery of target devices to store backup data, to schedule and perform the P2P backups, to search for previously backed-up data within the P2P cloud, and to retrieve backup data from other devices of the P2P cloud as needed. Often, few restrictions are placed on devices for membership in P2P networks: e.g., even a home personal computer that is only powered on for a few hours a day may be allowed to participate in a P2P network.
Unfortunately, the amount of source data to be backed up can be quite large. For example, if conventional P2P techniques are used, several gigabytes of data may have to be backed up from a single laptop computer in order to be able to support full recovery from a disk crash or other failures at the laptop. Furthermore, the total amount of data uploaded into the P2P network for a backup of a given source data set is often substantially greater than the size of the source data itself. This data expansion may be required because few guarantees can usually be provided regarding the availability of any given device in the P2P network. If, in a naïve implementation of P2P backup management, an important file were backed up to only one or two target devices of the P2P network from a source device, it is quite possible that none of the target devices that store the file would be online or available when the file has to be recovered. Source data to be backed up is therefore typically encoded for error correction (e.g., using an erasure code) and/or replicated at the source device prior to uploading to several targets in the P2P cloud, so that the probability of being able to recover the source data is increased. (In general, an erasure code transforms a data object containing n blocks into a data object with m blocks, where m is larger than n, such that the original data object can be recovered from a subset of those m blocks.) The expansion of the source data set to increase availability of the backed-up version further adds to the upload bandwidth requirements from the source devices. Since many of the devices whose data is to be backed up into the P2P network often have intermittent connectivity to the P2P network, and may be provided relatively low upload bandwidth when they do have access to the P2P network, it may be difficult for such devices to successfully perform complete backups into the P2P network.
Furthermore, some existing P2P backup techniques may require participating devices to reserve substantial amounts of storage (often several times larger than the expected amount of data to be backed up from the device) for incoming P2P backup data, which may also place an undue storage burden on the devices.
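The trade-off described above, between the expansion of n source blocks into m uploaded blocks and the probability of later recovering the data, can be illustrated with a short calculation. The following is a minimal sketch only, not part of any described technique. It assumes that each of the m blocks is stored on a different target device, that each target is online independently with the same probability, and that any n of the m blocks suffice for recovery (a property of some, but not all, erasure codes). The function names and the sample values are hypothetical.

```python
from math import comb


def recovery_probability(total_blocks: int, needed_blocks: int, availability: float) -> float:
    """Probability that at least `needed_blocks` of `total_blocks` blocks are reachable.

    Assumption (not from the source text): each block sits on a separate target
    device, and each target is online independently with probability `availability`.
    """
    return sum(
        comb(total_blocks, k)
        * availability ** k
        * (1 - availability) ** (total_blocks - k)
        for k in range(needed_blocks, total_blocks + 1)
    )


def expansion_factor(total_blocks: int, needed_blocks: int) -> float:
    """Ratio of uploaded data to source data (m / n), assuming equal-sized blocks."""
    return total_blocks / needed_blocks


# Hypothetical scenario: each target device is online 30% of the time.
p_online = 0.3

# Naive case: the whole file is copied to two targets; one copy is enough.
naive = recovery_probability(total_blocks=2, needed_blocks=1, availability=p_online)

# Erasure-coded case: n = 4 source blocks expanded to m = 16 uploaded blocks.
coded = recovery_probability(total_blocks=16, needed_blocks=4, availability=p_online)

print(f"two replicas: upload x{expansion_factor(2, 1):.0f}, P(recover) = {naive:.3f}")
print(f"4-of-16 code: upload x{expansion_factor(16, 4):.0f}, P(recover) = {coded:.3f}")
```

Under these assumptions, raising the probability of recovery requires uploading several times the size of the source data, which is the upload bandwidth burden noted above for devices with intermittent connectivity.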