1. Field of the Invention
This invention relates to computer systems and, more particularly, to the backup of data used by distributed applications running on a network of computer systems.
2. Description of the Related Art
It has become common for organizations to employ distributed applications installed on a network of computer hosts to manage a wide variety of information that may be critical to their operations. For example, Microsoft Exchange Server provides an organization with a messaging (e-mail) and collaboration environment. Another example, Microsoft SharePoint® Portal Server, provides a unified view (on a website) of information from various applications. A third example, Microsoft's Distributed File System (DFS), is a distributed application that provides a single name space for combining the views of files on multiple, networked computers into a single, hierarchical view. Additional examples of distributed applications, available from various vendors, are well known to those having ordinary skill in the art.
In order for a distributed application to provide its desired functionality, a set of distributed data sources is generally associated with the distributed application. For example, a distributed application may have access to one or more database repositories, file systems, or other storage media, either local or remote. Generally, a variety of design decisions determine the number and location of data sources associated with a given distributed application. Factors informing such decisions may include the quantity of data stored, the required frequency of access to the data, the network latency between hosts on which the data is stored, and the functionality required of the application software installed on each host.
In addition to the above, distributed applications may utilize a plurality of servers and a plurality of data sources. In such a case, a server typically implements some portion of the functionality of the distributed application. A server may also manage data that is required to provide the functionality it implements and/or functionality implemented by other servers. For example, one type of server may provide services that require access to data stored in data sources and managed by other servers installed on other hosts. In addition, another type of server may manage a data source for other servers residing on other hosts. In general, a distributed application may comprise a plurality of both types of servers. It is also possible for a server to function as both types of servers at the same time.
There may also be associated information, which may be referred to as “metadata”, that is stored in a data source on a different host and is required for a server to make use of the data stored on its host. For example, the data in a data source may be encrypted, and the encryption keys that are needed to decrypt that specific data may be stored in a data source located on a different host. Another example of metadata is a table of user capabilities that may determine what operations each user is permitted to perform on the data of a given data source. The existence of metadata that is associated with data sources results in dependencies between data sources that must be dealt with during backup and restore operations of a distributed application. During a system-wide backup operation, data from multiple data sources, including metadata, may be copied and stored on backup media. It is common for a distributed application to have such large amounts of data to be backed up that multiple backup tapes or other media may be required in order to hold all of the data from multiple data sources. A time-consuming system-wide restoration may be required to restore all of the metadata necessary to make even a selected portion of the backup data useable.
In order to avoid the loss of important data associated with an organization's distributed applications, a data protection application may be employed to manage backup and restore operations for the data and its associated metadata. It is often desirable to restore data to a selected portion of the data sources associated with a distributed application, for example, in the event of the failure of a single host. However, in order to restore selected data to a useable state, it may be necessary to restore metadata associated with the data as well. Unfortunately, the metadata of interest may not be stored in the same place as the selected portion of the data, making a selective restoration operation complex and inefficient.
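By way of illustration only, the dependency problem described above may be sketched in Python as follows. The sketch assumes that the metadata dependencies between data sources are already known and recorded in a simple mapping. The host names, data source names, and the function sources_to_restore are hypothetical and are not drawn from any particular distributed application or data protection application.

```python
from collections import deque


def sources_to_restore(selected, metadata_deps):
    """Return every data source that must be restored for `selected` to be useable.

    `metadata_deps` maps a data source name to the names of the data sources
    that hold metadata it depends on (hypothetical structure, for illustration).
    """
    needed = set()
    queue = deque(selected)
    while queue:
        source = queue.popleft()
        if source in needed:
            continue  # already visited; also guards against circular dependencies
        needed.add(source)
        # Follow dependencies transitively: metadata may itself need metadata.
        queue.extend(metadata_deps.get(source, ()))
    return needed


# Hypothetical layout: a mailbox database on host A depends on encryption keys
# held on host B and on a user capability table held on host C.
deps = {
    "hostA/mailbox_db": ["hostB/encryption_keys", "hostC/user_capabilities"],
    "hostB/encryption_keys": [],
    "hostC/user_capabilities": ["hostB/encryption_keys"],
    "hostD/file_share": [],
}

restore_set = sources_to_restore(["hostA/mailbox_db"], deps)
print(sorted(restore_set))
```

In this hypothetical layout, restoring the single data source on host A to a useable state also involves data sources on hosts B and C, whereas the unrelated data source on host D is not involved. If the dependency information is not recorded at backup time, a mapping of this kind is unavailable, which is one reason a system-wide restoration may be needed.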
In view of the above, an effective system and method for backup and restore of distributed application data is desired.