1. Field of the Invention
This invention is related to the field of computer systems and, more particularly, to the recovery of catalog metadata associated with archived data.
2. Description of the Related Art
Enterprise computing systems commonly use configurations such as the storage area network (SAN), network attached storage (NAS), and other centralized storage mechanisms to simplify storage, improve availability, and handle escalating demands for data and applications. The SAN model places storage on its own dedicated network. This dedicated network most commonly uses Fibre Channel technology as a versatile, high-speed transport. The SAN may include one or more storage hosts that provide a point of interface with local area network (LAN) users and may also include one or more fabric switches, SAN hubs, and/or other intermediate entities to accommodate a large number of storage devices. The hardware (e.g., switches, hubs, bridges, routers, cables, etc.) that connects servers to storage devices in a SAN is referred to as a “disk fabric” or “fabric.” The SAN fabric may enable server-to-storage device connectivity through Fibre Channel switching technology to a wide range of servers and storage devices.
The SAN and other centralized storage mechanisms may be used to implement backup solutions in enterprise environments. Tape devices have traditionally been used as a high-capacity backup medium. Some backup environments may use available disk-based storage (e.g., in a SAN) for backup, either as a final backup destination or as an intermediate location for staging the data to tape. A software-based backup solution such as NetBackup™ from Symantec Corporation may permit clients to archive data to storage devices in a networked backup environment. In a backup solution such as NetBackup™, metadata associated with the archived data is typically stored in a catalog.
Data archived using a backup solution such as NetBackup™ can be replicated to a disaster recovery site for an additional level of security. The disaster recovery site is often at a remote location relative to the primary site. To import the replicated archived data into another instance of the backup solution at the disaster recovery site, the catalog for the archived data must typically be rebuilt by reading the entire replicated archive to locate and process the metadata (e.g., .tar headers). However, this process may be undesirably slow.
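For illustration only, the full-scan rebuild described above can be sketched as follows. This sketch assumes the replicated archive is a plain, uncompressed tar stream and uses the Python standard library's tarfile module. The names build_catalog and make_sample_archive and the record fields are hypothetical and do not reflect the catalog format of NetBackup™ or any other product. Every header in the archive must be visited, which is why the cost grows with the size of the archive rather than with the amount of metadata wanted.

```python
import io
import tarfile


def build_catalog(archive):
    """Rebuild a catalog by visiting every tar header in the archive.

    `archive` is a seekable binary file object holding an uncompressed
    tar stream. The record layout is a hypothetical stand-in for a real
    catalog entry.
    """
    catalog = []
    with tarfile.open(fileobj=archive, mode="r:") as tar:
        # Iterating the TarFile reads each header in turn; the whole
        # archive is traversed even though only the metadata is kept.
        for member in tar:
            catalog.append(
                {"path": member.name, "size": member.size, "mtime": member.mtime}
            )
    return catalog


def make_sample_archive(files):
    """Build a small in-memory tar archive from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    sample = make_sample_archive({"a.txt": b"hello", "b/c.bin": b"\x00" * 1000})
    for record in build_catalog(sample):
        print(record["path"], record["size"])
```

In this sketch the archive is built in memory so the example is self-contained; at a disaster recovery site the same loop would run over the replicated archive itself.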
Alternatively, the entire catalog may be replicated from the primary site along with the archived data. Because the catalog typically stores metadata for a superset of the archived data, replication of the entire catalog may result in the replication of unnecessary amounts of metadata. Furthermore, if the disaster recovery site is used to maintain archived data from multiple primary sites, management of the multiple sets of archived data with a single instance of the backup solution may preclude full catalog replication from a single one of the primary sites.