1. Field of the Invention
This invention is related to the field of computer systems and, more particularly, to network-based backup of storage in computer systems.
2. Description of the Related Art
Computer systems and their components are subject to various failures which may result in the loss of data. For example, a storage device used in or by the computer system may experience a failure (e.g., mechanical, electrical, magnetic, etc.) which may make any data stored on that storage device unreadable. Erroneous software or hardware operation may corrupt the data stored on a storage device and effectively destroy the data stored on an otherwise properly functioning storage device. Any component in the storage chain between (and including) the storage device and the computer system may experience failure (e.g., the storage device, connectors [e.g., cables] between the storage device and other circuitry, the network between the storage device and the accessing computer system [in some cases], etc.).
To mitigate the risk of losing data, computer system users typically make backup copies of data stored on various storage devices. Typically, backup software is installed on a computer system, and the backup may be scheduled to occur periodically and automatically. Backups may also be initiated manually by a user or administrator of the computer system. Accordingly, a primary goal of enterprise storage management is the backup and restoration of information in an intelligent, secure, timely, and cost-effective manner across all operating systems used in the enterprise.
The Network Data Management Protocol (NDMP) is an open protocol for enterprise-wide, network-based backup. NDMP can be used for communications between centralized backup applications and agents on file servers. NDMP meets the strategic need to centrally manage and control distributed data while minimizing network traffic. NDMP, as an embedded protocol, separates the data path from the control path, so that network data can be backed up locally yet managed from a central location. NDMP allows administrators to back up critical data using any combination of compliant network-attached servers, backup devices, and management applications. The NDMP architecture allows network-attached storage vendors to ship NDMP-compliant file servers which can be used by any NDMP-compliant backup administration application. This same architecture may also be used for network-attached backup devices, such as tape drives and tape libraries.
An enterprise-wide backup may be a complex procedure including numerous elements. For example, the data to be backed up must be defined. Complex interactions with the backup media device and extensive cataloguing and control must be managed. The backup should also assure data protection and efficient restoration of mission-critical data in the event of data loss. These elements may require data flows across various hosts, clients, and backup devices in the enterprise. NDMP defines common functional interfaces used for these data flows. For example, file system data flows from the file system to the backup device may use a common interface, regardless of the platform or device. Control or file meta-data may be passed to and from the backup software using common interfaces, regardless of the software package.
One of these data flows may include a stream of file history information. The file history information may include two message types: (1) DIR messages, which include hierarchy information for a specific node (i.e., a directory or file), such as the name of the node, node identification (e.g., node number), and parent node identification (e.g., parent node number); and (2) NODE messages, which include other attribute information for a specific node (e.g., permissions, creation and modification dates, and other meta-data). These messages may represent both directory and leaf nodes of a file system, and they may arrive in arbitrary order, so that no assumptions can be made about the relationship between DIR and NODE messages based on the order of receipt. Consequently, when a DIR message arrives, it may not be known whether the message represents a directory or a file, and it is therefore difficult to establish the correct relationship among all directories and files in an efficient manner. Typically, the message-processing software must perform multiple passes over the messages in order to establish the correct relationship.
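By way of illustration only, the multiple-pass processing described above may be sketched as follows. The sketch is not the NDMP wire format: the class names, the field names, the is_directory attribute, and the assumption that the root of the file system is node number 1 are all hypothetical and are chosen only to make the example self-contained. A first pass buffers every DIR and NODE message, because neither the parent of a node nor its type can be relied upon to have arrived yet; a second pass resolves each node to a full path and a type.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Union


@dataclass(frozen=True)
class DirMessage:
    """Hypothetical DIR message: hierarchy information for one node."""
    name: str
    node: int
    parent: int


@dataclass(frozen=True)
class NodeMessage:
    """Hypothetical NODE message: attribute information for one node."""
    node: int
    is_directory: bool  # assumed to be part of the attribute information
    mode: int


ROOT_NODE = 1  # assumption: the file system root has node number 1


def build_paths(messages: Iterable[Union[DirMessage, NodeMessage]],
                root: int = ROOT_NODE) -> Dict[str, str]:
    """Map each full path to "dir", "file", or "unknown"."""
    # Pass 1: buffer all messages, indexed by node number. Nothing can be
    # resolved here because a child may arrive before its parent.
    dirs: Dict[int, DirMessage] = {}
    attrs: Dict[int, NodeMessage] = {}
    for msg in messages:
        if isinstance(msg, DirMessage):
            dirs[msg.node] = msg
        else:
            attrs[msg.node] = msg

    # Pass 2: walk parent links to build each path, caching results.
    paths: Dict[int, str] = {}

    def resolve(node: int) -> str:
        if node == root:
            return ""
        if node not in paths:
            entry = dirs[node]
            paths[node] = resolve(entry.parent) + "/" + entry.name
        return paths[node]

    result: Dict[str, str] = {}
    for node in dirs:
        attr = attrs.get(node)
        if attr is None:
            kind = "unknown"  # no NODE message was received for this node
        else:
            kind = "dir" if attr.is_directory else "file"
        result[resolve(node)] = kind
    return result


# Messages deliberately listed out of order: the leaf arrives first.
messages = [
    DirMessage("report.txt", 4, 3),
    NodeMessage(3, True, 0o755),
    DirMessage("docs", 3, 2),
    NodeMessage(4, False, 0o644),
    DirMessage("home", 2, 1),
    NodeMessage(2, True, 0o755),
]
tree = build_paths(messages)
```

In this sketch the entire message stream is held in memory before any relationship is established, which illustrates the cost of the conventional multiple-pass approach when the file history is large.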
Therefore, an improved system and method for performing network-based backup are desired.