Snapshot copies of a data set such as a file or storage volume have been used for a variety of data processing and storage management functions such as storage backup, transaction processing, and software debugging.
A known way of making a snapshot copy is to respond to a snapshot copy request by invoking a task that copies data from a production data set to a snapshot copy data set. A host processor, however, cannot write new data to a storage location in the production data set until the original contents of the storage location have been copied to the snapshot copy data set.
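By way of illustration only, the write restriction of this first approach can be sketched as follows. The class name `FullCopySnapshot` and the methods `copy_step` and `try_write` are hypothetical names chosen for this sketch and do not appear in any cited reference; the sketch assumes a copy task that proceeds block by block in index order.

```python
class FullCopySnapshot:
    """Hypothetical sketch: a copy task fills the snapshot block by block."""

    def __init__(self, production):
        self.production = list(production)
        self.snapshot = [None] * len(self.production)
        self.next_to_copy = 0  # blocks below this index are already copied

    def copy_step(self):
        """Copy one more block; return False when nothing is left to copy."""
        if self.next_to_copy >= len(self.production):
            return False
        i = self.next_to_copy
        self.snapshot[i] = self.production[i]
        self.next_to_copy += 1
        return True

    def try_write(self, index, data):
        """Refuse a host write until the original block has been copied."""
        if index >= self.next_to_copy:
            return False  # the host must wait for the copy task
        self.production[index] = data
        return True
```

In this sketch a write to a block that the copy task has not yet reached is refused, which corresponds to the delay imposed on the host processor.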
Another way of making a snapshot copy of a data set is to allocate storage to modified versions of physical storage units, and to retain the original versions of the physical storage units as a snapshot copy. Whenever the host writes new data to a storage location in a production data set, the original data is read from the storage location containing the most current version, modified, and written to a different storage location. This is known in the art as a “log structured file” approach. See, for example, Douglis et al., “Log Structured File Systems,” COMPCON 89 Proceedings, Feb. 27-Mar. 3, 1989, IEEE Computer Society, p. 124-129, incorporated herein by reference, and Rosenblum et al., “The Design and Implementation of a Log-Structured File System,” ACM Transactions on Computer Systems, Vol. 10, No. 1, February 1992, p. 26-52, incorporated herein by reference.
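For purposes of illustration, the redirection of writes to a different storage location may be sketched as follows. This is a simplified model under stated assumptions, not the method of the cited references: the class `RedirectOnWriteVolume`, its block maps, and its method names are hypothetical, and physical storage is modeled as an append-only list.

```python
class RedirectOnWriteVolume:
    """Hypothetical sketch: modified versions go to newly allocated storage."""

    def __init__(self, blocks):
        self.store = list(blocks)                      # physical storage units
        self.current_map = list(range(len(blocks)))    # logical -> physical
        self.snapshot_map = None

    def create_snapshot(self):
        # The snapshot is only a copy of the block map; no data is copied.
        self.snapshot_map = list(self.current_map)

    def update(self, index, modify):
        # Read the most current version, modify it, write it elsewhere.
        current = self.store[self.current_map[index]]
        self.store.append(modify(current))
        self.current_map[index] = len(self.store) - 1

    def read(self, index):
        return self.store[self.current_map[index]]

    def read_snapshot(self, index):
        return self.store[self.snapshot_map[index]]
```

Because the original versions are never overwritten in this sketch, the snapshot map continues to resolve to the data as it stood when the snapshot was created.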
Yet another way of making a snapshot copy is for a data storage system to respond to a host request to write to a storage location of the production data set by checking whether the storage location has been modified since the time when the snapshot copy was created. Upon finding that the storage location of the production data set has not been modified, the data storage system copies the data from the storage location of the production data set to an allocated storage location of the snapshot copy. After copying data from the storage location of the production data set to the allocated storage location of the snapshot copy, the write operation is performed upon the storage location of the production data set. For example, as described in Kedem U.S. Pat. No. 6,076,148 issued Jun. 13, 2000, assigned to EMC Corporation, and incorporated herein by reference, the data storage system allocates to the snapshot copy a bit map to indicate storage locations in the production data set that have been modified. In this fashion, a host write operation upon a storage location being backed up need not be delayed until original data in the storage location is written to secondary storage.
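The copy-on-first-write sequence described above may be sketched, by way of example only, as follows. The class `CopyOnFirstWriteVolume` and its attribute names are hypothetical and are not taken from the cited patent; the bit map is modeled as a list of booleans and the snapshot storage as a dictionary allocated on demand.

```python
class CopyOnFirstWriteVolume:
    """Hypothetical sketch of a production volume with one snapshot copy."""

    def __init__(self, blocks):
        self.production = list(blocks)
        # Bit map: True once a block has been modified since the snapshot.
        self.modified = [False] * len(self.production)
        # Snapshot storage, allocated only for blocks that get overwritten.
        self.saved = {}

    def write(self, index, data):
        if not self.modified[index]:
            # First write since the snapshot: save the original data first.
            self.saved[index] = self.production[index]
            self.modified[index] = True
        self.production[index] = data

    def read_snapshot(self, index):
        # Unmodified blocks are still read from the production data set.
        if self.modified[index]:
            return self.saved[index]
        return self.production[index]
```

In this sketch only the first write to a given storage location incurs a copy; later writes to the same location find the bit already set and proceed directly.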
Backup and restore services are a conventional way of reducing the impact of data loss from network storage. To be effective, however, the data should be backed up frequently, and the data should be restored rapidly from backup after a storage system failure. As the amount of storage on the network increases, it becomes more difficult to maintain the frequency of the data backups, and to restore the data rapidly after a storage system failure.
In the data storage industry, an open standard network backup protocol has been defined to provide centrally managed, enterprise-wide data protection for the user in a heterogeneous environment. The standard is called the Network Data Management Protocol (NDMP). NDMP facilitates the partitioning of the backup problem between backup software vendors, server vendors, and network-attached storage vendors in such a way as to minimize the amount of host software for backup. The current state of development of NDMP can be found at the Internet site for the NDMP organization. Details of NDMP are set out in the Internet Draft Document by R. Stager and D. Hitz entitled “Network Data Management Protocol” document version 2.1.7 (last update Oct. 12, 1999) incorporated herein by reference.