1. Technical Field
This application relates to computer storage devices, and more particularly to the field of transferring data between storage devices.
2. Description of Related Art
Computer systems may include different resources used by one or more host processors. Resources and host processors in a computer system may be interconnected by one or more communication connections. These resources may include, for example, data storage devices such as the Symmetrix™ family of data storage systems manufactured by EMC Corporation. These data storage systems may be coupled to one or more host processors and provide storage services to each host processor. An example data storage system may include one or more data storage devices, such as those of the Symmetrix™ family, that are connected together and may be used to provide common data storage for one or more host processors in a computer system.
A host processor may perform a variety of data processing tasks and operations using the data storage system. For example, a host processor may perform basic system I/O operations in connection with data requests, such as data read and write operations.
Host processor systems may store and retrieve data using a storage device containing a plurality of host interface units, disk drives, and disk interface units. Such storage devices are provided, for example, by EMC Corporation of Hopkinton, Mass. and disclosed in U.S. Pat. No. 5,206,939 to Yanai et al., U.S. Pat. No. 5,778,394 to Galtzur et al., U.S. Pat. No. 5,845,147 to Vishlitzky et al., and U.S. Pat. No. 5,857,208 to Ofek. The host systems access the storage device through a plurality of channels provided therewith. Host systems provide data and access control information through the channels to the storage device, and the storage device provides data to the host systems also through the channels. The host systems do not address the disk drives of the storage device directly, but rather, access what appears to the host systems as a plurality of logical disk units. The logical disk units may or may not correspond to the actual disk drives. Allowing multiple host systems to access the single storage device allows the host systems to share data stored therein.
A host may issue a request to make a point in time copy or “snapshot” of a data set, such as a logical disk unit or file. One existing technique includes making a complete physical copy of the data included in the data set. In order to make a complete copy, no further data modification to the data set, such as in connection with a write operation, can be performed until the data in the data set has been copied to the snapshot copy. The foregoing may not be desirable in instances where the data set being copied must also remain available on-line for I/O operations before the complete snapshot copy has been made.
Another way of making a snapshot copy of a data set uses a “copy on first write” technique. In this technique, storage is allocated for use as a snap data area for storing the existing or old data. When a write request is received to modify a storage location in the data set, the existing data at the storage location to be modified is first read and copied into the snap data area. The existing data set is then updated in accordance with the write operation. One problem that may result with this technique is the fragmentation of the snap data area since storage is allocated and used in accordance with each write operation. It may be difficult to use an efficient coalescing technique where multiple snap data area entries associated with consecutively located data portions are combined into a single entry since this may require a large number of I/O operations. Additionally, data structures used in managing the allocation of the snap data area may be complex as a result of large numbers of I/O operations causing large numbers of snap data area entries.
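The “copy on first write” technique described above can be illustrated with a minimal sketch. It is an illustration only and does not come from any particular storage system: the class name CopyOnFirstWriteSnapshot, its methods, and the representation of the data set as an in-memory list of blocks are all hypothetical assumptions made for this example.

```python
class CopyOnFirstWriteSnapshot:
    """Toy point-in-time copy of a block-addressed data set (hypothetical)."""

    def __init__(self, source):
        self.source = source    # live data set: a list of blocks
        self.snap_area = []     # snap data area; one entry per first write
        self.snap_index = {}    # block number -> position in snap_area

    def write(self, block_no, data):
        # On the first write to a block, read the existing data and copy it
        # into the snap data area before the data set is updated.
        if block_no not in self.snap_index:
            self.snap_index[block_no] = len(self.snap_area)
            self.snap_area.append(self.source[block_no])
        # Update the live data set in accordance with the write operation.
        self.source[block_no] = data

    def read_snapshot(self, block_no):
        # Blocks never written since the snapshot are read from the live
        # data set; modified blocks are read from the snap data area.
        pos = self.snap_index.get(block_no)
        if pos is None:
            return self.source[block_no]
        return self.snap_area[pos]


if __name__ == "__main__":
    volume = ["a0", "b0", "c0", "d0"]
    snap = CopyOnFirstWriteSnapshot(volume)
    snap.write(2, "c1")
    snap.write(0, "a1")
    snap.write(2, "c2")  # second write to block 2: no further copy is made
    print([snap.read_snapshot(i) for i in range(len(volume))])
    print(snap.snap_area)
```

In this sketch the snap data area holds its entries in the order in which the writes arrived (block 2 before block 0), not in the order of the blocks in the data set. This is one way to see the fragmentation noted above: data portions that are consecutive in the data set may be scattered across the snap data area, and each saved portion requires its own entry in the management structure.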
Yet another technique may include initially allocating storage for an entire data volume or data set for which a point in time copy is being made in response to a request for a snapshot copy. However, allocating such large amounts of storage can cause inefficient use of space if snapshots are performed frequently.
Thus, it is desirable, in a number of circumstances, to use a technique for creating a point in time copy or snapshot of a data set that overcomes one or more drawbacks of the existing techniques. It is also desirable to use a technique that is space efficient, reduces fragmentation associated with storage areas and management data structures, and also has a low latency associated with an I/O operation.