The invention relates to data storage, and particularly to remote copy, between peers, with selective control for non-identical duplication between primary and secondary storage volumes.
Remote copy is a well-known data back-up process, used for example in duplicating the disk volumes of storage system peers. By way of example, the contents of one IBM RAMAC storage system can be duplicated onto another RAMAC storage system using remote copy, with synchronization managed by internal intelligence.
Prior art remote copy methods and systems require that the secondary volume be an exact duplicate of the primary volume. If the secondary volume has more cylinders than the primary volume, these additional cylinders are not used. Such systems thus waste storage capacity. This is especially true with respect to application data sets that do not utilize the entire storage volume.
Constructing the virtual volume from the entire disk volume address range, as prior art remote copy does, also wastes disk address space. Furthermore, since all writes to a primary volume are reflected in the secondary volume, a performance penalty is incurred because both writes must complete before the host is given a device end. Specifically, one cannot currently mix data sets in need of remote copy with data sets that are not, without incurring the write penalty for all data sets.
It is, accordingly, one object of the invention to provide remote copy storage systems and methods without the above-described problems. One specific object of the invention is to provide remote copy between peers and with data extent granularity as opposed to volume granularity. A further object of the invention is to reduce the number of secondary volumes required to implement remote copy, between peers, as compared to the prior art. Yet another object of the invention is to provide flexibility in relocating tracks to secondary volumes during remote copying. These and other objects will become apparent in the description that follows.
U.S. Pat. Nos. 5,615,329, 5,072,378 and 5,193,184 relate to storage systems, remote data duplex and/or virtual data storage, and provide useful background information for the invention. These patents are thus herein incorporated by reference.
In one aspect, the invention provides a method of remote copying, between peers, with data extent granularity, including the steps of (a) reading a compressed data image from a first location of a primary storage volume; (b) transferring metadata and then the compressed data image to a secondary storage volume, the metadata specifying the first location within the primary storage volume and a second location within the secondary storage volume; and (c) storing the compressed data image at the second location of the secondary storage volume.
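Steps (a) to (c) above can be illustrated with a minimal sketch. It is not the implementation of the invention: the `Volume` and `Metadata` classes, the integer locations, and the use of `zlib` as the compressor are all assumptions made for illustration, and the transfer link is modeled as a simple ordered list.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Metadata:
    # Hypothetical metadata record: where the image came from and where it goes.
    primary_location: int    # first location, within the primary volume
    secondary_location: int  # second location, within the secondary volume


class Volume:
    """Hypothetical in-memory stand-in for a storage volume."""

    def __init__(self):
        self.tracks = {}  # location -> compressed data image


def remote_copy(primary, secondary, meta):
    # (a) Read the compressed data image from the first location.
    image = primary.tracks[meta.primary_location]
    # (b) Transfer the metadata first, then the compressed image.
    #     The link is modeled as an ordered list of two messages.
    channel = [meta, image]
    received_meta, received_image = channel
    # (c) Store the image, still compressed, at the second location.
    secondary.tracks[received_meta.secondary_location] = received_image


primary, secondary = Volume(), Volume()
primary.tracks[5] = zlib.compress(b"payroll records")
remote_copy(primary, secondary, Metadata(primary_location=5, secondary_location=42))
```

In this sketch the image is never decompressed in transit, and the secondary location (42) need not match the primary location (5), which is the non-identical duplication the method describes.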
The method can also include the step of specifying data extents within the primary and secondary volumes through a host connected to the primary storage volume. The data extents in this aspect specify the first and second locations.
In yet another aspect, the method includes the step of specifying data extents with cylinder information, track information, and start and end addresses.
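A data extent of this kind could be represented as a small record. The sketch below is illustrative only: the field names, the integer types, and the choice to treat the end address as inclusive are assumptions, since the text lists only the kinds of information an extent carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataExtent:
    # Hypothetical layout; the text names these items but not their encoding.
    cylinder: int
    track: int
    start: int  # start address within the track
    end: int    # end address within the track (assumed inclusive)

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("end address precedes start address")

    def contains(self, cylinder, track, address):
        # True when the given address lies inside this extent.
        return (
            cylinder == self.cylinder
            and track == self.track
            and self.start <= address <= self.end
        )


extent = DataExtent(cylinder=3, track=7, start=100, end=199)
```

A pair of such records, one for the primary volume and one for the secondary volume, would then identify the first and second locations.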
In another aspect, the step of specifying utilizes an Establish Pair command. The Establish Pair command is known to those skilled in the art as an IBM command standard.
In still another aspect, the method includes the step of transferring a seed value with the metadata to identify the data image during subsequent decompression of the data image in the secondary storage volume.
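One way a seed value might serve this purpose is sketched below. The text does not say how the seed is generated or checked, so everything here is an assumption: the seed is treated as an opaque integer stored alongside the image, and the hypothetical `decompress_checked` function refuses to decompress an image whose stored seed differs from the expected one.

```python
import zlib


def store(volume, location, seed, image):
    # Keep the seed that arrived with the metadata next to the compressed image.
    volume[location] = (seed, image)


def decompress_checked(volume, location, expected_seed):
    # Confirm the image is the expected one before decompressing it.
    seed, image = volume[location]
    if seed != expected_seed:
        raise ValueError("seed mismatch: unexpected data image at this location")
    return zlib.decompress(image)


secondary_volume = {}  # hypothetical stand-in: location -> (seed, image)
store(secondary_volume, 42, 0xBEEF, zlib.compress(b"track contents"))
data = decompress_checked(secondary_volume, 42, 0xBEEF)
```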
The invention also provides a remote copy system with data extent granularity such as for use in peer-to-peer storage systems. A primary control unit and a primary storage volume store compressed data from a host; and a secondary control unit and a secondary storage volume store selected data from the primary storage volume into selected locations within the secondary storage volume. The primary control unit assigns metadata to the data and transfers the metadata and then the data to the secondary control unit. The metadata specifies (a) a first location associated with the selected data in the primary storage volume and (b) the selected locations within the secondary storage volume.
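The division of work between the two control units can be sketched as follows. The class names, the dictionary that maps selected primary locations to secondary locations, and the use of `zlib` are illustrative assumptions rather than details taken from the system itself. The point of the sketch is that only writes to selected locations are forwarded, so other data avoids the remote write.

```python
import zlib


class SecondaryControlUnit:
    """Hypothetical secondary unit: expects metadata, then the data."""

    def __init__(self):
        self.volume = {}      # secondary location -> compressed data
        self._pending = None  # metadata awaiting its data

    def receive_metadata(self, meta):
        self._pending = meta

    def receive_data(self, image):
        if self._pending is None:
            raise RuntimeError("data arrived before metadata")
        _primary_location, secondary_location = self._pending
        self.volume[secondary_location] = image
        self._pending = None


class PrimaryControlUnit:
    """Hypothetical primary unit: stores every write, forwards selected ones."""

    def __init__(self, secondary, selected):
        self.volume = {}
        self.secondary = secondary
        self.selected = dict(selected)  # primary location -> secondary location

    def write(self, location, data):
        image = zlib.compress(data)
        self.volume[location] = image
        target = self.selected.get(location)
        if target is None:
            return False  # not selected for remote copy; local write only
        # Metadata goes first, then the compressed data.
        self.secondary.receive_metadata((location, target))
        self.secondary.receive_data(image)
        return True


secondary_unit = SecondaryControlUnit()
primary_unit = PrimaryControlUnit(secondary_unit, selected={10: 0})
copied = primary_unit.write(10, b"needs remote copy")
skipped = primary_unit.write(11, b"local only")
```

Here location 10 of the primary volume is copied to location 0 of the secondary volume, while the write to location 11 stays local.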
In another aspect, the primary control unit has a cache for mapping the compressed data into distributed memory of the primary storage volume.
In still another aspect, the secondary control unit has a cache for mapping the selected data into distributed memory of the secondary storage volume.
The system of the invention can further include a host, connected to the primary control unit, to command the remote copying between the primary and secondary control units. By way of example, the host can be used to specify data extents within the primary and secondary storage volumes. These data extents preferably include cylinder information, track information, and start and end addresses.
In yet another aspect, the primary control unit includes means for specifying a seed value with the metadata to maintain data integrity during subsequent decompression of the selected data.
The invention is next described further in connection with preferred embodiments, and it will become apparent that various additions, subtractions, and modifications can be made by those skilled in the art without departing from the scope of the invention.