The importance of data has increased during the last decade while the cost of data storage media has decreased, thus motivating data storage vendors to provide data protection schemes that are based upon duplication of data.
Point in Time (PiT) copies are used mainly to set aside consistent sets of data for recovery. Many customers require having multiple copies of a single production data entity, usually taken periodically, e.g., every hour.
Typically, different PiT copies are referred to as different generation copies. One of the most prevalent uses of PiT copies is for volumes in storage controllers. To simplify the description, volumes will be used to represent data entities, and tracks will be used to represent predefined parts of data entities.
PiT volumes are readable and writeable. This raises the issue of maintaining the right data for every volume. For example, a write operation on the source volume at time T should be reflected in all the PiT targets that were created after T, but not in those that were created before T. A write operation on any target volume should not be reflected in any other target.
There are two extremes for maintaining the data to ensure correctness after write operations. One extreme is to copy the data, or references to it, to all the targets that have to reflect it. For example, before destaging a track in cache that was modified at a certain point in time T, the version of the track on the disk is copied to all the PiT volumes that were created before T and that do not have their data locally. This policy may become very expensive during write operations, but results in very fast read operations.
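By way of a non-limiting illustration, the first extreme may be sketched as follows. The sketch is not taken from any particular storage controller; the class name EagerCopyVolume, its method names, and the use of in-memory dictionaries to stand in for volumes and tracks are all assumptions made for illustration only.

```python
class EagerCopyVolume:
    """Hypothetical model of the copy-to-all-targets policy."""

    def __init__(self, tracks):
        self.source = dict(tracks)   # track id -> data on the source volume
        self.targets = []            # one dict of locally held tracks per PiT target
        self.copies = 0              # number of track copies made by source writes

    def snapshot(self):
        """Create a new PiT target that initially holds no data locally."""
        self.targets.append({})
        return len(self.targets) - 1

    def write_source(self, track, data):
        """Preserve the old version on every target that lacks it, then write."""
        old = self.source.get(track)
        for local in self.targets:
            if track not in local:
                local[track] = old
                self.copies += 1
        self.source[track] = data

    def write_target(self, idx, track, data):
        """A target write stays local and is not reflected in any other volume."""
        self.targets[idx][track] = data

    def read_target(self, idx, track):
        """At most one lookup beyond the target itself."""
        local = self.targets[idx]
        return local[track] if track in local else self.source.get(track)
```

In this sketch a single source write may trigger one copy per existing target, whereas a target read never inspects more than the target itself and the source.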
The other extreme would be to store each track only once, and to have a cascade of references from each PiT target to the target created right after it. Under such a policy, write operations would result in at most one track destage, but read operations would require traversing a long chain of references, resulting in a performance penalty.
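By way of a non-limiting illustration, the cascade extreme may be sketched as follows. The class name CascadeVolume, its method names, and the hop counter are assumptions made for illustration only. The sketch covers writes to the source volume only; writes to target volumes would require additional handling that is omitted here.

```python
class CascadeVolume:
    """Hypothetical model of the store-once, cascaded-reference policy."""

    def __init__(self, tracks):
        self.source = dict(tracks)   # track id -> data on the source volume
        self.targets = []            # oldest target first, newest target last
        self.copies = 0              # number of track copies made by source writes

    def snapshot(self):
        """Create a new PiT target; the previous newest target now refers to it."""
        self.targets.append({})
        return len(self.targets) - 1

    def write_source(self, track, data):
        """Preserve the old version on the newest target only, then write."""
        if self.targets and track not in self.targets[-1]:
            self.targets[-1][track] = self.source.get(track)
            self.copies += 1
        self.source[track] = data

    def read_target(self, idx, track):
        """Walk from the target toward newer targets, ending at the source.

        Returns the data together with the number of references followed.
        """
        hops = 0
        for local in self.targets[idx:]:
            if track in local:
                return local[track], hops
            hops += 1
        return self.source.get(track), hops
```

In this sketch a source write copies at most one track regardless of the number of targets, whereas a read from an old target may follow as many references as there are newer targets.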
Taking into account different workloads, it is impossible to say that one policy is better than the other. In an environment with very few read operations on the PiT targets, one would probably prefer fast writes even at the price of slow reads. If there are many reads, slower writes are acceptable.
There is a need to provide an efficient method, system, and computer program product for maintaining a group of copies of a data entity.