1. Technical Field
The present invention relates in general to data processing systems and more particularly to better utilization of memory resources and of segment metadata nodes in data snapshots in such systems.
2. Description of the Related Art
A snapshot of data in a data processing system at a time "t" creates, in a target data volume, a logical copy of data in a source data volume. Physical copying of the data from the source volume to the target volume can then subsequently take place, with any intervening changes ("writes") to data in the source volume being momentarily delayed. During this momentary delay, the original version of the data sought to be changed is preferentially copied from the source volume to the target volume, prior to writing the change. Thus, the snapshot of data in the target volume represents the exact state of the data in the source volume at the time "t."
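The copy-before-write behavior described above can be illustrated with a minimal sketch. It is a toy in-memory model, not the mechanism of any particular storage subsystem; the class name `Snapshot` and its methods `write_source` and `read_snapshot` are hypothetical names chosen for illustration, and volumes are assumed to be simple lists of blocks.

```python
class Snapshot:
    """Toy copy-on-write snapshot of a block volume (hypothetical model)."""

    def __init__(self, source):
        self.source = source   # live source volume, modeled as a list of blocks
        self.target = {}       # blocks physically copied to the target so far

    def write_source(self, index, data):
        # The write is delayed just long enough to preserve the time-t
        # version of the block in the target, then it is applied.
        if index not in self.target:
            self.target[index] = self.source[index]
        self.source[index] = data

    def read_snapshot(self, index):
        # A copied block is read from the target; an uncopied block is
        # still unchanged in the source and is read from there.
        if index in self.target:
            return self.target[index]
        return self.source[index]


source = ["a", "b", "c"]
snap = Snapshot(source)          # snapshot taken at time t
snap.write_source(1, "B")        # intervening write to the source
state = [snap.read_snapshot(i) for i in range(3)]
```

In this sketch, `state` still reflects the source as it was at time t, while `source` holds the new data; only the overwritten block has been physically copied.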
Snapshots as defined above are useful for backing up data and for testing. For example, taking a snapshot of frequently changing data facilitates the execution of test applications against the snapshot of the data, without changes to the data unduly interfering with the test application execution. Moreover, the snapshot mechanism facilitates faster data backups by a storage subsystem as compared to file system-based backups, which entail host CPU processing and which require the allocation of relatively high network bandwidth.
Existing snapshot systems are, however, unduly restrictive. Most, for instance, permit write access only to the source volume in order to coordinate data in the system. Further, the limitations of existing snapshot systems prohibit the undertaking of concurrent snapshots or of distributed snapshots, and they do not support cyclical and transitive snapshot operations, all of which can be very useful for test purposes. Because existing systems fail to account for these considerations, the costs of reads and writes are not optimized when multiple storage volumes are involved in multiple concurrent snapshot operations.
A typical data snapshot management system needs to record persistently (as long as the snapshot relationship between source and target data volumes is active) the metadata segments that carry information about where to get the t0 data from. In practical systems where this is implemented, the metadata segments consume large amounts of a valuable resource, either non-volatile random access memory (NVRAM) space or storage on drives. This imposes a limitation on how many such metadata segments can be maintained during backup creation. Owing to this limitation, a snapshot system cannot handle a specific pattern of writes that consumes a large or unlimited number of metadata segments.
These problems may be encountered, for example, in systems where the source and target volumes are made available through most of the backup operation. Such systems are described in a co-pending, commonly owned U.S. patent application: "System and Method for Concurrent Distributed Snapshot Management", Ser. No. 09/376,832, filed Aug. 18, 1999 (filed as IBM Case No. AM9-99-052).
In data processing systems, certain system interfaces permit techniques which allow the formation in memory of what are known as sparse files. Such files are created with lengths greater than the data they actually contain, leaving empty spaces for future addition of data. Data is written in relatively small portions into a number of memory locations which are not contiguous. Certain portions of the computer memory in the area of these memory locations, however, never have data written in them, although other memory files receive data. Data written into sparse files is known as sparse data. Snapshot systems have had difficulty when sparse data is present, in that they rapidly consume large numbers of metadata segments and memory resources.
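The effect of sparse writes on metadata consumption can be illustrated with a short sketch. It rests on a simplifying assumption that is not taken from the text above: each metadata segment is assumed to describe one contiguous run of written blocks, so the number of segments equals the number of runs. The function name `count_segments` is hypothetical.

```python
def count_segments(written_blocks):
    """Count contiguous runs of block numbers.

    Assumption for illustration only: one metadata segment is needed
    per contiguous run of written blocks.
    """
    runs = 0
    previous = None
    for block in sorted(set(written_blocks)):
        # A new run starts whenever the block does not directly
        # follow the previously seen block.
        if previous is None or block != previous + 1:
            runs += 1
        previous = block
    return runs


dense = count_segments(range(0, 8))          # eight adjacent blocks
sparse = count_segments(range(0, 800, 100))  # eight scattered blocks
```

Under this assumption, the same number of written blocks needs a single segment when the writes are adjacent and one segment per block when they are scattered, which is the pattern that exhausts metadata resources.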
It would be desirable to keep continuing records about metadata segments available in data processing systems without consuming memory resources of the data processing system.
It is an object of the present invention to provide a data processing system and method of maintaining usage data about data snapshots of data write operations to storage media of the data processing system and of keeping a record of data overwrites without unduly consuming memory resources of the data processing system.
It is a further object of the present invention to provide a computer program product enabling a data processing system to maintain usage data about data snapshots of data write operations to storage media of the data processing system and to keep a record of data overwrites without unduly consuming memory resources of the data processing system.
It is still a further object of the present invention to provide a memory product stored in a memory of a data processing system to better utilize memory resources of the data processing system, particularly those relating to usage metadata nodes.
The above as well as additional objectives, features, and advantages of the present invention will become apparent in the following detailed written description.