The present invention relates to computer disk storage. More specifically, the invention relates to the creation and maintenance of logical volumes used in system crash recovery and the like.
A "snapshot" is essentially a logical copy of the information stored on a volume at a particular instant in time. A snapshot may be used as a backup copy of the volume, but is much faster to create than a full backup. For the purpose of this discussion, a "base volume" is the actual volume of which the snapshot was taken. A snapshot system uses a differential file to track the changes written to the base volume after the snapshot is captured. When a change is written to an allocation unit (e.g., a cluster) on the base volume, the old data is copied from the allocation unit to the differential file before writing the new data. This method is often referred to as "copy-on-write." In this way, the state of the base volume at the time the snapshot was taken is accessible by reading data from the base volume in conjunction with any data stored in the differential file associated with the snapshot.
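The copy-on-write behavior described above can be illustrated with a minimal sketch. It is not taken from the specification: the class name `Snapshot`, the use of a Python list to stand in for the base volume, and the use of a dictionary to stand in for the differential file are all assumptions made for illustration.

```python
class Snapshot:
    """Hypothetical single-snapshot model; a dict stands in for the differential file."""

    def __init__(self, base):
        self.base = base   # list of allocation units, shared with the live volume
        self.diff = {}     # unit index -> data as it was when the snapshot was taken

    def write(self, index, new_data):
        # Copy-on-write: save the old data once, before the first overwrite.
        if index not in self.diff:
            self.diff[index] = self.base[index]
        self.base[index] = new_data

    def read(self, index):
        # Prefer the saved copy; an unsaved unit is unchanged on the base volume.
        if index in self.diff:
            return self.diff[index]
        return self.base[index]


base = ["a", "b", "c"]
snap = Snapshot(base)
snap.write(1, "B")
snap.write(1, "BB")   # second write to the same unit copies nothing further
snapshot_view = [snap.read(i) for i in range(len(base))]
```

In this sketch only the first write to a unit copies data, so the differential file holds at most one entry per allocation unit regardless of how often that unit is later rewritten.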
Snapshots may be taken of the base volume at different times. Existing snapshot mechanisms accomplish that by maintaining each differential file separately. Any change to the base volume results in each of the multiple differential files being updated as necessary to reflect the change. As a result, each differential file is provided with sufficient unused space to accept data representative of later writes to the base volume. However, separately maintaining differential files has a number of drawbacks, including reserving large amounts of space for each differential file, the need to check each differential file on each potential copy-on-write operation, the duplication of data across multiple differential files, and the performance degradation associated with actively maintaining multiple differential files.
An efficient mechanism for maintaining multiple snapshots of the same volume taken at different times has eluded those skilled in the art.
The present invention provides a mechanism for more efficiently maintaining multiple temporal snapshots of a common base volume. When the base volume is modified, such as when existing data is overwritten with new data, that modification may affect two or more of the snapshots. In accordance with the present invention, before the modification, the existing data is copied only to the differential file associated with the latest snapshot, whereby only that differential file need be actively maintained. By reading data from the snapshots through a method enabled through the present invention, described below, the old data need not be replicated in multiple differential files, which avoids various problems of the prior art.
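The write path summarized above, in which existing data is copied only to the differential file of the latest snapshot, might be sketched as follows. The names `Volume`, `take_snapshot`, and `write`, and the list-of-dictionaries layout, are hypothetical and chosen only for illustration.

```python
class Volume:
    """Hypothetical base volume with one differential file (a dict) per snapshot."""

    def __init__(self, units):
        self.units = list(units)   # current contents of the base volume
        self.diffs = []            # differential files, oldest snapshot first

    def take_snapshot(self):
        # A new, empty differential file becomes the only active one.
        self.diffs.append({})
        return len(self.diffs) - 1

    def write(self, index, new_data):
        if self.diffs:
            latest = self.diffs[-1]
            # Earlier differential files are never touched by this write.
            if index not in latest:
                latest[index] = self.units[index]
        self.units[index] = new_data


vol = Volume(["a", "b"])
first = vol.take_snapshot()
vol.write(0, "a1")
second = vol.take_snapshot()
vol.write(0, "a2")
vol.write(1, "b2")
```

After these writes, the first differential file in the sketch holds only the data displaced while it was the latest, and the data displaced afterwards appears only in the second differential file rather than in both.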
When a region of a selected snapshot is read, the mechanism of the present invention reads the region from the selected snapshot. The selected snapshot may be any snapshot in a set of snapshots of the base volume taken at different times. The region may be any portion of the selected snapshot, such as a single cluster, a file or files, each comprising a set of clusters (which may be physically discontiguous), or it may be the entire selected snapshot volume. "Reading the snapshot" essentially occurs by first determining whether data associated with the region is stored in the selected snapshot's associated differential file. For instance, if new data was to be written to the base volume over existing data while a given snapshot was the most recent snapshot, the existing data is first stored in the differential file of that snapshot before the new data is written. If the existing data is in the differential file, that data is returned to the reading process. Data stored in the differential file may correspond to none, all, or only part (e.g., an allocation unit) of a given region. For instance, if less than the entire region (e.g., one of eight allocation units) was overwritten on the base volume, then only the overwritten part of the region may reside in the differential file. If later snapshots have been taken, data associated with other parts of the region may be stored in one or more of the later differential files.
If the differential file of the selected snapshot does not have data for each portion of the requested region, the mechanism continues by accessing each differential file associated with subsequent snapshots in temporal order, from the earliest following the selected snapshot to the latest, until either the region is complete or no later snapshots remain. During each read, the mechanism looks for data that was written to that differential file because of a change to the region on the base volume. The earliest of this data, if any, is kept as corresponding to the selected snapshot. In other words, if the differential file of the selected snapshot contains data for less than the entire region (including none of the region), then the differential file of the next later snapshot is examined for data for the rest of the region (that part which has not already been found). If any part of the region is still missing, the next differential file is accessed to look for the missing part, and so on, until either the region is complete or no more snapshots remain. Finally, if any part of the region was not filled in with data from one of the differential files, then that part of the region is read from the base volume. The data accumulated from the various differential files or the base volume is returned to the process that requested the read of the selected snapshot. It will be appreciated that once a later snapshot is captured, the invention allows the differential file for any prior snapshot to be fixed, i.e., no longer updated, whereby unused space previously allocated to an earlier differential file may be reclaimed. Moreover, the deletion of the oldest snapshot merely involves deleting its differential file, which is very fast and efficient.
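The read procedure described above can be sketched as a single function. This is an illustrative sketch under stated assumptions, not the claimed implementation: the function name `read_snapshot`, the representation of each differential file as a dictionary keyed by allocation-unit index, and the ordering of the list `diffs` from oldest to newest snapshot are all hypothetical.

```python
def read_snapshot(base, diffs, selected, region):
    """Return {unit index: data} for `region` as seen by snapshot `selected`.

    base     -- list holding the current contents of the base volume
    diffs    -- differential files (dicts), oldest snapshot first
    selected -- position of the selected snapshot within `diffs`
    region   -- iterable of allocation-unit indices to read
    """
    result = {}
    missing = set(region)
    # Start with the selected snapshot's differential file, then move to
    # later ones in temporal order; the earliest copy found for a unit wins.
    for diff in diffs[selected:]:
        if not missing:
            break
        found = {index for index in missing if index in diff}
        for index in found:
            result[index] = diff[index]
        missing -= found
    # Units never saved in any examined differential file are unchanged
    # on the base volume since the selected snapshot was taken.
    for index in missing:
        result[index] = base[index]
    return result


# State after: snapshot 0, write unit 0, snapshot 1, write units 0 and 1.
base = ["a2", "b2", "c"]
diffs = [{0: "a"}, {0: "a1", 1: "b"}]
view_of_first = read_snapshot(base, diffs, 0, range(3))
view_of_second = read_snapshot(base, diffs, 1, range(3))
```

In this example the first snapshot's view of unit 1 comes from the second differential file, and both views take unit 2 from the base volume, since that unit was never overwritten.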