Computer systems generally include one or more processors interfaced to a temporary data storage device such as a memory device and one or more persistent data storage devices such as disk drives. Data objects are stored on one or more of these disk drives. Groups of data objects will typically represent a table and the table will have associated with it metadata including a table header and the individual addresses of the data objects that belong to that table. In a distributed database the data objects of a table will be stored on different disk drives.
It is often necessary to create a copy of a table or group of data objects. This copy of data objects would be kept separate from the original data and could be modified separately from the original data.
In a traditional system, when a copy of the data objects is required, an entire copy of the table metadata and of the individual data objects is made. It is generally a requirement that the original data is not modified during this copy operation. The original data must therefore be placed in a consistent state and held there until the entire copy has been made, after which the original data and the copy can be taken out of the consistent state. Methods for ensuring a consistent state include locking the data objects, quiescing applications or taking applications offline. It will be appreciated that write operations to the data are unavailable during a copy operation. Because the traditional system copies all of the data, it has the longest period of data unavailability, or write latency.
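By way of illustration only, the traditional copy might be sketched as follows. The sketch assumes an in-memory dictionary standing in for the disk drives and a single lock standing in for the consistent state; the names Table, write and full_copy are hypothetical and are not taken from any particular system.

```python
import threading
from typing import Dict


class Table:
    """Hypothetical table: a header plus its data objects, keyed by address."""

    def __init__(self, header: str, objects: Dict[int, bytearray]) -> None:
        self.header = header
        self.objects = objects
        self.lock = threading.Lock()  # assumed mechanism for the consistent state


def write(table: Table, address: int, data: bytes) -> None:
    """Writers take the same lock, so they wait while a copy is in progress."""
    with table.lock:
        table.objects[address][:] = data


def full_copy(table: Table, header: str) -> Table:
    """Copy every data object while the original is held in a consistent state."""
    with table.lock:
        copied = {address: bytearray(data) for address, data in table.objects.items()}
    return Table(header, copied)


original = Table("orders", {0: bytearray(b"row-0"), 1: bytearray(b"row-1")})
duplicate = full_copy(original, "orders_copy")
write(duplicate, 0, b"changed")  # the copy is modified separately from the original
```

The lock is held for the duration of the loop over every data object, which is why the period of write unavailability grows with the size of the table.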
One solution to this problem is known as snap imaging of a data object. The original data is placed in a consistent state and a copy is made of only the metadata associated with the original data objects; no physical copies of the data objects are created at the time of snap imaging. Once the data is unlocked or taken out of the consistent state, normal read operations and write operations involving the original data or the snap image may occur.
A read operation of the original data does not require any special handling. A read operation of the snap image data is logically directed at the snap image but physically accesses the original data. It will be appreciated that this applies only to snap image data that has not been the subject of a write operation since the snap imaging operation.
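The snap imaging operation and the subsequent read path might be sketched as follows. This is a minimal illustration under stated assumptions: the Storage class is a hypothetical stand-in for the disk drives, and the names Table, snap_image and read are introduced here for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Table:
    """Table metadata: a header plus the addresses of its data objects."""
    header: str
    addresses: List[int] = field(default_factory=list)


class Storage:
    """Hypothetical stand-in for the disk drives: address -> data object."""

    def __init__(self) -> None:
        self.objects: Dict[int, bytes] = {}
        self._next_address = 0

    def allocate(self, data: bytes) -> int:
        address = self._next_address
        self._next_address += 1
        self.objects[address] = data
        return address


def snap_image(table: Table, header: str) -> Table:
    """Copy only the metadata: the address list is duplicated, the objects are not."""
    return Table(header=header, addresses=list(table.addresses))


def read(storage: Storage, table: Table, index: int) -> bytes:
    """A logical read of either table resolves through that table's own metadata."""
    return storage.objects[table.addresses[index]]


storage = Storage()
original = Table("orders", [storage.allocate(b"row-0"), storage.allocate(b"row-1")])
image = snap_image(original, "orders_snap")
```

Immediately after the snap operation both tables hold the same addresses, so a read of the image physically accesses the original data objects and the storage still contains only two objects.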
Write operations to an original data object following a snap image operation are delayed while the system performs a “copy on write” operation that physically creates a copy of the data object. Only after a copy of the data object has been made does the write operation to the original object proceed. Further logical read and write operations involving the data object will go to the modified copy of the data object.
In the same way, logical write operations involving the snap image of the object are delayed while the system performs a “copy on write” operation that physically creates a copy of the data object. The write operation to the imaged object will then proceed. Further logical read and write operations to the snap image involving the newly copied data object will go to this modified copy of the data object.
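The copy on write behaviour described above might be sketched as follows. The sketch rests on several assumptions that are not stated in the text: each table is represented only by its list of addresses, a reference count records how many tables point at each physical object, and the table performing the write is the one that receives the new physical copy. The CowStore class and its methods are hypothetical.

```python
from typing import Dict, List


class CowStore:
    """Hypothetical store in which an original and its snap image share objects."""

    def __init__(self) -> None:
        self.objects: Dict[int, bytes] = {}   # address -> data object
        self.refcount: Dict[int, int] = {}    # address -> tables pointing at it
        self.copies_made = 0                  # number of copy on write operations
        self._next_address = 0

    def allocate(self, data: bytes) -> int:
        address = self._next_address
        self._next_address += 1
        self.objects[address] = data
        self.refcount[address] = 1
        return address

    def snap(self, addresses: List[int]) -> List[int]:
        """Snap image: duplicate the address list and mark each object as shared."""
        for address in addresses:
            self.refcount[address] += 1
        return list(addresses)

    def write(self, addresses: List[int], index: int, data: bytes) -> None:
        """Write through a table's metadata, copying first if the object is shared."""
        address = addresses[index]
        if self.refcount[address] > 1:
            # First write since the snap: create the physical copy before writing.
            self.refcount[address] -= 1
            address = self.allocate(self.objects[address])
            addresses[index] = address  # later reads and writes go to the copy
            self.copies_made += 1
        self.objects[address] = data


store = CowStore()
original = [store.allocate(b"a"), store.allocate(b"b")]
image = store.snap(original)

store.write(original, 0, b"A")   # first write to the original: copy, then write
store.write(original, 0, b"AA")  # later write to the same object: no copy
store.write(image, 1, b"B")      # first write to the snap image: copy, then write
```

Only the first write to each shared object pays for a copy; once the object is private to one table, subsequent writes proceed directly.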
The snap image technique has one benefit over making a traditional copy: the physical creation of copies of data objects is deferred until the data objects need to be written to for the first time. This means that there is no initial delay while an entire copy of the data objects is made. However, snap imaging still has a problem. An application will suffer increased response time whenever a portion of the original data or the image data is written for the first time since the snap operation, because the system must first perform a copy on write operation to instantiate the physical copy of that portion of the image.