The present invention relates to data storage systems, and more particularly, this invention relates to setting an optimal space allocation policy for creating dependent snapshots to enhance application WRITE performance and reduce resource usage.
Block virtualization solutions, such as host-based volume managers (e.g., logical volume manager (LVM)) and storage area network (SAN) virtualizers (e.g., IBM SAN Volume Controller), provide volume snapshot capability. Copy on Write (COW) snapshots involve creation of dependent virtual disks (snapshots). COW snapshots are dependent on the original volume for all or part of their data storage. Initially, both the original volume and the snapshot volume point to the same data on the underlying storage. New physical space is allocated for the snapshot volume only when an application modifies data on the original volume and the old data must be copied from the original volume to the snapshot volume in order to preserve it. Typically, block virtualization solutions use the COW technique to copy original data from a parent volume to a dependent volume while processing application WRITE operations on the original volume. The COW operation typically has the following steps: 1) hold the application WRITE data in a memory buffer; 2) READ the old data from the original volume into RAM; 3) WRITE the old data from RAM to the snapshot volume (after new physical storage space has been allocated for the snapshot volume to hold the old data); and 4) allow the WRITE data (held in step 1) to be written to the original volume.
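The four COW steps described above can be illustrated with a minimal in-memory sketch. It is not taken from any particular volume manager: the names `Volume`, `CowSnapshot`, and `cow_write` are hypothetical, blocks are modeled as dictionary entries, and the READ/WRITE counters exist only to make the internal I/O visible.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical block device: block number -> data, with I/O counters."""
    blocks: dict
    reads: int = 0
    writes: int = 0

    def read(self, n):
        self.reads += 1
        return self.blocks[n]

    def write(self, n, data):
        self.writes += 1
        self.blocks[n] = data


class CowSnapshot:
    """Dependent snapshot: owns only the blocks that were copied out."""

    def __init__(self, origin):
        self.origin = origin
        self.store = Volume({})  # physical space is allocated lazily

    def read(self, n):
        # Preserved blocks come from the snapshot's own store;
        # unmodified blocks are still shared with the original volume.
        if n in self.store.blocks:
            return self.store.read(n)
        return self.origin.read(n)


def cow_write(origin, snap, n, new_data):
    """Process one application WRITE to block n of the original volume."""
    pending = bytes(new_data)        # step 1: hold the WRITE data in memory
    if n not in snap.store.blocks:   # copy only on the first overwrite
        old = origin.read(n)         # step 2: READ old data into RAM
        snap.store.write(n, old)     # step 3: allocate space, WRITE old data
    origin.write(n, pending)         # step 4: release the held WRITE


if __name__ == "__main__":
    origin = Volume({0: b"AAAA", 1: b"BBBB"})
    snap = CowSnapshot(origin)
    cow_write(origin, snap, 0, b"XXXX")
    # One application WRITE caused one extra READ and one extra WRITE.
    print(origin.reads, snap.store.writes, origin.writes)
    print(snap.read(0), snap.read(1))
```

In this sketch, only the first overwrite of a block triggers the copy; later WRITEs to the same block go straight to the original volume because the snapshot already holds the preserved data. The application WRITE in step 4 cannot proceed until steps 2 and 3 have finished.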
As the above-described process shows, the COW operation is resource intensive because it requires additional memory and SAN usage: internal READ and WRITE operations are generated in addition to the application WRITE. Additionally, a COW operation increases WRITE latency because it is performed synchronously in the application I/O context, i.e., the original application WRITE I/O is held until the COW operation has been completed. These two problems seriously hinder the use of COW snapshots on volumes that receive application WRITE operations, and they stand in the way of making the COW operation less resource intensive for the virtualization software being used (e.g., LVM, IBM SAN Volume Controller, etc.).
Most existing solutions perform the COW operation according to the steps described above and therefore suffer from WRITE latency issues when using COW snapshots. A minority of virtualization solutions advocate the use of the XCOPY SCSI operation to make the COW operation less resource intensive for the virtualization software. However, XCOPY is not a mandatory SCSI command, so it is often not supported, and even where it is supported, usually only a subset of the full XCOPY command is implemented. Also, existing solutions make no explicit attempt to locate the snapshot and original volumes on the same disk array. Consequently, even if the XCOPY command is used, data is copied across disk arrays, which makes it a relatively more time-consuming operation.