Copying, moving, and initializing large quantities of data, e.g., 10 or more gigabytes, stored on enterprise storage systems are common operations. These operations tend to require a long time to complete and impose significant overhead on the enterprise storage systems configured to support databases, email servers, and backup processes. The overhead involved in data copying includes multiple context switches, double copying of data between the kernel and the application program, cache pollution, and scheduling of synchronous operations. Consequently, performing the data transfer for a large copy, move, or initialization operation prevents system resources from being used by other, more critical tasks. This may limit the performance and scalability of the enterprise storage systems.
Enterprise storage systems employ disk arrays that are physically independent enclosures containing a disk array controller, a disk cache and multiple physical disk drives. The disk array controller manages the physical disk drives and exposes them to connected computer systems as logical data storage units, each identified by a logical unit number (LUN), and enables storage operations such as cloning, snapshotting, mirroring and replication to be carried out on the data storage units using storage hardware.
Computer systems that employ disk arrays are typically configured with a file system that executes a logical volume manager. The logical volume manager is a software or firmware component that organizes a plurality of data storage units into a logical volume. The logical volume is available in the form of a logical device with a contiguous address space on which individual files of a file system are laid out. The logical volume manager and the organization of files on this logical volume are controlled by the file system. As a result, disk arrays do not know how individual files are laid out on the data storage units. Therefore, a disk array cannot invoke its hardware to carry out storage operations such as cloning, snapshotting, mirroring and replication on a per-file basis.
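The relationship between a file's location in the logical volume and the underlying data storage units can be illustrated with a minimal sketch. It assumes a logical volume formed by simple concatenation of data storage units (actual logical volume managers may stripe or otherwise interleave them), and the names `DataStorageUnit`, `LogicalVolume`, and `resolve` are hypothetical, introduced only for illustration. The sketch shows that the translation from a volume block range to per-LUN ranges is held on the host side, which is why the disk array alone cannot tell which blocks belong to which file.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DataStorageUnit:
    lun: int   # logical unit number exposed by the disk array
    size: int  # capacity in blocks


class LogicalVolume:
    """Hypothetical volume that concatenates data storage units
    into one contiguous block address space."""

    def __init__(self, units: List[DataStorageUnit]):
        self.units = list(units)
        self.size = sum(u.size for u in self.units)

    def resolve(self, start: int, length: int) -> List[Tuple[int, int, int]]:
        """Map a volume block range to (lun, lun_offset, length) segments."""
        if start < 0 or length < 0 or start + length > self.size:
            raise ValueError("range falls outside the logical volume")
        end = start + length
        segments = []
        base = 0  # volume address at which the current unit begins
        for unit in self.units:
            lo = max(start, base)
            hi = min(end, base + unit.size)
            if lo < hi:  # the requested range overlaps this unit
                segments.append((unit.lun, lo - base, hi - lo))
            base += unit.size
        return segments


# A file extent that straddles two data storage units.
volume = LogicalVolume([DataStorageUnit(lun=0, size=100),
                        DataStorageUnit(lun=1, size=100)])
print(volume.resolve(start=90, length=20))
```

In this sketch only the host-side component holds the mapping; the disk array sees independent block requests against LUN 0 and LUN 1 with no indication that they belong to the same file.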
One possible solution for carrying out storage operations in a disk array on a per-file basis is to add storage metadata to data structures managed by the disk array. Disk arrays, however, are provided by a number of different vendors and storage metadata varies by vendor. This solution is not attractive because the file system would then need to be customized for each different vendor. For this reason, copying (cloning), moving and initialization of files have typically been carried out using software techniques through standard file system calls.
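A software-based copy and initialization of the kind described above can be sketched as follows. This is an illustrative example only, assuming a POSIX-like host; the function names `copy_file` and `zero_fill` and the 1 MiB chunk size are hypothetical choices, not taken from any particular system. Every chunk is read from the kernel into an application buffer and then written back to the kernel, which is the double copying and repeated system-call overhead noted earlier.

```python
import os

CHUNK_SIZE = 1 << 20  # 1 MiB per request; an arbitrary illustrative value
O_BINARY = getattr(os, "O_BINARY", 0)  # only defined on some platforms


def _write_all(fd: int, data: bytes) -> None:
    """Write the whole buffer, retrying after partial writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)  # application buffer -> kernel
        view = view[written:]


def copy_file(src_path: str, dst_path: str, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy src to dst with read/write calls; returns the bytes copied."""
    copied = 0
    src = os.open(src_path, os.O_RDONLY | O_BINARY)
    try:
        dst = os.open(dst_path,
                      os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY, 0o644)
        try:
            while True:
                buf = os.read(src, chunk_size)  # kernel -> application buffer
                if not buf:
                    break  # end of file
                _write_all(dst, buf)
                copied += len(buf)
        finally:
            os.close(dst)
    finally:
        os.close(src)
    return copied


def zero_fill(path: str, size: int, chunk_size: int = CHUNK_SIZE) -> None:
    """Initialize a file by explicitly writing `size` zero bytes."""
    zeros = bytes(chunk_size)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY, 0o644)
    try:
        remaining = size
        while remaining > 0:
            n = min(remaining, chunk_size)
            _write_all(fd, zeros[:n])
            remaining -= n
    finally:
        os.close(fd)
```

For a file of 10 gigabytes, the loop in `copy_file` issues on the order of ten thousand read calls and as many write calls at this chunk size, and all of the data passes through host memory even when the source and destination reside on the same disk array.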