Copying a file is a fairly common operation on a single server equipped with its own data storage. As the size of a file increases, so too does the time to copy that file. Copying a file involves allocating enough storage on some disk storage device to accommodate all the data in the file being copied and then copying the data itself to the allocated storage on disk. Since allocating all the storage up front for a very large file takes a fair amount of time, many file systems allocate storage on demand as the data is being written to the storage device. The time to copy also increases when the data to be copied is transferred over a network to a different storage device because the transfer time over a network needs to be taken into account. Finally, the task of copying very large files imposes demands on the server's hardware resources such as CPU and memory.
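The copy operation described above can be illustrated with a minimal sketch. It is not taken from any particular file system or product: the function names `copy_file_serial` and `make_demo_copy`, the file names, and the 1 MiB buffer size are hypothetical choices made for illustration. The sketch relies on the file system to allocate destination storage on demand as each buffer is written, and it moves the data one buffer at a time, so the work grows with the size of the file.

```python
import os
import tempfile


def copy_file_serial(src_path, dst_path, buf_size=1024 * 1024):
    """Copy src_path to dst_path one buffer at a time.

    Destination storage is not reserved up front; the file system is
    assumed to allocate it on demand as each write arrives.
    Returns the number of bytes copied.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(buf_size)  # one copy buffer held in memory
            if not chunk:               # empty read means end of file
                break
            dst.write(chunk)            # storage grows as data is written
            copied += len(chunk)
    return copied


def make_demo_copy(size):
    """Create a scratch file of `size` random bytes, copy it, and verify it.

    Hypothetical helper used only to exercise copy_file_serial.
    Returns (bytes_copied, contents_match).
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.img")
        dst = os.path.join(tmp, "dst.img")
        data = os.urandom(size)
        with open(src, "wb") as f:
            f.write(data)
        copied = copy_file_serial(src, dst)
        with open(dst, "rb") as f:
            match = f.read() == data
    return copied, match
```

Every byte passes through the same loop on the same machine, which is why the copy buffer memory, the CPU time, and the elapsed time all scale with the size of the file being copied.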
In a virtualized environment, where a number of virtual machines, each with its own guest operating system, may execute concurrently on a single server, the server's hardware resources such as CPU and memory are apportioned amongst the virtual machines. Copying files taxes these resources even more, because copying a typical virtual machine disk image can take hundreds, if not thousands, of seconds. The task of copying such a disk image file places a significant additional burden on a single server's hardware resources, including CPU cycles, memory for copy buffers, host bus adapter queue slots, and network bandwidth.
Even in a cluster of virtual machines running on multiple server systems that share a common file system, the process of copying a file from a source storage device to a destination storage device is serialized. For very large files, this serialized procedure is very inefficient.