Various forms of network-based storage systems exist today, including network attached storage (NAS) and storage area networks (SANs). Network-based storage systems are commonly used for a variety of purposes, such as providing multiple users with access to shared data and backing up critical data (e.g., by data mirroring).
A network-based storage system typically includes at least one storage server, which is a processing system configured to store and retrieve data on behalf of one or more client processing systems (clients). In a NAS context, a storage server provides clients with file-level access to shared files. The files may be stored in a storage system that includes one or more arrays of mass storage devices, such as magnetic or optical disks or tapes, by using a data storage scheme such as Redundant Array of Inexpensive Disks (RAID). In a SAN context, a storage server provides clients with block-level access to stored data, rather than file-level access. Some storage servers, such as certain storage servers made by NetApp, Inc. (NetApp®) of Sunnyvale, Calif., are capable of providing clients with both file-level access and block-level access.
Client devices may implement a hypervisor software layer. A hypervisor software layer, also referred to as a virtual machine monitor, allows the client processing system to run multiple virtual machines. A virtual machine is a software implementation of a machine (i.e., a computer) that executes instructions like a physical machine. The virtual machines on a client may run different operating systems, different instances of the same operating system, or other software implementations that appear as “different machines” within a single computer. Additionally, data managed by a storage server on behalf of multiple client devices, multiple client virtual machines within one or more client machines, or multiple portions of a single client virtual machine may be stored within a single storage container or volume (e.g., a LUN (Logical Unit Number), a large NFS (Network File System) file, or another equivalent logical division of storage).
A storage system often uses a fixed block size for all internal operations. For example, WAFL (Write Anywhere File Layout) uses 4 KB (4096-byte) blocks for all operations, as do client-side file systems such as NTFS (New Technology File System) and ext4fs (fourth extended filesystem). Since file systems usually start individual files on block boundaries, application writers take advantage of a file system's block size and alignment to increase the performance of their input/output (“I/O”) operations, for example by always performing I/O operations whose size is a multiple of 4 KB and always aligning these operations to the beginning of a file. Other file systems or applications, however, may use block boundaries of a different size (e.g., 512 bytes).
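The alignment condition described above can be illustrated with a short sketch. It assumes the 4 KB block size mentioned in the text; the function names `is_aligned` and `blocks_touched` are hypothetical and chosen only for this illustration, not taken from any particular file system.

```python
BLOCK_SIZE = 4096  # assumed server-side block size in bytes (the 4 KB figure above)


def is_aligned(offset: int, length: int, block_size: int = BLOCK_SIZE) -> bool:
    """Return True if the I/O starts on a block boundary and covers whole blocks."""
    return offset % block_size == 0 and length % block_size == 0


def blocks_touched(offset: int, length: int, block_size: int = BLOCK_SIZE) -> range:
    """Return the indices of the blocks overlapped by bytes [offset, offset + length)."""
    if length <= 0:
        return range(0)
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return range(first, last + 1)
```

For instance, a 4096-byte operation at offset 8192 is aligned and touches only block 2, whereas the same operation at offset 512 is unaligned and touches blocks 0 and 1.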
When the block boundaries of a client-side file system and a server-side file system do not match, I/O operations may result in a “partial write,” i.e., a write to only a portion of a storage system block. Partial writes can result from the starting block of a client-side file becoming unaligned with the server-side container blocks. Accordingly, data stored on behalf of a client that could fit within a single server-side container file block may end up overlapping more than one server-side container file block. Accessing the overlapping data via an I/O operation requires reading two server-side container file blocks rather than just one. A write request to an unaligned client-side file system block that spans two server-side container file blocks (e.g., overwriting the overlapping data) requires preserving the contents of the server-side container file blocks that are not being overwritten. As a result, the two server-side container file blocks are read (e.g., into one or more buffers), the client-side block is written into the corresponding portions of the buffered server-side container file blocks, preserving the contents that are not being overwritten, and the updated server-side container file blocks are written back to the storage system. In contrast, the identical client-side write operation would require only a single write operation if the blocks were aligned.
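The read-modify-write sequence described above can be sketched as follows. This is a minimal in-memory simulation under stated assumptions, not the behavior of any particular storage server: `ContainerFile` and `client_write` are hypothetical names, the block size is assumed to be 4 KB, and the read and write counters exist only to make the extra I/O visible.

```python
BLOCK_SIZE = 4096  # assumed server-side block size in bytes


class ContainerFile:
    """Hypothetical in-memory stand-in for a server-side container file."""

    def __init__(self, num_blocks: int) -> None:
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.reads = 0   # number of block reads issued
        self.writes = 0  # number of block writes issued

    def read_block(self, index: int) -> bytes:
        self.reads += 1
        return self.blocks[index]

    def write_block(self, index: int, data: bytes) -> None:
        assert len(data) == BLOCK_SIZE
        self.writes += 1
        self.blocks[index] = bytes(data)


def client_write(container: ContainerFile, offset: int, data: bytes) -> None:
    """Write data at a byte offset, reading back only partially covered blocks."""
    end = offset + len(data)
    pos = offset
    while pos < end:
        index = pos // BLOCK_SIZE
        block_start = index * BLOCK_SIZE
        lo = pos - block_start                             # first byte written in this block
        hi = min(end, block_start + BLOCK_SIZE) - block_start  # one past the last byte written
        chunk = data[pos - offset : pos - offset + (hi - lo)]
        if lo == 0 and hi == BLOCK_SIZE:
            # Whole block is overwritten: no read is needed.
            container.write_block(index, chunk)
        else:
            # Partial write: read the block, patch it, write it back.
            buf = bytearray(container.read_block(index))
            buf[lo:hi] = chunk
            container.write_block(index, bytes(buf))
        pos = block_start + hi
```

In this sketch, a 4096-byte write at offset 4096 costs one block write and no reads, while the same write at offset 512 costs two block reads and two block writes, which matches the contrast drawn in the paragraph above.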
One solution to the problem of unaligned I/O operations is to obtain cooperation from the client-side file system to ensure that there is alignment in all cases. There are two problems, however, with the cooperative approach: 1) cooperation usually operates at the granularity of a single storage-system volume, which is not helpful if more than one virtual machine file system is stored within a single volume, because each virtual machine may have a different misalignment; and 2) if misalignment is discovered after the volume has been filled with data, all of the data must be moved from a misaligned position into an aligned position, which can be a computationally expensive operation (e.g., in terms of I/O operations) for both the storage server and the client.