A storage server is a special-purpose processing system used to store and retrieve data on behalf of one or more client processing systems (“clients”). A storage server can be used for many different purposes, such as to provide multiple users with access to shared data or to back up mission-critical data.
A storage server includes a storage operating system that logically organizes sets of data blocks stored on mass storage devices, such as magnetic or optical storage based disks or tapes. The mass storage devices may be organized into one or more volumes of Redundant Array of Inexpensive Disks (RAID). In a block-based deployment, such as a conventional storage area network (SAN), client requests can directly address specific data blocks in the storage server, thus providing block-level access. In a file-based deployment, such as a network attached storage (NAS) environment, the operating system implements a file system to logically organize the data blocks as a hierarchical structure of addressable files and directories on the disks, thus providing file-level access.
A file system assigns each file a sequence of consecutively numbered file block numbers (FBNs), which are associated with volume block numbers (VBNs). The VBNs, which may or may not be consecutively numbered, typically have a one-to-one mapping to on-disk data blocks, which are assigned disk block numbers (DBNs).
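The FBN-to-VBN association described above can be pictured with a minimal sketch. It assumes, purely for illustration, that a file's block map is a list indexed by FBN; the names `vbns_for_fbn_range` and `fragmented_file` are hypothetical and do not come from any particular file system.

```python
# Hypothetical per-file block map: the list index is the FBN and the
# stored value is the VBN that backs it (an assumption for illustration).
def vbns_for_fbn_range(block_map, first_fbn, count):
    """Return the VBNs backing FBNs first_fbn .. first_fbn + count - 1."""
    return block_map[first_fbn:first_fbn + count]

# FBNs 0..4 are always consecutive; the VBNs behind them need not be.
fragmented_file = [1, 100, 3, 101, 5]

# Consecutive FBNs 1..3 resolve to non-consecutive VBNs.
middle_vbns = vbns_for_fbn_range(fragmented_file, 1, 3)
```

Here a request for three consecutive FBNs resolves to three VBNs that are not adjacent on the volume.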
A read stream is defined as a set of one or more client requests that instructs the storage server to retrieve data from a logically contiguous range of FBNs within a requested file. Accordingly, a read stream may be construed by the file system as a sequence of client requests that directs the file system to retrieve a sequence of data blocks assigned to consecutively numbered FBNs. Client requests in the read stream may employ file-based or block-based semantics, so long as they instruct the storage server to retrieve data from the stream's logically contiguous range of FBNs.
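As a rough illustration of this definition, the sketch below checks whether a sequence of requests covers one logically contiguous FBN range. It assumes each request is reduced to a starting FBN and a block count, and that requests arrive in ascending order; `ReadRequest` and `is_single_read_stream` are hypothetical names, and a real server could recognize streams quite differently.

```python
from typing import NamedTuple

class ReadRequest(NamedTuple):
    # Hypothetical, simplified request: starting FBN and number of blocks.
    first_fbn: int
    count: int

def is_single_read_stream(requests):
    """True if the requests, taken in order, cover one contiguous FBN range."""
    if not requests:
        return False
    next_fbn = requests[0].first_fbn
    for request in requests:
        # Each request must start exactly where the previous one ended.
        if request.count <= 0 or request.first_fbn != next_fbn:
            return False
        next_fbn += request.count
    return True

contiguous = [ReadRequest(0, 4), ReadRequest(4, 4), ReadRequest(8, 2)]
with_gap = [ReadRequest(0, 4), ReadRequest(8, 4)]
```

Under these assumptions, `contiguous` forms a single read stream over FBNs 0 through 9, while `with_gap` does not because FBNs 4 through 7 are skipped.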
When a request in a read stream is received by a storage server, the request may direct the storage server to retrieve a list of data blocks (i.e. VBNs) assigned to consecutively numbered FBNs. However, as suggested above, although the FBNs may be consecutive, the list of VBNs may or may not be consecutive (or sequential). When a new file or logical unit number (LUN) (i.e. the address assigned to each storage device in a block-based server) is written, the VBNs are typically adjacent on disk, and therefore, the list of VBNs is sequential, e.g. VBN 1, 2, 3, 4, 5. When the list is sequential, the file system in the storage server can satisfy the request by issuing one (or a few) commands that cover ranges of VBNs. For example, when the request is to read VBN 1, 2, 3, 4, 5, the file system can issue a single command to a storage subsystem to read VBN 1-5, rather than issuing five separate commands (e.g. read VBN 1, read VBN 2, read VBN 3, read VBN 4, read VBN 5).
However, when the list is non-sequential, the file system conventionally satisfies the request by issuing several small commands. For example, when the request is to read VBN 1, 100, 3, 101, 5, the file system conventionally issues five separate commands to a storage subsystem: read VBN 1, read VBN 100, read VBN 3, read VBN 101, and read VBN 5. Issuing these multiple small read commands reduces the performance of the storage server.
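The difference between the sequential and non-sequential cases can be sketched as follows. This is a minimal illustration of the conventional behavior described above, assuming a read command can be modeled as a (first VBN, last VBN) pair and that only VBNs adjacent in request order are merged; `coalesce_in_order` is a hypothetical helper, not an actual storage subsystem interface.

```python
def coalesce_in_order(vbns):
    """Merge runs of adjacent VBNs, in request order, into (first, last) ranges.

    Each returned tuple stands for one read command sent to the
    storage subsystem (a simplifying assumption).
    """
    commands = []
    for vbn in vbns:
        if commands and vbn == commands[-1][1] + 1:
            # Extend the current range by one block.
            commands[-1] = (commands[-1][0], vbn)
        else:
            # Start a new range, i.e. a new read command.
            commands.append((vbn, vbn))
    return commands

sequential_cmds = coalesce_in_order([1, 2, 3, 4, 5])
fragmented_cmds = coalesce_in_order([1, 100, 3, 101, 5])
```

With the sequential list the sketch yields a single command covering VBN 1-5, while the fragmented list yields five single-block commands, one per VBN.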
Conventional solutions to improve performance attempt to prevent the data block (i.e. VBNs) list from containing non-sequential VBNs. For example, one solution attempts to defragment a disk so that data blocks associated with a single file are sequential again. However, defragmenting a disk is often time-consuming and may prevent access to the disk for a prohibitive length of time. Defragmentation also fails to provide real-time improvements in system performance since it is often scheduled to occur at a certain time or after a disk has reached a certain amount of fragmentation.
Another conventional solution attempts to prevent the list of data blocks (i.e. VBNs) from containing non-sequential VBNs by writing modified data blocks to the same VBNs, rather than to a different location. However, file systems implementing this solution may have a larger write operation overhead than file systems that allow modified data blocks to be written to different locations, e.g. the write-out-of-place design. For example, when operated with a RAID array, the write-out-of-place design schedules multiple writes to the same RAID stripe whenever possible. This scheduling reduces write operation overhead by avoiding, where possible, updates that touch only one block in a stripe. This reduction in write operation overhead is lost when a file system writes to the same VBNs to avoid fragmenting a file or LUN.
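The stripe argument can be made concrete with a rough sketch. It assumes a deliberately simplified layout in which VBN v falls in stripe v // blocks_per_stripe; real RAID layouts map blocks to stripes differently. The helper `stripes_touched` and the VBN values are hypothetical.

```python
def stripes_touched(vbns, blocks_per_stripe):
    """Return the set of stripe numbers that a write to these VBNs updates.

    Assumes the simplified mapping stripe = vbn // blocks_per_stripe.
    """
    return {vbn // blocks_per_stripe for vbn in vbns}

BLOCKS_PER_STRIPE = 4  # assumed number of data blocks per stripe

# Write-in-place: four modified blocks stay at their scattered VBNs.
in_place = stripes_touched([1, 100, 37, 205], BLOCKS_PER_STRIPE)

# Write-out-of-place: the same four blocks go to adjacent free VBNs.
out_of_place = stripes_touched([400, 401, 402, 403], BLOCKS_PER_STRIPE)
```

Under these assumptions the in-place write updates one block in each of four stripes, whereas the out-of-place write fills a single stripe, which is the overhead reduction that writing to the same VBNs gives up.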
Therefore, what is needed is a technique for improving multi-block reads that overcomes the shortcomings of the above-mentioned approaches.