Disks, by their nature, are more efficient at sequential, localized transfers than at small, random transfers. A constant challenge in the disk storage industry is to develop a system, e.g., a storage system, that can efficiently perform both random write operations and sequential read operations. As used herein, a storage system is a computer that provides storage services relating to the organization of information on writeable persistent storage, such as non-volatile memories and disks. The storage system may include a storage operating system that implements a virtualization system to logically organize the information as a hierarchical structure of data containers, such as files and logical units (luns), on, e.g., one or more arrays of disks. Each “on-disk” data container may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the container.
The virtualization system of the storage system may be abstracted through the use of a database management system, a volume manager or a file system. A conventional log-structured file system, e.g., a write anywhere file system, can convert a random stream of write operations, e.g., write data, into sequential disk transfers but, in the process, can randomize the locations of blocks on disk and make subsequent sequential read operations generally inefficient. On the other hand, a conventional disk array approach, such as a standard Redundant Array of Independent (or Inexpensive) Disks (RAID), typically employs a static layout that maps externally-received, sequential addresses into sequential locations on disk. This approach provides good sequential read performance, but poor random write performance.
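The tradeoff between the two layouts can be illustrated with a small simulation. The following is a minimal sketch under simplifying assumptions, not a model of any particular file system or RAID implementation: each logical block is written exactly once in random order, a "seek" is counted whenever two consecutive transfers are not physically adjacent, and all names (count_seeks, write_anywhere_layout, compare_layouts) are hypothetical.

```python
import random


def count_seeks(physical_locations):
    """Count transitions where the next block is not physically adjacent."""
    locations = list(physical_locations)
    return sum(1 for a, b in zip(locations, locations[1:]) if b != a + 1)


def static_layout(logical_addresses):
    """Static layout: each logical address maps to the same physical location."""
    return list(logical_addresses)


def write_anywhere_layout(logical_writes, block_map):
    """Write-anywhere layout: each write goes to the next free physical block.

    block_map is filled in so that later reads can locate each logical block.
    """
    locations = []
    next_free = 0
    for logical in logical_writes:
        block_map[logical] = next_free
        locations.append(next_free)
        next_free += 1
    return locations


def compare_layouts(num_blocks=1000, seed=0):
    rng = random.Random(seed)
    writes = list(range(num_blocks))
    rng.shuffle(writes)  # random write stream, one write per logical block

    block_map = {}
    static_write = static_layout(writes)
    log_write = write_anywhere_layout(writes, block_map)

    # Sequential read of logical blocks 0..num_blocks-1 under each layout.
    static_read = static_layout(range(num_blocks))
    log_read = [block_map[logical] for logical in range(num_blocks)]

    return {
        "static_write_seeks": count_seeks(static_write),
        "log_write_seeks": count_seeks(log_write),
        "static_read_seeks": count_seeks(static_read),
        "log_read_seeks": count_seeks(log_read),
    }


if __name__ == "__main__":
    for name, seeks in compare_layouts().items():
        print(f"{name}: {seeks}")
```

In this sketch the write-anywhere layout writes with no seeks but scatters logically adjacent blocks, so the later sequential read incurs many seeks, while the static layout shows the opposite behavior.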
Conventional disk array systems often compensate for poor write performance by implementing large write buffers. These write buffers are typically implemented in non-volatile memory, given its persistence and ability to maintain write data (updates) in the event of a system failure. With sufficiently large write buffers, these systems can achieve higher performance by optimizing the sequence of disk updates across a large pool of potential write “candidates”. However, the relative expense of maintaining large write buffers and protecting them against data loss due to system or power failures limits the size of the buffers and the efficiency gains that can be achieved.
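The effect of buffer size on update ordering can be sketched as follows. This is an illustrative model only, under the assumptions that disk cost is proportional to the distance the head travels between block addresses and that the buffer is flushed in ascending address order whenever it fills; the function names and parameters are hypothetical and do not describe any specific disk array product.

```python
import random


def head_travel(addresses, start=0):
    """Total distance the head moves when visiting addresses in order."""
    position = start
    total = 0
    for address in addresses:
        total += abs(address - position)
        position = address
    return total


def flush_order(writes, buffer_size):
    """Order in which writes reach the disk when buffered in fixed-size batches.

    Each full buffer (and the final partial one) is sorted by address
    before being flushed, so a larger buffer offers more candidates to reorder.
    """
    order = []
    for i in range(0, len(writes), buffer_size):
        order.extend(sorted(writes[i:i + buffer_size]))
    return order


def travel_by_buffer_size(num_writes=1000, address_space=1_000_000, seed=0):
    rng = random.Random(seed)
    writes = [rng.randrange(address_space) for _ in range(num_writes)]
    # A buffer of size 1 is equivalent to writing in arrival order.
    return {size: head_travel(flush_order(writes, size)) for size in (1, 64, 256)}


if __name__ == "__main__":
    for size, travel in travel_by_buffer_size().items():
        print(f"buffer of {size:>3} writes: head travel {travel}")
```

Under these assumptions, total head travel falls as the buffer grows, which is consistent with the observation above that the achievable gain is bounded by how large a buffer can be afforded and protected.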