A storage system is a computer that provides storage service relating to the organization of information on storage devices, such as disks. The storage system includes a storage operating system that logically organizes the information as a set of data blocks stored on the disks. In a block-based deployment, such as a conventional storage area network (SAN), the data blocks may be directly addressed in the storage system. However, in a file-based deployment, such as a network attached storage (NAS) environment, the operating system implements a file system to logically organize the data blocks as a hierarchical structure of addressable files and directories on the disks. In this context, a directory may be implemented as a specially formatted file that stores information about other files and directories.
The storage system may be configured to operate according to a client/server model of information delivery to thereby allow many client systems (clients) to access shared resources, such as files, stored on the storage system. The storage system is typically deployed over a computer network comprising a geographically distributed collection of interconnected communication links, such as Ethernet links, that allow clients to remotely access the shared information (e.g., files) on the storage system. The clients typically communicate with the storage system by exchanging discrete frames or packets of data formatted according to predefined network communication protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP). In this context, a protocol consists of a set of rules defining how the interconnected computer systems interact with one another.
In a file-based deployment, clients employ a semantic level of access to files and file systems stored on the storage system. For instance, a client may request to retrieve (“read”) or store (“write”) information in a particular file stored on the storage system. Clients typically request the services of the file-based storage system by issuing file-system protocol messages (in the form of packets) formatted according to conventional file-based access protocols, such as the Common Internet File System (CIFS), the Network File System (NFS) and the Direct Access File System (DAFS) protocols. The client requests identify one or more files to be accessed without regard to specific locations, e.g., data blocks, in which the requested data are stored on disk. The storage system converts the received client requests from file-system semantics to corresponding ranges of data blocks on the storage disks. In the case of a client “read” request, the data blocks containing the client's requested data are retrieved and the requested data are then returned to the client.
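The conversion from file-system semantics to block ranges can be illustrated with a minimal sketch. It assumes a fixed 4 KB block size and a per-file table mapping each block index within the file to a disk block address; the names `BLOCK_SIZE`, `file_blocks_for`, and `resolve_read` are hypothetical and are not drawn from any particular storage operating system.

```python
from typing import Dict, List

BLOCK_SIZE = 4096  # assumed block size in bytes; the text does not specify one


def file_blocks_for(offset: int, length: int, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the block indexes within a file that cover bytes [offset, offset + length)."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if length == 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


def resolve_read(block_map: Dict[int, int], offset: int, length: int) -> List[int]:
    """Translate a file-level read into disk block addresses.

    block_map is a hypothetical per-file table: block index in file -> disk block address.
    """
    return [block_map[index] for index in file_blocks_for(offset, length)]
```

For example, a two-byte read that straddles the boundary between the first and second blocks of a file resolves to two disk blocks, even though the client never names either of them.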
In a block-based deployment, client requests can directly address specific data blocks in the storage system. Some block-based storage systems organize their data blocks in the form of databases, while other block-based systems may store their blocks internally in a file-oriented structure. Where the data is organized as files, a client requesting information maintains its own file mappings and manages file semantics, while its requests (and corresponding responses) to the storage system address the requested information in terms of block addresses on disk. In this manner, the storage bus in the block-based storage system may be viewed as being extended to the remote client systems. This “extended bus” is typically embodied as Fibre Channel (FC) or Ethernet media adapted to operate with block-based access protocols, such as the Small Computer System Interface (SCSI) protocol encapsulated over FC (FCP) or encapsulated over TCP/IP/Ethernet (iSCSI).
Each storage device in the block-based system is typically assigned a unique logical unit number (LUN) by which it can be addressed, e.g., by remote clients. Thus, an “initiator” client system may request a data transfer for a particular range of data blocks stored on a “target” LUN. Illustratively, the client request may specify a starting data block in the target storage device and a number of successive blocks in which data may be stored or retrieved in accordance with the client request. For instance, in the case of a client “read” request, the requested range of data blocks is retrieved and then returned to the requesting client.
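The shape of such a block-based read can be sketched as follows. This is an illustration only: the `BlockReadRequest` type and the `serve_read` function are hypothetical, each LUN is modeled as an in-memory list of blocks, and no actual SCSI, FCP, or iSCSI framing is represented.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BlockReadRequest:
    """Hypothetical initiator request: a target LUN, a starting block, and a block count."""
    lun: int
    start_block: int
    block_count: int


def serve_read(luns: Dict[int, List[bytes]], req: BlockReadRequest) -> List[bytes]:
    """Return the requested range of blocks from the target LUN.

    luns maps each LUN to its blocks; a real target would read from disk instead.
    """
    if req.lun not in luns:
        raise KeyError("unknown LUN: %d" % req.lun)
    device = luns[req.lun]
    end = req.start_block + req.block_count
    if req.start_block < 0 or req.block_count < 0 or end > len(device):
        raise ValueError("requested range falls outside the target LUN")
    return device[req.start_block:end]
```

Note that the request carries no file name or path: any file semantics stay on the initiator, which is the distinction the preceding paragraphs draw between block-based and file-based access.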
Operationally, the storage system typically identifies a read stream based on an ordered sequence of client accesses to the same file. As used hereinafter, a file is broadly understood as any set of data in which zero or more read streams can be established. Accordingly, the file may be a traditional file or directory stored on a file-based storage system.
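One simple way to recognize an ordered sequence of accesses to the same file is to remember, per file, where the next sequential request would begin. The sketch below makes several assumptions that the text does not state: a request counts as sequential only if it starts exactly where the previous one ended, a stream is declared after a configurable number of sequential requests (`min_sequential`, a hypothetical parameter), and only one candidate stream is tracked per file, whereas the text allows zero or more.

```python
from typing import Dict, Hashable, Optional, Tuple


class ReadStreamDetector:
    """Hypothetical detector that flags a read stream after consecutive sequential reads."""

    def __init__(self, min_sequential: int = 2) -> None:
        self.min_sequential = min_sequential
        # file id -> (block where the next sequential read would start, current run length)
        self._state: Dict[Hashable, Tuple[Optional[int], int]] = {}

    def observe(self, file_id: Hashable, start_block: int, block_count: int) -> bool:
        """Record one read request; return True if it belongs to an identified read stream."""
        expected, run = self._state.get(file_id, (None, 0))
        # Extend the run if this read picks up where the last one ended; otherwise restart.
        run = run + 1 if start_block == expected else 1
        self._state[file_id] = (start_block + block_count, run)
        return run >= self.min_sequential
```

With the default threshold, a read of blocks 0 through 3 followed by a read starting at block 4 is reported as a stream, while a subsequent jump to an unrelated block resets the run.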
Upon identifying a read stream, the storage system may employ speculative readahead operations to retrieve data blocks that are likely to be requested by future client read requests. These “readahead” blocks are typically retrieved from disk and stored in memory (i.e., buffer cache) in the storage system, where each readahead data block is associated with a different file-system volume block number (VBN). Conventional readahead algorithms are often configured to “prefetch” a predetermined number of data blocks that logically extend the read stream. For instance, for a read stream whose client read requests retrieve a sequence of data blocks assigned to consecutively numbered file block numbers (FBNs), the file system may invoke readahead operations to retrieve additional data blocks assigned to FBNs that further extend the sequence, even though the readahead blocks have not yet been requested by client requests in the read stream.
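A fixed-count prefetch of the kind described above can be sketched in a few lines. The function name `readahead_fbns` and its parameters are hypothetical; the sketch assumes the prefetch window begins at the FBN immediately after the last one the client requested, is clipped at the end of the file, and skips blocks already present in the buffer cache.

```python
from typing import List, Set


def readahead_fbns(last_fbn: int, count: int, file_blocks: int, cached: Set[int]) -> List[int]:
    """Return the FBNs to prefetch after a sequential read ending at last_fbn.

    count       -- predetermined number of blocks by which to extend the stream
    file_blocks -- total number of blocks in the file (FBNs run from 0 to file_blocks - 1)
    cached      -- FBNs already held in the buffer cache
    """
    first = last_fbn + 1
    stop = min(first + count, file_blocks)
    return [fbn for fbn in range(first, stop) if fbn not in cached]
```

If the client has just read through FBN 9 of a 100-block file and FBN 11 is already cached, a four-block readahead would fetch FBNs 10, 12, and 13.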
Conventionally, predictive processing associated with readahead operations is computationally intensive and/or expensive in terms of system resources, caching, and/or data bus usage. Moreover, the predictive processing of readahead analysis and/or execution is beneficial only when appropriate disk input and/or output (I/O) operations are generated as a result. It is therefore desirable for a storage system to employ computationally intensive tasks, such as predictive processing in conjunction with readahead analysis and/or readahead execution, only selectively. Further, by reducing the amount of burdensome and/or unnecessary processing, the storage system should reduce the negative effects of this type of waste on the system's performance.