A storage system typically comprises one or more storage devices into which information may be entered, and from which information may be obtained, as desired. The storage system includes a storage operating system that functionally organizes the system by, inter alia, invoking storage operations in support of a storage service implemented by the system. The storage system may be implemented in accordance with a variety of storage architectures including, but not limited to, a network-attached storage environment, a storage area network and a disk assembly directly attached to a client or host computer. For example, the storage devices may comprise disk drives organized as a disk array, wherein the term “disk” commonly describes a self-contained rotating magnetic media storage device. The term disk in this context is synonymous with hard disk drive (HDD) or direct access storage device (DASD).
The storage operating system of the storage system may implement a high-level module, such as a file system, to logically organize the information stored on volumes as a hierarchical structure of storage objects, such as files and logical units (LUs). A known type of file system is a write-anywhere file system that does not overwrite data on disks. An example of a write-anywhere file system that is configured to operate on a storage system is the Write Anywhere File Layout (WAFL®) file system available from NetApp, Inc., Sunnyvale, Calif.
The storage system may be further configured to allow many servers to access storage objects stored on the storage system. In this model, each server may execute an application, such as a database application, that “connects” to the storage system over a computer network, such as a point-to-point link, shared local area network (LAN), wide area network (WAN), or virtual private network (VPN) implemented over a public network such as the Internet. Each server may request the data services of the storage system by issuing access requests (read/write requests) as file-based and block-based protocol messages (in the form of packets) to the system over the network.
A plurality of storage systems may be interconnected to provide a cluster architecture configured to service many servers. The cluster architecture may provide a shared storage pool comprising one or more aggregates. Each aggregate may comprise a set of one or more storage devices (e.g., disks). Each aggregate may store one or more storage objects, such as one or more volumes. The aggregates may be distributed across a plurality of storage systems of the cluster architecture. Each volume may, in turn, be configured to store the content of storage objects, such as files and logical units, served by the cluster in response to multi-protocol data access requests issued by servers.
Each storage system (node) of the cluster may include (i) a storage server (referred to as a “D-blade”) adapted to service a particular aggregate or volume and (ii) a multi-protocol engine (referred to as an “N-blade”) adapted to redirect the data access requests to any storage server of the cluster. In the illustrative embodiment, the storage server of each storage system is embodied as a disk element (D-blade) and the multi-protocol engine is embodied as a network element (N-blade). The N-blade receives a multi-protocol data access request from a client, converts that access request into a cluster fabric (CF) message and redirects the message to an appropriate D-blade of the cluster.
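The redirection described above can be sketched as follows. This is a minimal illustration only, not the implementation of any particular system: the class names `NBlade` and `DBlade`, the dictionary used as a CF message, and the volume-to-D-blade lookup table are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DBlade:
    """Hypothetical storage server that owns a set of volumes."""
    name: str
    # volume name -> {object path -> content}; stands in for real storage
    volumes: Dict[str, Dict[str, bytes]] = field(default_factory=dict)

    def handle(self, msg: dict) -> Optional[bytes]:
        """Service one CF message against a locally owned volume."""
        volume = self.volumes[msg["volume"]]
        if msg["op"] == "read":
            return volume.get(msg["path"])
        if msg["op"] == "write":
            volume[msg["path"]] = msg["data"]
            return None
        raise ValueError(f"unsupported operation: {msg['op']}")


class NBlade:
    """Hypothetical multi-protocol engine that redirects requests."""

    def __init__(self, volume_owner: Dict[str, DBlade]):
        # Assumed lookup table: which D-blade services which volume.
        self.volume_owner = volume_owner

    def to_cf_message(self, protocol, op, volume, path, data=None) -> dict:
        # Convert a protocol-specific request into one common format.
        return {"protocol": protocol, "op": op, "volume": volume,
                "path": path, "data": data}

    def request(self, protocol, op, volume, path, data=None):
        msg = self.to_cf_message(protocol, op, volume, path, data)
        # Redirect to whichever D-blade owns the target volume.
        return self.volume_owner[volume].handle(msg)


d1 = DBlade("d1", {"vol_a": {}})
d2 = DBlade("d2", {"vol_b": {}})
n_blade = NBlade({"vol_a": d1, "vol_b": d2})

# Requests arriving over different protocols reach the same D-blade.
n_blade.request("nfs", "write", "vol_b", "/db/file1", b"hello")
result = n_blade.request("cifs", "read", "vol_b", "/db/file1")
```

In this sketch the N-blade holds no data itself; it only translates and forwards, which is why any N-blade of the cluster could service a request for any volume.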
The storage systems of the cluster may be configured to communicate with one another to act collectively to increase performance or to offset any single storage system failure within the cluster. The cluster provides data service to servers by providing access to a shared storage (comprising a set of storage devices). Typically, servers will connect with a storage system of the cluster for data-access sessions with the storage system. During a data-access session with a storage system, a server may submit access requests (read/write requests) that are received and performed by the storage system.
Each storage system may receive read requests for data stored on a storage device (e.g., a large capacity storage device such as a disk). In response, the storage system may transmit data from the storage device to a client associated with the read request. However, responding to such read requests may take significant time and may limit the performance of the storage system. For example, retrieving and transmitting requested data from a storage device in response to a read request may produce a slow response time.
For improved response to received read or write requests, the storage system may temporarily store (cache) particular selected data in a smaller cache device for faster access. The cache device may comprise a memory device having lower random read latency than a typical storage device and may thus provide faster data access than a typical storage device. However, the cache device may comprise a memory device that is more costly (for a given amount of data storage) than a typical large capacity storage device. Since the storage size of the cache device is relatively small, data stored in the cache device is chosen selectively by cache prediction algorithms (sometimes referred to as cache warming algorithms). A cache prediction algorithm may use cache metadata describing caching operations of a cache device to select which data to store to the cache device. In essence, the cache prediction algorithm predicts that the selected data will be requested and accessed by a client relatively soon, and the selected data is therefore pre-loaded to the cache device.
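As a rough illustration of how cache metadata might drive such a selection, the sketch below uses a simple access-frequency heuristic. The heuristic, the function name `select_blocks_to_cache`, and the representation of cache metadata as a list of accessed block identifiers are all assumptions made for the example; the text above does not prescribe any particular prediction algorithm.

```python
from collections import Counter
from typing import List


def select_blocks_to_cache(access_log: List[str], capacity: int) -> List[str]:
    """Pick up to `capacity` blocks to pre-load into a small cache device.

    `access_log` stands in for cache metadata: one entry per recorded
    access to a block. The most frequently accessed blocks are predicted
    to be requested again soon.
    """
    counts = Counter(access_log)
    # most_common() orders by descending count, so the cache capacity is
    # spent on the blocks with the highest observed demand.
    return [block for block, _ in counts.most_common(capacity)]


# Hypothetical access history: block "a" seen 3 times, "b" twice, "c" once.
history = ["a", "b", "a", "c", "b", "a"]
selected = select_blocks_to_cache(history, capacity=2)
```

With a capacity of two blocks, the example selects blocks "a" and "b" and leaves "c" on the slower storage device.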
As described above, one or more storage systems may store data to a shared storage pool (comprising a set of storage devices) for providing storage services to one or more servers. One or more cache devices may reside and operate on each storage system. Conventional methods, however, do not provide an efficient and accurate method for processing cache metadata from multiple cache devices. Without improved methods for processing cache metadata from multiple cache devices, cache prediction algorithms using the cache metadata may not select desirable data for storage to the cache devices, thus increasing response times to received access requests.
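To make the problem concrete, the sketch below shows why metadata from a single cache device can mislead a prediction algorithm. It is purely illustrative and is not a description of any particular method: the per-device access counts, the node names, and the function `merge_cache_metadata` are hypothetical.

```python
from collections import Counter
from typing import Dict


def merge_cache_metadata(per_device: Dict[str, Dict[str, int]]) -> Counter:
    """Sum per-block access counts reported by several cache devices."""
    merged = Counter()
    for counts in per_device.values():
        merged.update(counts)  # adds counts for blocks already seen
    return merged


# Hypothetical access counts recorded by cache devices on two nodes.
per_device = {
    "node1": {"blk1": 3, "blk2": 4},
    "node2": {"blk1": 3, "blk3": 1},
}

merged = merge_cache_metadata(per_device)
hottest_overall = merged.most_common(1)[0][0]
hottest_on_node1 = Counter(per_device["node1"]).most_common(1)[0][0]
```

Viewed from node1 alone, "blk2" appears to be the most requested block, yet across the cluster "blk1" is accessed more often (six times in total). A prediction algorithm limited to one device's metadata would therefore rank the blocks differently than one that sees all of it.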