Various forms of network storage systems exist today, including network attached storage (NAS), storage area networks (SANs), and others. Network storage systems are commonly used for a variety of purposes, such as backing up critical data, mirroring data, and providing multiple users with access to shared data.
A network storage system includes at least one storage server, which is a processing system configured to store and retrieve data on behalf of one or more client processing systems (“clients”) that are used by users of the network storage system. In the context of NAS, a storage server is commonly a file server, which is sometimes called a “filer”. A filer operates on behalf of one or more clients to store and manage shared files. The files are stored in a non-volatile mass storage subsystem, which is typically (but not necessarily) external to the storage server and which may include one or more arrays of non-volatile mass storage devices, such as magnetic or optical disks or tapes. These devices are commonly managed using RAID (Redundant Array of Inexpensive Disks); hence, the mass storage devices in each array may be organized into one or more separate RAID groups.
In a SAN context, a storage server provides clients with access to stored data at a sub-file level of granularity, such as block-level access, rather than file-level access. Some storage servers are capable of providing clients with both file-level access and block-level access, such as certain Filers made by Network Appliance, Inc. (NetApp®) of Sunnyvale, Calif.
Caching is a technique that is commonly used to reduce latency associated with accessing data in computer-related applications, including in network storage systems. For example, the main memory (i.e., random access memory (RAM)) of a storage server is often used as a cache logically between the storage server's main central processing unit (CPU) and the non-volatile mass storage (e.g., disk) subsystem, since the RAM which forms the main memory generally has a much lower access latency than the disk subsystem. Accordingly, the main memory of a storage server is sometimes called the “buffer cache” or, simply, the “cache”. Note that this kind of cache should not be confused with other forms of cache memory, known as level-1 (“L1”) cache, level-2 (“L2”) cache, etc., which are commonly used by a microprocessor (and typically implemented on the same chip or the same motherboard as the microprocessor) to reduce the number of accesses to main memory. In the context of this document, the buffer cache (or simply “cache”) of a storage server is the main memory itself.
Some network storage servers also employ an additional level of caching logically between the buffer cache (main memory) and the non-volatile mass storage subsystem. This additional cache is known as a “victim cache”. In the context of this document, a “victim cache” is a cache that holds some of the data blocks (“victims”) most recently evicted from a main or primary cache, which in this context is main memory of the storage server. The main memory in a storage server is in certain instances called the “main cache” in this document, to distinguish it from the victim cache.
A victim cache in a storage server is generally a medium-size auxiliary storage facility that is faster than normal RAID disk storage, but slower than main memory. Such a victim cache might be implemented on, for example, an external memory card, using solid state disks (SSDs) or other types of storage devices. The size of such a cache can range from, for example, a few GBytes up to hundreds of GBytes or more. When a data block, or “buffer”, is needed but not found in main memory, the victim cache is consulted prior to loading the buffer from RAID disks. Note that the terms “buffer” and “data block” (or simply “block”) are used herein interchangeably. A data block, or “buffer”, is the basic unit of data transfer of the file system in a storage server. Buffers are commonly 4 Kbytes in size, though different storage systems may use buffers of different sizes.
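The lookup order described above (main memory first, then the victim cache, then the RAID disks) can be illustrated with a minimal Python sketch. It is not taken from any actual storage server implementation. The class name `TieredCache`, the use of least-recently-used ordering, the choice to remove a block from the victim cache when it is promoted back to main memory, and the plain dictionary standing in for the RAID disks are all assumptions made for illustration.

```python
from collections import OrderedDict

BLOCK_SIZE = 4096  # 4 Kbytes, the common buffer size mentioned above


class TieredCache:
    """Hypothetical model: main cache (RAM) -> victim cache -> disk."""

    def __init__(self, main_capacity, victim_capacity, disk):
        self.main = OrderedDict()    # block number -> data, oldest entry first
        self.victim = OrderedDict()  # blocks recently evicted from the main cache
        self.main_capacity = main_capacity
        self.victim_capacity = victim_capacity
        self.disk = disk             # assumed stand-in for RAID storage (a dict)
        self.stats = {"main": 0, "victim": 0, "disk": 0}

    def read(self, block_no):
        # 1. Look in the main cache (main memory).
        if block_no in self.main:
            self.main.move_to_end(block_no)  # mark as most recently used
            self.stats["main"] += 1
            return self.main[block_no]
        # 2. Consult the victim cache before going to disk.
        if block_no in self.victim:
            data = self.victim.pop(block_no)  # assumption: promote and remove
            self.stats["victim"] += 1
        else:
            # 3. Fall back to the (slow) disk subsystem.
            data = self.disk[block_no]
            self.stats["disk"] += 1
        self._insert_main(block_no, data)
        return data

    def _insert_main(self, block_no, data):
        # Evict the oldest main-cache buffer into the victim cache if full.
        if len(self.main) >= self.main_capacity:
            old_no, old_data = self.main.popitem(last=False)
            self.victim[old_no] = old_data
            if len(self.victim) > self.victim_capacity:
                self.victim.popitem(last=False)  # victim cache is bounded too
        self.main[block_no] = data


# Usage: a main cache of two buffers backed by a victim cache of two buffers.
disk = {n: bytes([n]) * BLOCK_SIZE for n in range(8)}
cache = TieredCache(main_capacity=2, victim_capacity=2, disk=disk)
for n in (0, 1, 2, 0, 1):
    cache.read(n)
```

In this access pattern, blocks 0 and 1 are each evicted from the main cache once and later found in the victim cache, so only the first three reads reach the simulated disk.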
The process of finding an old buffer in the main cache and then evicting it is known as “buffer scavenging”. This process is often expected to return a buffer that the storage server can immediately overwrite (reuse) in the main cache. For optimal system performance, the buffer scavenging process should not be blocking; that is, the process that invokes (calls) the scavenging process should not be prevented from doing something else while the scavenging process executes (i.e., until a reusable buffer is returned). However, when a victim cache is used, there is generally latency associated with performing a write to the victim cache and then waiting for the response. Thus, there is a tension between the desire for fast, efficient scavenging and the need to preserve the evicted buffer and its contents until it has been successfully stored in the victim cache.
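To make this tension concrete, the following Python sketch models a naive, blocking scavenger. It illustrates the problem only and is not a solution described in this document. The names `SimulatedVictimCache` and `scavenge_blocking`, the dictionary-based simulated clock, and the latency of 5 time units are all hypothetical.

```python
from collections import OrderedDict


class SimulatedVictimCache:
    """Hypothetical victim cache whose writes cost a fixed number of time units."""

    def __init__(self, write_latency):
        self.write_latency = write_latency
        self.blocks = {}

    def write(self, block_no, data, clock):
        # Simulate waiting for the write response from the victim cache.
        clock["now"] += self.write_latency
        self.blocks[block_no] = bytes(data)  # preserve a copy of the contents


def scavenge_blocking(main_cache, victim, clock):
    """Evict the oldest buffer; return it only after the victim write completes."""
    block_no, buf = main_cache.popitem(last=False)
    victim.write(block_no, buf, clock)  # the caller is stalled for the whole write
    return buf                          # now safe to overwrite (reuse)


# Usage: three 4-Kbyte buffers in the main cache; scavenge one and reuse it.
clock = {"now": 0}
main_cache = OrderedDict((n, bytearray([n]) * 4096) for n in range(3))
victim = SimulatedVictimCache(write_latency=5)
buf = scavenge_blocking(main_cache, victim, clock)
buf[:] = b"\xff" * 4096  # the caller overwrites the returned buffer
```

Here the evicted contents stay intact in the victim cache even after the buffer is reused, but the caller pays the full write latency on every scavenge. That cost is what a non-blocking design would need to avoid.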