Various forms of network storage systems are known today. These forms include network attached storage (NAS), storage area networks (SANs), and others. Network storage systems are commonly used for a variety of purposes, such as providing multiple users with access to shared data, backing up critical data (e.g., by data mirroring), etc.
A network storage system can include at least one storage server, which is a processing system configured to store and retrieve data on behalf of one or more storage client processing systems (“clients”). In the context of NAS, a storage server may be a file server, which is sometimes called a “filer.” A filer operates on behalf of one or more clients to store and manage shared files in a set of mass storage devices, such as magnetic or optical disks or tapes. The mass storage devices may be organized into one or more volumes or aggregates of a Redundant Array of Inexpensive Disks (RAID). Filers are made by NetApp, Inc. of Sunnyvale, Calif. (NetApp®).
In a SAN context, the storage server provides clients with block-level access to stored data, rather than file-level access. Some storage servers are capable of providing clients with both file-level access and block-level access, such as certain Filers made by NetApp.
In a large scale storage system, it is possible that data may become corrupted or stored incorrectly from time to time. Consequently, virtually all modern storage servers implement various techniques for detecting and correcting errors in data. RAID schemes, for example, include built-in techniques to detect and, in some cases, to correct corrupted data. Error detection and correction is often performed by using a combination of checksums and parity. Error correction can also be performed at a lower level, such as at the disk level.
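The combination of checksums and parity mentioned above can be illustrated with a minimal sketch. It assumes a single stripe of equal-length blocks, a CRC-32 checksum per block, and one XOR parity block. The helper names (`xor_blocks`, `find_corrupt`, `reconstruct`) are hypothetical and do not correspond to any particular RAID implementation.

```python
import zlib
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def find_corrupt(blocks, checksums):
    """Return indexes of blocks whose CRC-32 no longer matches the stored value."""
    return [i for i, blk in enumerate(blocks) if zlib.crc32(blk) != checksums[i]]


def reconstruct(blocks, parity, bad_index):
    """Rebuild one block by XOR-ing the parity with all other blocks in the stripe."""
    others = [blk for i, blk in enumerate(blocks) if i != bad_index]
    return xor_blocks(others + [parity])


# One stripe: three data blocks plus a parity block (illustrative data only).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
checksums = [zlib.crc32(blk) for blk in data]

# Simulate corruption of the second block on disk.
stored = list(data)
stored[1] = b"BxBB"

# The checksum detects which block is bad; parity corrects it.
bad = find_corrupt(stored, checksums)
for i in bad:
    stored[i] = reconstruct(stored, parity, i)
```

In this sketch the checksum identifies which block is bad, and the parity supplies the missing information. Parity alone would reveal only that the stripe is inconsistent, not which block to rebuild.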
In file servers and other storage systems, a write operation executed by the server may occasionally fail to be committed to the physical storage media without any error being detected. The write is, therefore, “lost.” This type of error is typically caused by faulty hardware in a disk drive or in a disk drive adapter silently dropping the write without reporting any error. It is desirable for a storage server to be able to detect and correct such “lost writes” any time data is read.
While modern storage servers employ various error detection and correction techniques, these approaches are inadequate for detecting a lost write error. For example, in at least one well-known class of file server, files sent to the file server for storage are first broken up into 4 kilobyte (KB) blocks, which are then formed into groups that are stored in a “stripe” spread across multiple disks in a RAID array. File system context information is stored in block-appended metadata fields. This information includes a file identifier, a file block number (FBN), a checksum, a volume block number (VBN), which identifies the logical block number where the data are stored (since RAID aggregates multiple physical drives as one logical drive), and a disk block number (DBN), which identifies the physical block number within the disk in which the block is stored. In one known implementation, the context information is included in a 64-byte checksum area structure that is collocated with the block when the block is stored. This error detection technique is sometimes referred to as “block-appended checksum.” Another type of checksum is referred to as “zone checksum.” In zone checksum, a disk is divided into small zones and a special block within each zone is used to store the 64-byte checksum area structures for the remaining blocks in the same zone. Block-appended checksum and zone checksum can detect corruption due to bit flips, partial writes, sector shifts and block shifts. However, neither can detect corruption due to a lost block write, because the stale block left on the disk still carries a self-consistent checksum area structure, so all of the information in that structure will appear to be valid. Furthermore, these mechanisms can detect data corruption only when the data blocks are accessed through the file system.
When block reads are initiated by a RAID layer, such as to compute parity, to “scrub” (verify parity on) an aggregate, or to reconstruct a block (e.g., from a failed disk), the RAID layer does not have the context information of the blocks. Therefore, the checksum mechanisms described above do not help to detect lost writes on RAID-generated reads.
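The limitation of block-appended checksums described above can be illustrated with a small sketch. It assumes a simplified checksum area holding only a file identifier, FBN, VBN, DBN, and a CRC-32 of the data. The names `StoredBlock`, `make_block`, and `verify` are hypothetical, and the layout is not the actual 64-byte structure.

```python
import zlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StoredBlock:
    """A data block plus a simplified, hypothetical checksum area."""
    data: bytes
    file_id: int
    fbn: int       # file block number
    vbn: int       # volume (logical) block number
    dbn: int       # disk (physical) block number
    checksum: int  # CRC-32 of the data, standing in for the real checksum


def make_block(data, file_id, fbn, vbn, dbn):
    return StoredBlock(data, file_id, fbn, vbn, dbn, zlib.crc32(data))


def verify(block, file_id, fbn, vbn, dbn):
    """File-system read check: the checksum and the context must both match."""
    context_ok = (block.file_id, block.fbn, block.vbn, block.dbn) == (file_id, fbn, vbn, dbn)
    return context_ok and zlib.crc32(block.data) == block.checksum


# The disk initially holds an old version of block 0 of file 42 at DBN 7.
disk = {7: make_block(b"old contents", file_id=42, fbn=0, vbn=107, dbn=7)}

# A bit flip changes the data but not the stored checksum, so it is detected.
flipped = replace(disk[7], data=b"old c0ntents")
bit_flip_detected = not verify(flipped, 42, 0, 107, 7)

# A lost write: the file system overwrites the same block in place, but the
# faulty drive silently drops the write, so disk[7] is never updated.
intended = make_block(b"new contents", file_id=42, fbn=0, vbn=107, dbn=7)

# The stale block is still self-consistent, so the read check passes.
lost_write_detected = not verify(disk[7], 42, 0, 107, 7)
```

Here `bit_flip_detected` is true while `lost_write_detected` is false: the stale block carries the same context values and a checksum that matches its own (old) data, so nothing in the checksum area reveals that a newer version was supposed to be there.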