A file server is a network-connected processing system that stores and manages shared files in a set of storage devices (e.g., disk drives) on behalf of one or more clients. The disks within a file system are typically organized as one or more Redundant Array of Independent (or Inexpensive) Disks (RAID) groups. One configuration in which file servers can be used is a network attached storage (NAS) configuration. In a NAS configuration, a file server can be implemented in the form of an appliance that attaches to a network, such as a local area network (LAN) or a corporate intranet. An example of such an appliance is any of the Filer products made by Network Appliance, Inc. of Sunnyvale, Calif.
Another specialized type of network is a storage area network (SAN). A SAN is a highly efficient network of interconnected, shared storage devices. Such devices are also made by Network Appliance, Inc. One difference between NAS and SAN is that in a SAN, the storage appliance provides a remote host with block-level access to stored data, whereas in a NAS configuration, the file server normally provides clients with only file-level access to stored data.
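The distinction between block-level and file-level access can be illustrated with a minimal sketch. The class and method names below (BlockDevice, FileServer, read_block, write_file, and so on) are hypothetical and are not taken from any actual product; the sketch assumes a toy in-memory device and a trivial append-only allocation scheme, purely to show that a SAN-style host addresses blocks by number while a NAS-style client names only a file.

```python
class BlockDevice:
    """SAN-style access (hypothetical): the host addresses fixed-size blocks by number."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def read_block(self, lba):
        return self.blocks[lba]

    def write_block(self, lba, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.blocks[lba] = bytes(data)


class FileServer:
    """NAS-style access (hypothetical): the client names a file, and the
    server alone decides which blocks hold the file's contents."""

    def __init__(self, device):
        self.device = device
        self.table = {}      # path -> (list of block numbers, file length)
        self.next_free = 0   # toy allocator: blocks are never reclaimed

    def write_file(self, path, data):
        size = self.device.block_size
        lbas = []
        for offset in range(0, len(data), size):
            # Pad the final chunk with zero bytes to fill a whole block.
            chunk = data[offset:offset + size].ljust(size, b"\0")
            self.device.write_block(self.next_free, chunk)
            lbas.append(self.next_free)
            self.next_free += 1
        self.table[path] = (lbas, len(data))

    def read_file(self, path):
        lbas, length = self.table[path]
        raw = b"".join(self.device.read_block(lba) for lba in lbas)
        return raw[:length]  # drop the padding added on write
```

In this sketch, a client of FileServer never sees a block number, whereas a host using BlockDevice directly must track for itself which blocks belong to which data.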
A simple example of a NAS network configuration is shown in FIG. 1. A filer (file server) “head” 2 is coupled locally to a set of mass storage devices 4 and to a set of clients 1 through a network 3. The filer head 2 receives various read and write requests from the clients 1 and accesses the mass storage devices 4 to service those requests. Each of the clients 1 may be, for example, a conventional personal computer (PC), workstation, or the like. The mass storage devices 4 may be, for example, conventional magnetic disks, optical disks such as CD-ROM or DVD based storage, magneto-optical (MO) storage, or any other type of non-volatile storage devices suitable for storing large quantities of data.
In this context, a “head” (as in filer head 2) means all of the electronics, firmware and/or software (the “intelligence”) that is used to control access to a set of mass storage devices; it does not include the mass storage devices themselves. In a file server, the head normally is where all of the “intelligence” of the file server resides. Note that a “head” in this context is not the same as, and is not to be confused with, the magnetic or optical head that is used to physically read or write data from or to the mass storage medium. The network 3 can be essentially any type of computer network, such as a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), or the Internet.
Filers are often used for data backup and recovery applications. In these applications, it is desirable to protect against as many potential failure scenarios as possible. One possible failure scenario is the failure of a filer head. One approach which has been used to protect against the possibility of a filer head failure is known as clustered failover (CFO). CFO involves the use of two or more redundant filer heads, each having “ownership” of a separate set of mass storage devices. CFO refers to a capability in which two or more interconnected heads are all active at the same time, such that if one head fails or is taken out of service, that condition is immediately detected by another head, which automatically assumes the functionality of the inoperative head while continuing to service its own client requests. A file server “cluster” is defined to include at least two file server heads connected to at least two separate volumes of disks. FIG. 2 illustrates an example of a CFO configuration. As shown, each filer head's mass storage devices are “visible” to the other filer head via a high-speed interconnect. In the event that one head fails, the other head takes over ownership of the failed head's mass storage devices.
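The ownership takeover described above can be sketched as follows. This is an illustrative model only, under stated assumptions: the FilerHead class, the make_cluster helper, and the heartbeat-style liveness check are hypothetical names and mechanisms chosen for the example, since the description above does not specify how a head failure is detected.

```python
class FilerHead:
    """Hypothetical model of one head in a two-head CFO cluster."""

    def __init__(self, name, owned_disks):
        self.name = name
        self.owned = set(owned_disks)  # disks this head currently "owns"
        self.alive = True
        self.partner = None

    def heartbeat(self):
        # Assumption: liveness is reported over the cluster interconnect.
        return self.alive

    def check_partner(self):
        """Take over the partner's disks if the partner has stopped responding.

        Returns True only when a takeover actually happened.
        """
        partner = self.partner
        if partner is None or partner.heartbeat() or not partner.owned:
            return False
        self.owned |= partner.owned
        partner.owned = set()
        return True

    def serve(self, disk):
        """Service a client request that targets the given disk."""
        if not self.alive:
            raise RuntimeError(f"{self.name} is out of service")
        if disk not in self.owned:
            raise PermissionError(f"{self.name} does not own {disk}")
        return f"{self.name} served {disk}"


def make_cluster(head_a, head_b):
    """Connect two heads so that each can monitor the other."""
    head_a.partner = head_b
    head_b.partner = head_a
```

While both heads are active, each services only the disks it owns. Once one head stops responding, the surviving head's next check_partner call transfers ownership, after which it services requests for both sets of disks.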
In a CFO configuration, it is desirable for one head to have the ability to perform diagnostics on the other head (or heads), to assess the operational status of each. Moreover, it is desirable to have the ability to perform such diagnostics without taking the head under test out of its normal operational mode.