A file server is a computer that provides file service relating to the organization of information on writeable persistent storage devices, such as memories, tapes or hard disks. The file server or filer may be embodied as a storage system including an operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on, e.g., the disks. Each “on-disk” file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the file. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories is stored. An example of a file system that is configured to operate on a filer is the Write Anywhere File Layout (WAFL™) file system available from Network Appliance, Inc., Sunnyvale, Calif.
As used herein, the term “storage operating system” generally refers to the computer-executable code operable to perform a storage function in a storage system, e.g., that implements file system semantics and manages data access. In this sense, the ONTAP software is an example of such a storage operating system implemented as a microkernel and including a WAFL layer to implement the WAFL file system semantics and manage data access. The storage operating system can also be implemented as an application program operating over a general-purpose operating system, such as UNIX® or Windows NT®, or as a general-purpose operating system with configurable functionality, which is configured for storage applications as described herein.
A filer cluster is organized to include one or more filers or storage “volumes” that comprise a cluster of physical storage disks, defining an overall logical arrangement of storage space. Currently available filer implementations can serve a large number of discrete nodes or volumes. Each volume is generally associated with its own file system (WAFL for example). The disks within a volume/file system are typically organized as one or more groups of Redundant Array of Independent (or Inexpensive) Disks (RAID). RAID 4 implementations enhance the reliability/integrity of data storage through the redundant writing of data “stripes” across a given number of physical disks in the RAID group, and the appropriate storing of parity information with respect to the striped data. In the example of a WAFL-based file system, a RAID 4 implementation is advantageously employed and is preferred. This implementation specifically entails the striping of data across a group of disks, and separate parity storage within a selected disk (or disks) of the RAID group.
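The role of the separately stored parity can be illustrated with a short sketch. It assumes simple XOR parity over equally sized blocks, with one parity block per stripe; the function names (`xor_blocks`, `write_stripe`, `reconstruct`) are hypothetical and are not taken from any RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_blocks):
    """Return the stripe as stored: the data blocks plus one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stripe, lost_index):
    """Rebuild the block at lost_index from the surviving blocks of the stripe."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Three data disks plus one parity disk (assumed layout for illustration).
stripe = write_stripe([b"AAAA", b"BBBB", b"CCCC"])
rebuilt = reconstruct(stripe, 1)  # pretend the second data disk failed
```

Because XOR is its own inverse, any single lost block, data or parity, can be recomputed from the remaining blocks of the stripe.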
The Network File System (NFS) is a stateless, UNIX-based file system protocol used with filers that is generally not used with certain operating systems, such as Windows, running on most personal computers (PCs). The NFS protocol emphasizes error recovery over file locking; such error recovery is simple because no state information need be preserved. However, the protocol can “hang up” for hours without any timeout, which is detrimental. In addition, since NFS is not found in most PC operating systems, it is not widely used in filers that are accessed by PC clients.
The Common Internet File System (CIFS) is an open standard, connection-oriented protocol providing remote file access over the Internet that is typically used with filers to provide service to PCs. For example, CIFS is used in the Windows NT, 9X, ME and 2000 operating systems, Windows for WorkGroups and LAN Manager. Accordingly, it is widely used with servers, such as filers, that have PC clients accessing them. CIFS is not stateless and emphasizes locking over error recovery. Strict locking requires a sustained connection, so it is important that an active session not be interrupted.
It is advantageous for the services and data provided by a storage system to be available for access to the greatest degree possible. Accordingly, some computer storage systems provide a plurality of filers in a cluster, with the property that when a first filer fails, a second filer is available to take over and provide the services and the data otherwise provided by the first filer. The second filer provides these services and data by a “takeover” of resources otherwise managed by the failed first filer.
Both filers store their WAFL, RAID and other information, as well as that of their partners, in non-volatile random access memories (NVRAMs) as part of their normal operation. The NVRAM is typically organized into two halves or segments, a local half or segment for storing requests directed to the local filer and a partner half or segment for storing requests “mirrored” from the partner. Each segment comprises a plurality of sections including RAID log, syslog and WAFL log sections. Each WAFL log consists of two portions designated log 0 and log 1.
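The NVRAM organization described above can be pictured as a nested data structure. The following is a minimal sketch only: the class and field names (`Nvram`, `Segment`, `WaflLogSection`, `log_request`) are hypothetical, Python lists stand in for the log sections, and placing the new entry in log 0 is an arbitrary choice made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WaflLogSection:
    log0: list = field(default_factory=list)
    log1: list = field(default_factory=list)


@dataclass
class Segment:
    raid_log: list = field(default_factory=list)
    syslog: list = field(default_factory=list)
    wafl_log: WaflLogSection = field(default_factory=WaflLogSection)


@dataclass
class Nvram:
    local: Segment = field(default_factory=Segment)    # requests directed to this filer
    partner: Segment = field(default_factory=Segment)  # requests mirrored from the partner


def log_request(own, peer, entry):
    """Record entry in the local half of own and mirror it to the partner half of peer."""
    own.local.wafl_log.log0.append(entry)
    peer.partner.wafl_log.log0.append(entry)


filer_a, filer_b = Nvram(), Nvram()
log_request(filer_a, filer_b, "Create file")
```

After the call, the entry appears in the local half of `filer_a` and in the partner half of `filer_b`, while the local half of `filer_b` is untouched.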
As a file service request is processed by the WAFL file system, an entry for that request is written into the WAFL log as a journal entry. The journal entry may comprise, for example, “Create file”, “Write file Data”, “Open file”, etc. Widely accepted file system standards, such as NFS, specify that a file server should not reply to a requesting client until a given request is written out to stable storage. By writing to NVRAM this requirement is met and a reply can be returned to the requesting client with respect to the service request before the results of the request have been written to a hard disk.
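The ordering described above (journal to NVRAM, reply to the client, write to disk later) can be sketched as follows. This is a toy model under stated assumptions: the `Filer` class and its `nvram_log` and `disk` lists are hypothetical stand-ins, not interfaces of any storage operating system.

```python
class Filer:
    """Toy model in which Python lists stand in for NVRAM and hard disk."""

    def __init__(self):
        self.nvram_log = []  # journal entries held in non-volatile RAM
        self.disk = []       # results committed later, at a consistency point

    def handle(self, request):
        # Journal the request to stable storage (NVRAM) first ...
        self.nvram_log.append(request)
        # ... then reply; nothing has been written to disk yet.
        return "ack: " + request

    def consistency_point(self):
        # Later, flush the journaled requests to disk and clear the log.
        self.disk.extend(self.nvram_log)
        self.nvram_log.clear()


filer = Filer()
reply = filer.handle("Create file")
```

At this point the client holds a reply, the request is journaled in `nvram_log`, and `disk` is still empty until `consistency_point()` is called.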
The NVRAM is temporarily loaded with service requests until such time as a consistency point (CP) is reached. CPs may occur at fixed time intervals, or when other key events arise, such as either log 0 or log 1 in the WAFL log section being filled. At such times, the accumulated contents of log 0 or log 1 are “flushed” (written) to hard disk, thereby completing the CP.
When log 0 is filled, a CP is initiated and subsequent service request entries are stored in log 1. The entries in log 0 are then flushed to hard disk. Similarly, when log 1 is filled, another CP is triggered and subsequent service request entries are stored in log 0. The entries in log 1 are then flushed to hard disk. Once the entries recorded in log 0 or log 1 have been written to hard disk, they are removed from the NVRAM. This process continues as each log fills, triggering a CP, and clearing the NVRAM.
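The alternation between log 0 and log 1 can be sketched as a small double-buffered log. The sketch makes simplifying assumptions: the class name `DoubleBufferedLog` and its `capacity` parameter are hypothetical, the flush is performed synchronously, and time-triggered CPs are not modeled.

```python
class DoubleBufferedLog:
    """Toy model of a WAFL log section with two portions, log 0 and log 1."""

    def __init__(self, capacity):
        self.capacity = capacity  # entries per log portion (assumed, for illustration)
        self.logs = [[], []]      # logs[0] is log 0, logs[1] is log 1
        self.active = 0           # index of the log currently receiving entries
        self.disk = []            # stands in for hard disk
        self.cp_count = 0

    def append(self, entry):
        self.logs[self.active].append(entry)
        if len(self.logs[self.active]) >= self.capacity:
            self._consistency_point()

    def _consistency_point(self):
        full = self.active
        self.active = 1 - full              # subsequent entries go to the other log
        self.disk.extend(self.logs[full])   # flush the full log to disk
        self.logs[full].clear()             # flushed entries are removed from NVRAM
        self.cp_count += 1


log = DoubleBufferedLog(capacity=2)
for entry in ["Create file", "Write file Data", "Open file"]:
    log.append(entry)
```

With a capacity of two entries, the second append fills log 0 and triggers a CP, so the third entry lands in log 1 while the first two have already been flushed.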
After a takeover by a partner filer from a failed filer, the partner handles file service requests that have normally been routed to it from clients, plus file service requests that had previously been handled by the failed filer and that are now routed to the partner. Broadly stated, a takeover of a failed filer involves the partner filer asserting disk reservations to take over responsibility of the disks of the failed filer, and then sending a series of “please die” commands (“poison packets”) to the failed filer.
The partner filer then “replays” the mirrored WAFL and RAID log entries of the failed filer stored in its NVRAM. A replay comprises flushing of the log entries to disk. As part of takeover processing the partner takes on two identities: its own identity and the identity of the failed filer. To that end, the partner activates network interfaces and network addresses that replicate the failed filer's network addresses. The identity, replicated network interfaces and network addresses are used to process service requests directed to the failed filer until the failed filer is restored and control is returned to it.
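The replay and identity steps can be sketched as follows. This is an illustration only: the `take_over` function, the dictionary layout, and the example addresses (drawn from the 192.0.2.0/24 documentation range) are hypothetical, and disk reservations and “poison packets” are not modeled. Clearing the mirrored entries after replay is an assumption, made by analogy with the removal of flushed entries at a CP.

```python
def take_over(partner, failed_filer_disk, failed_filer_addresses):
    """Toy takeover: replay mirrored entries, then assume the failed filer's addresses."""
    mirrored = partner["nvram"]["partner_half"]
    # Replay: flush the failed filer's mirrored log entries to its disks.
    failed_filer_disk.extend(mirrored)
    mirrored.clear()
    # Take on a second identity by also answering on the failed filer's addresses.
    partner["addresses"].update(failed_filer_addresses)


partner = {
    "addresses": {"192.0.2.10"},
    "nvram": {
        "local_half": [],
        "partner_half": ["Create file", "Write file Data"],
    },
}
failed_disk = []
take_over(partner, failed_disk, {"192.0.2.20"})
```

After the call, the mirrored entries are on the failed filer's disk and the partner answers on both its own address and the failed filer's address.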
The partner filer then begins processing requests directed to the failed filer. These processed requests are temporarily stored in only the local half of the partner's NVRAM. That is, the WAFL entries for the failed filer are interleaved with WAFL entries for the partner filer in log 0 until it is full. After a CP, the entries are interleaved and stored within log 1 until it is full. Notably, only the local filer half of the NVRAM is used, while the half assigned for use of the failed filer is unused. This is inefficient, because the NVRAM space in the unused half sits idle for the duration of the takeover.
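The interleaving can be made concrete with a brief sketch. The function `log_after_takeover` and the owner tags `"partner"` and `"failed"` are hypothetical labels chosen for the illustration; each list stands in for one half of the partner's NVRAM.

```python
def log_after_takeover(nvram, owner, entry):
    """Append a journal entry tagged with the identity it was processed for."""
    nvram["local"].append((owner, entry))  # only the local half is ever written


nvram = {"local": [], "partner": []}
requests = [
    ("partner", "Write file Data"),
    ("failed", "Create file"),
    ("partner", "Open file"),
]
for owner, entry in requests:
    log_after_takeover(nvram, owner, entry)

owners_in_local_half = [owner for owner, _ in nvram["local"]]
```

Entries for both identities accumulate in arrival order in the local half, while the partner half stays empty throughout.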
Subsequently, after correction of a failure, the “failed” filer is rebooted and resumes normal operation. That is, after the problem that caused filer failure has been cured, the failed filer is rebooted, returned to service, and file service requests are again routed to the rebooted filer. If there is a problem with the failed filer that prevents it from being rebooted, or there is a problem with other equipment to which the failed filer is connected that prevents the rebooted filer from going back online and handling file service requests, the filer remains offline until the other problems are repaired.
Accordingly, it would be advantageous to utilize the unused half of the NVRAM during a takeover operation to increase the efficiency of the WAFL file system by providing additional NVRAM space to store log entries processed by the file system.