The present invention pertains to file servers and more particularly to disk configurations of file servers.
A file server is a computer that provides file service relating to the organization of information on storage devices, such as disks. The file server, or filer, includes a storage operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on the disks. Each “on-disk” file may be implemented as a set of disk blocks configured to store information, such as text, whereas the directory may be implemented as a specially formatted file in which information about other files and directories is stored. A filer may be configured to operate according to a client/server model of information delivery to thereby allow many clients to access files stored on a server, e.g., the filer. In this model, the client may comprise an application, such as a file system protocol, executing on a computer that “connects” to the filer over a computer network, such as a point-to-point link, shared local area network (LAN), wide area network (WAN), or virtual private network (VPN) implemented over a public network such as the Internet. Each client may request the services of the filer by issuing file system protocol messages (in the form of packets) to the filer over the network.
As used herein, the term “storage operating system” generally refers to the computer-executable code operable on a storage system that manages data access and client access requests and may implement file system semantics in implementations involving filers. In this sense, the Data ONTAP™ storage operating system, available from Network Appliance, Inc. of Sunnyvale, Calif., which implements a Write Anywhere File Layout (WAFL™) file system, is an example of such a storage operating system implemented as a microkernel. The storage operating system can also be implemented as an application program operating over a general-purpose operating system, such as UNIX® or Windows NT®, or as a general-purpose operating system with configurable functionality, which is configured for storage applications as described herein.
Disk storage is typically implemented as one or more storage “volumes” that comprise physical storage disks, defining an overall logical arrangement of storage space. Currently available filer implementations can serve a large number of discrete volumes (150 or more, for example). Each volume is associated with its own file system and, for purposes hereof, volume and file system shall generally be used synonymously. The disks within a volume are typically organized as one or more groups of Redundant Array of Independent (or Inexpensive) Disks (RAID). RAID implementations enhance the reliability/integrity of data storage through the redundant writing of data “stripes” across a given number of physical disks in the RAID group, and the appropriate caching of parity information with respect to the striped data. As described herein, a volume typically comprises at least one data disk and one associated parity disk (or possibly data/parity partitions in a single disk) arranged according to a RAID 4, or equivalent high-reliability, implementation.
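By way of illustration only, the relationship between a data stripe and its parity in a RAID 4 style arrangement can be sketched as follows. The sketch assumes byte-wise XOR parity held on a dedicated parity disk; the function names `parity` and `rebuild` and the sample blocks are hypothetical and are not taken from any particular storage operating system.

```python
from functools import reduce


def parity(blocks):
    """XOR equal-length blocks byte by byte to form the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity_block):
    """Recover the single missing data block from the survivors plus parity."""
    return parity(list(surviving_blocks) + [parity_block])


# One block per data disk in the stripe (illustrative contents only).
stripe = [b"abcd", b"efgh", b"ijkl"]
p = parity(stripe)  # would be written to the dedicated parity disk

# Simulate losing the second data disk and reconstructing its block.
recovered = rebuild([stripe[0], stripe[2]], p)
```

Because XOR is its own inverse, combining the surviving data blocks with the parity block yields the missing block, which is why a single disk failure in the group is tolerable.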
File servers can be configured in a variety of ways, including, for example, as a regular file server with multiple data paths to disks, as a file server in a cluster of file servers, or as a file server providing mirrored service to disks. In each of these configurations, the disk configuration, or actual physical cabling of the disks, needs to be validated against certain standards and rules. As used herein, “disk configuration” should be taken generally to mean the actual physical cabling of disks, disk shelves and file servers in a given file server implementation. Improper disk configurations can result in data corruption or data loss if a file server uses an improperly configured disk for I/O operations. Disks that are improperly cabled may function for a period of time, but can cause data corruption and/or other errors within a file server.
In known file server implementations, the disk configuration is verified once at file server boot time. However, any disks that were added to the disk configuration, or any disks whose wiring was changed, after filer boot time are typically not reverified for a proper configuration. A noted disadvantage of this implementation is that an improper disk configuration could result from disks being added or reconnected after the filer boot time. This misconfiguration could cause data corruption on the misconfigured disk drives. Additionally, when the file server is eventually rebooted, the reinitialization often fails since the file server is unable to verify the disk configuration. As the reinitialization failure can occur weeks or months after the change in configuration, there is no readily apparent cause-and-effect relationship with the configuration change. This lack of discernible cause and effect between configuration and failure hampers system administrators or users in detecting and correcting the misconfigured disk drives.
Another known implementation to verify the disk configuration of a file server involves scanning all disk drives once at boot time, and then periodically looping through all of the possible disk drives to verify continued configuration. A noted disadvantage of this technique is that disks can be moved or added between scanning loops. When a misconfiguration occurs between scanning loops, the file server cannot detect the misconfiguration until the next scanning loop. Data corruption or loss may result from the use of misconfigured disks in the interim. Scanning loops also degrade system performance due to increased processing overhead. By extending the time between scanning loops, improved system performance is achieved, but at a risk of data corruption. Similarly, by decreasing the time between loops, the risk of data corruption is lessened, but at a significant cost in performance. Additionally, this methodology typically scans all possible disk drive slots, whether utilized by the file server or not. This results in a large processing overhead cost regardless of how few slots are actually occupied by disks.
The disadvantages of the prior art are overcome by providing a system and method for verifying the disk configuration of a given computer, e.g., a file server, by performing a real-time check of the configuration after each event that modifies the configuration of the disks. This check involves only disks that are actually connected, not empty slots.
The disk configuration verification functionality, or layer, of the storage operating system receives event notifications from the low-level disk driver. These event notifications alert the disk configuration verification layer to changes in the topology of disks connected to a particular file server. The disk configuration verification layer compares the disk's configuration and topology to a set of rules defined for a given file server configuration. If the actual disk configuration can result in data loss or corruption in a given file server implementation, the disk configuration verification layer halts the file server and/or issues a warning to the user or administrator. A halt occurs if the configuration can lead to data loss or corruption. Additionally, warnings are issued if the configuration does not lead to data loss or corruption but is not an optimal configuration.
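By way of illustration only, the event-driven check described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular storage operating system: the `ConfigVerifier` class, the event kinds, the adapter names, and both rules (`rule_known_adapters` and `rule_dual_path`) are hypothetical and stand in for whatever rule set a given file server configuration defines.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    OK = 0
    WARN = 1   # suboptimal, but no risk of data loss or corruption
    HALT = 2   # could lead to data loss or corruption


@dataclass(frozen=True)
class Disk:
    name: str
    adapters: frozenset  # adapters through which the disk is visible


def rule_known_adapters(disks, owned):
    """Hypothetical rule: a disk cabled to an adapter this server does not own is unsafe."""
    bad = sorted(d.name for d in disks.values() if not d.adapters <= owned)
    if bad:
        return Severity.HALT, "disks on foreign adapters: " + ", ".join(bad)
    return Severity.OK, ""


def rule_dual_path(disks, owned):
    """Hypothetical rule: a single-path disk is safe but not optimal."""
    single = sorted(d.name for d in disks.values() if len(d.adapters) < 2)
    if single:
        return Severity.WARN, "single-path disks: " + ", ".join(single)
    return Severity.OK, ""


class ConfigVerifier:
    def __init__(self, owned_adapters, rules):
        self.owned = frozenset(owned_adapters)
        self.rules = list(rules)
        self.disks = {}  # only connected disks are tracked; empty slots never appear

    def on_event(self, kind, disk):
        """Called by the (hypothetical) disk driver on every topology change."""
        if kind == "removed":
            self.disks.pop(disk.name, None)
        else:  # "added" or "recabled"
            self.disks[disk.name] = disk
        return self.verify()

    def verify(self):
        """Apply every rule and report the worst severity with its messages."""
        findings = [rule(self.disks, self.owned) for rule in self.rules]
        findings = [(s, m) for s, m in findings if s is not Severity.OK]
        worst = max((s for s, _ in findings), default=Severity.OK)
        return worst, [m for _, m in findings]


verifier = ConfigVerifier({"0a", "0b"}, [rule_known_adapters, rule_dual_path])
severity, messages = verifier.on_event("added", Disk("d1", frozenset({"0a"})))
```

In this sketch a caller would halt the file server on `Severity.HALT` and merely notify the administrator on `Severity.WARN`. Because verification runs only when the driver reports a change, and only over disks that are actually connected, no work is done between events or for empty slots.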