A file server is a computer that provides file service relating to the organization of information on storage devices, such as disks. The file server or filer includes a storage operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on the disks. Each “on-disk” file may be implemented as a set of data structures, e.g., disk blocks, configured to store information. A directory, conversely, may be implemented as a specially formatted file in which information about other files and directories is stored.
A filer may be further configured to operate according to a client/server model of information delivery to thereby allow many clients to access files stored on a server. In this model, the client may comprise an application, such as a database application, executing on a computer that connects to the filer over a computer network. This computer network could be a point-to-point link, a shared local area network (LAN), a wide area network (WAN), or a virtual private network (VPN) implemented over a public network such as the Internet. Each client may request the services of the file system on the filer by issuing file system protocol messages (typically in the form of packets) to the filer over the network.
The disk storage is typically implemented as one or more storage “volumes,” each comprising a cluster of physical storage disks and defining an overall logical arrangement of storage space. Currently available filer implementations can serve a large number of discrete volumes (150 or more, for example). Each volume is generally associated with its own file system. The disks within a volume/file system are typically organized as one or more groups of Redundant Array of Independent (or Inexpensive) Disks (RAID). RAID implementations enhance the reliability and integrity of data storage through the redundant writing of data stripes across a given number of physical disks in the RAID group, and the appropriate caching of parity information with respect to the striped data. The redundant information enables recovery of data lost when a storage device fails.
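As an illustration of how redundant parity information can enable recovery of lost data, the following is a minimal sketch that assumes simple byte-wise XOR parity over a single stripe, as is common in RAID 4 and RAID 5 arrangements. The text above does not specify a particular parity scheme, so the XOR choice, the three-disk layout, and the names `xor_blocks`, `stripe`, `parity`, and `rebuilt` are illustrative assumptions only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe written across three hypothetical data disks (assumed layout).
stripe = [b"AAAA", b"BBBB", b"CCCC"]

# Parity block for the stripe, held on a separate (assumed) parity disk.
parity = xor_blocks(stripe)

# Simulate the failure of the second data disk and rebuild its block
# from the surviving data blocks plus the parity block.
survivors = [stripe[0], stripe[2]]
rebuilt = xor_blocks(survivors + [parity])
```

Because XOR is its own inverse, combining the surviving blocks with the parity block yields the missing block, which is why a single failed disk in such a group need not result in data loss.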
In the operation of a storage system comprising, for example, a number of filers, disk shelves, switches and other routing and networking devices, it is possible that a device will fail or suffer an error condition. A principal goal of a high-performance storage system is to ensure that data read/write operations can be performed even when a component of a storage network has failed. In one common implementation of a storage system, the physical disks used to store data may be connected to the file server by a Fibre Channel connection. Fibre Channel is a series of protocols defining a transport mechanism for high-speed data access. Fibre Channel is a collection of different specifications, which are defined in a variety of documents published by the American National Standards Institute. These various Fibre Channel standards are available from the Fibre Channel Industry Association of San Francisco, Calif. Specifically, disks may be interconnected with a computer through a Fibre Channel Arbitrated Loop architecture. This architecture is defined in Fibre Channel Arbitrated Loop (FC-AL-2), published by the American National Standards Institute, which is hereby incorporated by reference. With the use of Fibre Channel switches and other networking devices, an overall switching fabric of interconnected switches, disks and file servers can be provided. Many Fibre Channel disks employ dual connectors, labeled A and B. Through the use of the dual connectors, the disk can support connections through two discrete data paths. Typically, this dual connection is used to provide a redundant second data path in the event of a failure of a first path. Note that by “data path” or “path” it is herein meant generally a connection from a file server to a storage device through various interconnections such as switches, disk shelves or other disks.
The Fibre Channel transport mechanism is a token-ring protocol. By “token-ring protocol” it is meant generally that each node in a Fibre Channel switching network participates in each data transaction at least to the point of buffering and retransmitting the data. This arrangement can be disadvantageous in certain circumstances. For example, should any node in a Fibre Channel network fail, the ring is broken and data will not reach its intended destination. Additionally, errors or failures in the physical cabling can result in a break of the ring with its associated loss of data delivery. These breaks in the Fibre Channel network can result in data failing to reach its destination and, in a file server environment, data loss or corruption.
In known multi-path systems utilizing file servers and a plurality of data paths to and from disks, a low-level device driver operates to effectuate the multi-path operation of the disks. This can be accomplished, for example, by using a static routing table identifying the multiple paths from a file server to a given disk device. However, a noted disadvantage of known multi-path operations is that upper level services of the operating system are not exposed to, and do not have access to, such routing information. Such upper level services generally include higher layers of an operating system above a disk driver or a routing layer, for example, a disk storage layer, a file system layer and a user interface or maintenance layer. It should be noted that the term “upper level services” should not be construed to include only these named storage operating system layers, but to include any other layers or processes executing on a computer that implements the teachings of this invention. Such upper level services can fail in the event of a path failure. The failure can result from the service remaining unaware of the existence or use of multiple data paths to a given storage device. Such upper level services can therefore fail even though the lower level routing or disk driver layers are still capable of delivering data and input/output operations to a given storage device.
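The static routing table approach described above can be sketched roughly as follows. This is a hypothetical illustration only, not the implementation of any particular driver: the table contents, the path labels `path_A` and `path_B`, and the names `ROUTES`, `PathFailed`, and `select_path` are all assumptions introduced for the example.

```python
class PathFailed(Exception):
    """Raised when no working path to a disk remains (hypothetical error type)."""


# Hypothetical static routing table kept inside the low-level driver:
# each disk maps to an ordered list of candidate data paths.
ROUTES = {
    "disk0": ["path_A", "path_B"],
    "disk1": ["path_B", "path_A"],
}


def select_path(disk, failed_paths, routes=ROUTES):
    """Return the first path to `disk` that is not marked as failed."""
    for path in routes[disk]:
        if path not in failed_paths:
            return path
    raise PathFailed(f"no working path to {disk}")


# With no failures, the preferred path is used.
primary = select_path("disk0", failed_paths=set())

# After path_A fails, the driver falls back to path_B.
fallback = select_path("disk0", failed_paths={"path_A"})
```

In this sketch the table and the failover decision live entirely in the driver-level function. A caller representing an upper level service would see only whether an operation succeeded and would have no view of which paths exist or which one is in use, which is the limitation noted above.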