A Storage Area Network (SAN) is a dedicated, specialized network for transporting data at high speeds between a plurality of disks and a plurality of computers or nodes, termed a cluster. A typical SAN only allows each node to see its own zone or subsection of the SAN, which may comprise one or more volumes of the SAN. Each volume is a set of disks configured to appear as a single disk. A volume of the SAN may not be accessed concurrently by more than one node, as this leads to corruption of data in the volume. Thus, while a conventional SAN consolidates storage resources into a single array and offers advantages over previous approaches using directly attached storage (DAS), it does not provide data sharing mechanisms between storage volumes in the SAN.
However, there exists an increasing need for data generated on one node to be accessible by another node. Network file systems, such as NFS, offer a means to achieve such a transfer. Such systems involve the retrieval of data from a first volume of the SAN by a first node, transmission of the data over a local area network (LAN) such as Ethernet or Gigabit Ethernet to a second node, and storage of the data in a second volume of the SAN by the second node. Such a transfer path for data leads to duplication of data in the SAN and thus wasted disk space, and causes increased load or traffic on the LAN. Such transfers further require a significant amount of time, particularly in the case of data intensive operations shared over a plurality of computers, such as post production of film, television or advertisements, satellite stream acquisition, media broadcasts, and meteorological applications. Indeed, the LAN can be overloaded by such transfers and become a bottleneck to the data transfer, and thus further delay the tasks to be performed by each node in respect of the data being transferred. Other techniques include CIFS and FTP transfers, which suffer from similar disadvantages.
A shared file system is a concept that allows many computers to access a single file system and treat it as if it were local. Attempts have been made to implement an effective shared file system; however, most such attempts to date have been limited to very specific storage architectures, homogeneous computer architecture, and the same operating system on all computers. Thus, such solutions necessitate a significant capital outlay to ensure such hardware and architecture requirements are met, and do not allow use of heterogeneous equipment which may already be in place.
In very recent times, a shared file system interoperable with heterogeneous hardware and operating systems has been developed by Silicon Graphics, Inc. under the name CXFS. The CXFS system, set out in U.S. patent application Publication No. US2003/0078946, the contents of which are incorporated herein by reference, is able to accommodate all major operating systems such as SGI™ IRIX, Linux™, Microsoft™ Windows™, Apple™ Mac OS™ X, Sun Microsystems Solaris™, and IBM™ AIX™. CXFS allows direct access to the SAN from all the connected nodes and maintains data coherency by leasing out tokens for various actions. For instance, read/write tokens exist for access to individual files and tokens exist for allocating new disk block extents. One of the nodes serves as a CXFS metadata server for each file system and controls granting and replication of tokens. Relocation and recovery of metadata servers are supported in CXFS, should the metadata server node become disconnected, with or without warning.
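By way of illustration only, the token-based coherency described above can be sketched as follows. This is a minimal sketch and not the CXFS implementation: it assumes a simplified policy in which any number of nodes may hold a read token for a file while a write token is exclusive, and the names TokenServer, acquire and release are hypothetical and do not appear in CXFS.

```python
class TokenServer:
    """Hypothetical metadata-server model: grants per-file read/write tokens.

    Assumed policy (not taken from CXFS): many readers OR one writer per file.
    """

    def __init__(self):
        self._readers = {}  # file path -> set of node ids holding a read token
        self._writer = {}   # file path -> node id holding the write token

    def acquire(self, node, path, mode):
        """Return True if the token is granted, False if it is refused."""
        readers = self._readers.setdefault(path, set())
        writer = self._writer.get(path)
        if mode == "read":
            # A read token is refused while another node holds the write token.
            if writer is not None and writer != node:
                return False
            readers.add(node)
            return True
        if mode == "write":
            # A write token is refused while any other node reads or writes.
            if writer not in (None, node) or (readers - {node}):
                return False
            self._writer[path] = node
            return True
        raise ValueError("mode must be 'read' or 'write'")

    def release(self, node, path):
        """Return every token the node holds for the given file."""
        self._readers.get(path, set()).discard(node)
        if self._writer.get(path) == node:
            del self._writer[path]


server = TokenServer()
server.acquire("node_a", "/vol/clip.mov", "read")   # granted
server.acquire("node_b", "/vol/clip.mov", "write")  # refused while node_a reads
```

In this sketch a node performs its reads and writes directly against the SAN once a token is granted, so that only the small token messages, and not the file data, pass through the serving node.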
Further, reliable data rate access to storage is needed by many applications, such as broadcast, multicast and editing of digital media files, and sensor data collection and processing. Many ways of providing guaranteed rate data access have been proposed and implemented, including the Guaranteed Rate I/O (GRIO) disk bandwidth scheduler, available from Silicon Graphics, Inc. (SGI) of Mountain View, Calif. In conjunction with the XLV disk volume manager, also available from SGI, guaranteed disk bandwidth reservations are provided by GRIO at the local client level, to DAS. Bandwidth reservations can be attached to individual files or entire file systems and can be shared between processes. The local DAS must be configured appropriately to support GRIO. If the data rate required by an application is greater than can be provided by a single disk, the disk must be in a volume with the data striped across several disks or staggered to multiple disks so that different processes can access different disks independently.
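The striping requirement noted above can be made concrete with a short calculation. The following is a sketch under stated assumptions, not a description of XLV: it assumes that aggregate bandwidth scales linearly with the number of disks and uses simple round-robin striping, and the function names and figures are hypothetical.

```python
import math


def disks_needed(required_rate, per_disk_rate):
    """Smallest disk count whose combined rate meets the required rate.

    Assumes ideal linear scaling; real volumes lose some bandwidth to
    seeks, bus contention and controller overhead.
    """
    if required_rate <= 0 or per_disk_rate <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(required_rate / per_disk_rate)


def stripe_location(block, n_disks, stripe_unit):
    """Map a logical block to (disk index, block offset on that disk).

    Round-robin striping: consecutive stripe units go to consecutive disks.
    """
    stripe_number = block // stripe_unit
    disk = stripe_number % n_disks
    offset = (stripe_number // n_disks) * stripe_unit + block % stripe_unit
    return disk, offset


# Example: a 100 MB/s stream on disks that each sustain 30 MB/s.
n = disks_needed(100, 30)             # 4 disks
first = stripe_location(0, n, 2)      # (0, 0)
later = stripe_location(9, n, 2)      # (0, 3)
```

Because consecutive stripe units fall on different disks, a sequential read is served by all disks of the volume in turn, which is what allows the volume to exceed the rate of any single disk.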
GRIO is an integral part of the local I/O system in IRIX (SGI's version of UNIX) to ensure that guaranteed rate access can be delivered. GRIO uses a frame-based disk block scheduler without reordering requests and maintains a database of the different pieces of hardware in the system and their performance characteristics. When a request for a bandwidth reservation is received from a process executing on the local client node, determinations of available bandwidth are made for components along the entire physical I/O path, starting with the I/O adapter accessed by multiple processors and ending with the local data storage, and including the storage devices, SCSI and Fibre Channel buses, system interconnects and bridges. The total of the reservations for all processes at each component along the path is kept below the total available bandwidth for that component. If this level would be exceeded, the GRIO daemon denies the request. Excess capacity may be used for overband consumption by a process provided the remaining reservations will not be adversely affected during the period of the overband request.
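The admission test described above, in which a reservation is granted only if every component along the I/O path retains sufficient bandwidth, can be sketched as follows. This is an illustrative sketch only and not the GRIO daemon: the Component class, the function names and the bandwidth figures are assumptions introduced for the example, and overband consumption is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """One element of the I/O path (adapter, bus, disk); hypothetical model."""
    name: str
    capacity: float        # total available bandwidth, e.g. in MB/s
    reserved: float = 0.0  # sum of the reservations already granted


def request_reservation(path, rate):
    """Grant `rate` on every component of `path`, or deny it on all of them."""
    # Check the whole path first so a denial leaves no partial reservation.
    if any(c.reserved + rate > c.capacity for c in path):
        return False
    for c in path:
        c.reserved += rate
    return True


def release_reservation(path, rate):
    """Return previously granted bandwidth to each component of `path`."""
    for c in path:
        c.reserved = max(0.0, c.reserved - rate)


# Assumed figures: the disk volume is the narrowest component of the path.
path = [
    Component("io_adapter", 400.0),
    Component("fibre_channel_bus", 200.0),
    Component("disk_volume", 120.0),
]
granted = request_reservation(path, 80.0)  # True: fits on every component
denied = request_reservation(path, 50.0)   # False: disk_volume would reach 130
```

The narrowest component on the path determines whether a request is admitted, which is why the determination is made for every component rather than for the storage device alone.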
To date, GRIO is available only for individual client nodes accessing directly attached storage (DAS), and no known client software solutions provide guaranteed rate access to a shared file system shared by a cluster of nodes via a storage area network (SAN). The closest known solution is to copy files stored on a SAN to local storage at a particular node and use GRIO to control synchronization of accesses to the files in local storage. This technique is adequate for some uses, such as non-linear editing, but is less than desirable for large-scale on-demand multicasting of video files, for example, due to the large amount of extra local storage that would be required and would not be needed if guaranteed rate access to the resources of the SAN were supported.
There are several benefits of SANs that are not obtained by the solution described above. Fault tolerance for accesses to the data is one of the primary benefits of a SAN. In addition, load balancing and enabling heterogeneous client access to the same physical storage are also benefits that can be obtained by a shared file system using a SAN.
Unless otherwise specified, the term GRIO in the following makes reference to an improved system for guaranteed rate I/O access for processes to a file storage, and not to the guaranteed rate I/O solution discussed in the preceding paragraphs.