The present invention is generally directed to a method for controlling access to data in a shared disk parallel file data processing system. More particularly, the invention is directed to a method which prevents system shutdown in a two-node quorum based system which would otherwise occur as a result of a communications failure between nodes which prevents coordinated data access.
Shared disk file systems allow concurrent shared access to data contained on disks attached by storage area networks (SANs). SANs provide physical level access to the data on the disks to a number of systems. The shared disks are either split into partitions, which provide a shared pool of physical storage without common access, or, with the aid of a shared disk file system or database manager, are used to provide coherent access to all the data from all of the systems on the SAN. IBM's GPFS (General Parallel File System) is a file system which manages a pool of disks and disk partitions across a number of systems, allowing high speed direct access from any system, and which provides aggregate performance across a single file system which exceeds that available from any file system managed from a single system. The present invention addresses one aspect of bringing such multi-system power to bear on file system operation.
In the GPFS shared disk file system each node (each with one or more processors) has independent access to the disk, and data and metadata consistency is maintained through the use of a distributed lock manager (or token manager). This requires that all participating nodes be capable of communicating and participating in a locking protocol. A node which is not capable of participating in a locking protocol must not access data, and there must be mechanisms for other nodes to reclaim control of metadata being modified at nodes which have failed or which have lost the capability of internode communication. GPFS provides such capabilities in clusters of three or more nodes using a quorum mechanism. There is a desire for the capability of sharing data between two nodes if the two nodes satisfy certain operational requirements. It is also desirable that one node be able to continue operation in the event of the failure of the other or in the event that network communications between the two nodes are lost. For more background information see “Parallel File System and Method for Independent Metadata Logging” (U.S. Pat. No. 6,021,508 issued Feb. 1, 2000).
The concept of a quorum of nodes is part of the existing GPFS recovery model that avoids multiple instances of the token manager handing out tokens for the same objects or making conflicting locking decisions. GPFS currently requires a quorum of nodes (usually, one plus half of the number of nodes in the GPFS nodeset) to be active as a member of a group before any data access operations can be honored. This requirement guarantees that a valid single token management domain exists for each GPFS file system. Prior to the existence of a quorum, most requests are rejected with a message indicating that quorum does not exist. If an existing quorum is lost, GPFS exits all nodes to protect the integrity of the data.
In a two-node system, the multi-node quorum requirement is two, meaning that both participating nodes must be members of the group before any GPFS file system operation is honored. In order to relax this requirement so that operations are allowed when a single node is available, GPFS provides support for single-node quorum operation in a two-node nodeset. The main issue of single-node quorum operation in a two-node nodeset is the assurance that there is only one lock manager (i.e., only one token management domain) for the shared disk file system, so that data consistency and integrity are protected.
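The multi-node quorum rule described above can be illustrated with a minimal sketch. The function names `quorum_size` and `has_quorum` are hypothetical and are not part of GPFS; the sketch assumes the usual rule of one plus half of the number of nodes, with the half taken by integer division.

```python
def quorum_size(nodeset_size):
    """Nodes that must be active under the usual rule: one plus half the nodeset.

    Assumption: "half" is rounded down (integer division).
    """
    if nodeset_size < 1:
        raise ValueError("a nodeset must contain at least one node")
    return nodeset_size // 2 + 1


def has_quorum(active_nodes, nodeset_size):
    """True if enough nodes are active group members to honor data access."""
    return active_nodes >= quorum_size(nodeset_size)


# In a two-node nodeset the multi-node requirement is two, so a lone
# surviving node does not hold quorum under this rule.
print(quorum_size(2), has_quorum(1, 2), has_quorum(2, 2))
```

Under this rule a three-node nodeset tolerates the loss of one node, whereas a two-node nodeset tolerates none, which is the limitation that single-node quorum operation is meant to relax.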
A simple way of doing this is through the creation of a third (tie breaker) node which referees situations where one node appears to be down. This is easy to implement but then a quorum requires both nodes or one node plus the tie breaker node. It does not solve a true two-node nodeset problem where there is not a third node available.
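The tie breaker arrangement can be sketched as a majority vote among three voters. The names `node_a`, `node_b`, `tiebreaker`, and `group_may_proceed` are hypothetical and chosen only for illustration; this is a sketch of the general idea, not of any particular implementation.

```python
# Hypothetical voter names: two data nodes plus a third referee node.
VOTERS = frozenset({"node_a", "node_b", "tiebreaker"})


def group_may_proceed(members):
    """True if the group holds a majority of the three voters.

    A majority of three is two, so a valid group is either both data
    nodes or one data node plus the tie breaker.
    """
    votes = len(set(members) & VOTERS)
    return votes >= len(VOTERS) // 2 + 1


print(group_may_proceed({"node_a", "tiebreaker"}))  # one node plus tie breaker
print(group_may_proceed({"node_a"}))                # isolated node
```

Because any two voters out of three overlap with any other two, at most one group can proceed at a time. The sketch also makes the drawback plain: the scheme depends on a third voter existing at all.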
The solution described herein modifies existing quorum behavior for three-node or greater nodesets to support single-node quorum behavior in a two node nodeset. It meets the desired objective of allowing either node to fail while still permitting the other node to continue accessing data in the file system.
There are only two methods in use for meeting the need to share data. One is a quorum of some type, similar to the basic GPFS design. The other is an approach in which one node is designated as “privileged” and any group which contains this node can continue. This second method creates either a single point of failure for the entire cluster of nodes which shares the data, or a requirement for manual intervention to move the privileged node and to keep track of the movement in some highly available storage. The present invention avoids all of these problems.