The present invention pertains to the field of managing information shared among data storage resources distributed in a clustered information technology environment and more specifically to a method and system for handling failover recovery of data management of a shared disk file system used in such a loosely coupled node cluster.
Enterprises with large or networked computing environments often employ distributed file systems. In recent years, the need to store high-resolution images, scientific data, etc., has created a serious imbalance between data input/output (I/O) and storage system performance and functionality. Thus the performance and capacity of current mass storage systems must improve by orders of magnitude.
To provide cost-effective access to data in such storage-intensive computing environments, mass storage systems must be integrated with the underlying distributed file systems. Coupling mass storage systems with these file systems thus provides a seamless view of the file system.
The ever-increasing demand for data storage capacity entails costs for managing the distributed storage system that are significantly higher than the costs of the storage itself. Thus there is an ongoing need for intelligent and efficient storage management by way of a Data Management (DM) application.
The DM application migrates the data between a fast on-line storage of limited storage capacity and a tertiary storage archive. In addition, it provides on-line semantics for all the data stored in the tertiary archive, i.e. the users do not need to perform any administrative operations in order to access the data. Moreover, the DM application recognizes any access to the archived data and automatically transfers the data to the user. For that reason, monitoring facilities must be provided so that the DM application can be notified when a user attempts to read a block of data from a data file.
The concept described above, namely freeing local storage space by migrating data to a remote storage device, is commonly known as Hierarchical Storage Management (HSM). The storage management is transparent to the user, i.e. the user still sees the data as if they were local.
In a file-based HSM, the DM application generates so-called “stub files” as placeholders which keep only the file attributes. When a stub file, or the correspondingly punched disk region(s), is accessed, the data of the file (or of the disk region) is recalled from the remote storage device. Typically, HSM is installed on a file server storing a large amount of rarely accessed data (e.g. archived weather maps, video presentations).
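The stub-file behaviour described above can be illustrated with a minimal in-memory sketch. It is not a DM application and does not use any real DMApi call; the names `FileEntry`, `ToyHsm`, `migrate` and `read` are hypothetical and chosen only for this illustration, and the "archive" is simply a dictionary standing in for the remote storage device.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FileEntry:
    """A file on the fast on-line storage (hypothetical model)."""
    attributes: Dict[str, int]
    data: Optional[bytes]  # None means only the stub (attributes) remains

    @property
    def is_stub(self) -> bool:
        return self.data is None


@dataclass
class ToyHsm:
    """In-memory stand-in for an HSM; all names are illustrative assumptions."""
    online: Dict[str, FileEntry] = field(default_factory=dict)
    archive: Dict[str, bytes] = field(default_factory=dict)

    def write(self, name: str, data: bytes) -> None:
        """Store a file with its data on the on-line storage."""
        self.online[name] = FileEntry({"size": len(data)}, data)

    def migrate(self, name: str) -> None:
        """Copy the data to the archive and leave a stub keeping the attributes."""
        entry = self.online[name]
        if entry.is_stub:
            return  # already migrated
        self.archive[name] = entry.data
        entry.data = None

    def read(self, name: str) -> bytes:
        """Return the file data, recalling it from the archive if it is a stub."""
        entry = self.online[name]
        if entry.is_stub:
            entry.data = self.archive[name]  # transparent recall
        return entry.data
```

In this sketch a caller of `read` cannot tell whether the data was local or had to be recalled, which is the transparency property attributed to HSM above.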
It is further known that the availability of such a distributed mass storage system, i.e. the availability of the combination of the underlying data storage devices and DM application(s), can be improved if a part of the storage system can take over the services of a failed storage system, which is usually designated as “failover”.
Furthermore, file systems are known which manage the sharing of disks across multiple host machines, such as the General Parallel File System (GPFS) running on AIX SP (UNIX-based Scalable Power Parallel Computer) developed and sold by the present applicant. In order to allow DM applications to be developed much like ordinary software applications, a Data Management Application Interface (DMApi), specified by the Data Management Interfaces Group (DMIG) consortium, has been proposed which is implemented by the file system and used by a Data Management (DM) application to perform the following functions:
Hierarchical Storage Management (HSM)
Data backup and restore
The DMApi is targeted to provide an environment which is suitable for implementing robust, commercial-grade DM applications. In a shared disk environment the DMApi can particularly include facilities for DM application crash recovery and stateful control of the file system objects.
In a cluster of loosely coupled computer nodes, which is particularly addressed by the present invention, each node comprises a DM application providing storage management support. This support requires so-called “DMApi events”, which can be synchronous or asynchronous. DMApi events are mechanisms that allow a DM application to be notified whenever certain operations occur in the underlying operating system implemented on a certain node of the cluster. By these mechanisms DMApi sessions can be taken over by another node which generates a single point of failure. The DMApi sessions are the primary communication channels between a DM application and a kernel component of the DMApi implemented in the underlying operating system.
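The difference between synchronous and asynchronous events delivered over a session can be sketched as follows. This is a toy model under stated assumptions, not the DMApi itself: `ToySession`, `ToyKernel`, `READ_EVENT`, `register`, `read` and `notify_async` are hypothetical names, and the real interface defines its own event set and calls.

```python
from typing import Callable, Dict, List

# Hypothetical event name; the real DMApi defines its own event set.
READ_EVENT = "read"

# A handler receives the file path and returns True to let the operation proceed.
Handler = Callable[[str], bool]


class ToySession:
    """Stand-in for a session: the channel through which a DM application
    receives events from the kernel component."""

    def __init__(self, owner_node: str) -> None:
        self.owner_node = owner_node
        self.handlers: Dict[str, Handler] = {}
        self.async_queue: List[str] = []

    def register(self, event: str, handler: Handler) -> None:
        """Ask to be notified when the given event occurs."""
        self.handlers[event] = handler


class ToyKernel:
    """Stand-in for the kernel component that generates events."""

    def __init__(self, session: ToySession) -> None:
        self.session = session

    def read(self, path: str) -> str:
        """Synchronous event: the read waits for the handler's answer."""
        handler = self.session.handlers.get(READ_EVENT)
        if handler is not None and not handler(path):
            raise OSError(f"read of {path} rejected by DM application")
        return f"data of {path}"

    def notify_async(self, message: str) -> None:
        """Asynchronous event: queued for the session, the caller does not wait."""
        self.session.async_queue.append(message)
```

In the sketch the session object records which node owns it (`owner_node`), which hints at why the fate of a session matters when that node fails: the kernel component has no other channel to the DM application.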
In a classic one-node/computer environment, file system services would end in case of a system failure. In a cluster environment it is most likely that a single node failure does not affect other (independent) nodes within the system. If the DM application resides on the failed node, access to stubbed files becomes unavailable, which potentially interrupts running processes on active cluster nodes. Therefore it is desirable to migrate the DM application to an active cluster node, recovering the HSM functionality, in order to leave other cluster nodes unaffected by the initial node failure.