1. Field of the Invention
The present invention relates to a data access management system for managing access to a file on a shared disk in a distributed processing system in which each of a plurality of computers in a network environment is capable of directly controlling the shared disk.
2. Related Background Art
A distributed processing system in which processes are distributed to a plurality of computers is provided for a UNIX system. This type of distributed processing system must keep a consistency (matching characteristic) of the data to be processed by the respective computers across the whole distributed processing system, and hence it is undesirable that two or more copies of the same data exist within the distributed processing system. Accordingly, a disk storing data that exist in a unique location within the distributed processing system is required to be accessible from each of the computers. Further, with the spread of the Internet and the trend toward an information-oriented society over recent years, it is required that the computers share information with each other.
For this purpose, there is actualized a shared file system for sharing one single disk or a plurality of disks (which will hereinafter be referred to as a “shared disk”) physically or logically distributed between the respective computers within the distributed processing system. For example, the plurality of computers are connected to each other via a network and further connected to one single shared disk via SAN (Storage Area Network), thereby actualizing the shared file system in such a form that the data within the shared disk are accessed directly from the computers not via the network but via SAN.
It is required that this shared file system be constructed so that one piece of data is visible at the same time from the plurality of computers. On the other hand, while a certain computer is updating a certain piece of data, this piece of data must be locked against (made inaccessible to) other computers. Such control, which inhibits other computers from referring to data to which one computer is executing a writing process, is known as “exclusive access control”.
Several methods for exclusive access control have hitherto been actualized. Hereinafter, one of them is briefly explained. According to this exclusive access control method, an access-oriented subsystem, which receives a data access request from an application and undertakes the access to the shared file system, is executed on each of the plurality of computers connected to each other via the network, and a management-oriented subsystem, which gives the access-oriented subsystem on each computer the authority for access, is executed on one specified computer among the plurality of computers. Then, the access-oriented subsystem, upon receiving the data access request from the application on a certain computer, inquires of the management-oriented subsystem whether the data are accessible. The management-oriented subsystem receiving this inquiry distinguishes the type of the requested data access. Then, in case the requested data access is categorized as data reading, it issues a read-only token indicating the authority for reading the objective data to the inquiring access-oriented subsystem, as far as the access-oriented subsystems of other computers are not executing a writing process to the same data. On the contrary, in case the requested data access is categorized as data writing, the management-oriented subsystem issues a write-only token indicating the authority for writing the objective data to the inquiring access-oriented subsystem, as far as the access-oriented subsystems of other computers are not executing a reading or writing process to the same data. With this contrivance, the exclusive access control for inhibiting other computers from accessing the data being updated by the one computer is actualized.
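The token issuing rule described above may be sketched as follows. This is a minimal illustration only, not the implementation of any actual system; the class name TokenManager, the method names request and release, and the use of client and block identifiers are all hypothetical and introduced solely for the example.

```python
from collections import defaultdict


class TokenManager:
    """Hypothetical stand-in for the management-oriented subsystem."""

    def __init__(self):
        # block id -> ids of clients currently holding a read-only token
        self.readers = defaultdict(set)
        # block id -> id of the client currently holding the write-only token
        self.writer = {}

    def request(self, client, block, mode):
        """Return True if a token is issued, False if the request is refused."""
        if mode == "read":
            # Reading is permitted unless another client is writing the block.
            if block in self.writer and self.writer[block] != client:
                return False
            self.readers[block].add(client)
            return True
        if mode == "write":
            # Writing is permitted only if no other client reads or writes the block.
            other_readers = self.readers[block] - {client}
            other_writer = block in self.writer and self.writer[block] != client
            if other_readers or other_writer:
                return False
            self.writer[block] = client
            return True
        raise ValueError(f"unknown access mode: {mode!r}")

    def release(self, client, block):
        """Give back whatever tokens the client holds for the block."""
        self.readers[block].discard(client)
        if self.writer.get(block) == client:
            del self.writer[block]
```

In this sketch, any number of clients may hold read-only tokens for the same block at once, whereas a write-only token is refused while any other client holds a token of either kind for that block.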
The exclusive access control system described above must be constructed so that only one write-only token which can be issued by the management-oriented subsystem exists for every block in a storage area in the shared disk, to which each piece of data is allocated. Accordingly, the management-oriented subsystem saves this write-only token in advance in access control data, then fetches the write-only token from the access control data in response to a request from the access-oriented subsystem, and issues it to the requesting access-oriented subsystem. Further, the management-oriented subsystem, each time the access-oriented subsystem writes the data to the shared disk, records the data writing as log data in the shared disk, and updates at a predetermined timing, based on the log data, management data (which will hereinafter be referred to as “metadata”) recorded on the shared disk in order to manage the respective pieces of data as files.
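The relationship between the log data and the metadata described above may be illustrated by the following sketch, in which each write is first recorded as a log record and the metadata are brought up to date only when the log is later applied. The class name MetadataJournal, its method names, and the representation of the metadata as a mapping from file names to block numbers are assumptions made for illustration; an in-memory list stands in for the log data on the shared disk.

```python
class MetadataJournal:
    """Hypothetical model of log data and deferred metadata update."""

    def __init__(self):
        self.log = []       # pending log records, oldest first
        self.metadata = {}  # file name -> list of block numbers

    def record_write(self, file_name, block):
        # Called on every data write: only the log is touched here.
        self.log.append((file_name, block))

    def apply_log(self):
        # Called at a predetermined timing: replay the log into the metadata.
        applied = 0
        for file_name, block in self.log:
            blocks = self.metadata.setdefault(file_name, [])
            if block not in blocks:
                blocks.append(block)
            applied += 1
        self.log.clear()
        return applied
```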
On the other hand, Japanese Patent Application No. 11-143502 describes an exclusive access control system that makes it unnecessary for each access-oriented subsystem to inquire of the management-oriented subsystem about a data writing target block on the shared disk, by previously transferring the management of a part of the storage area in the shared disk from the management-oriented subsystem to the access-oriented subsystem, in order to enhance the execution performance of the distributed processing system as a whole. According to the exclusive access control system described in the above Patent Application, with respect to a storage area (which will hereinafter be termed a “reserve area”) whose management has been transferred from the management-oriented subsystem, each access-oriented subsystem obtains more of the write-only tokens from the management-oriented subsystem and may save these tokens in its own access control data. Then, each access-oriented subsystem, based on the write-only tokens in its own access control data, allocates the blocks in the reserve area managed by itself to the data requested to be written into the shared disk by the application. Hence, there is no necessity for the access-oriented subsystem to inquire of the management-oriented subsystem about the accessibility of data each time data are updated.
The metadata about the blocks in the reserve area which have thus been allocated to data by the access-oriented subsystem are updated within this access-oriented subsystem, and the management-oriented subsystem is notified of the updated metadata at a proper timing. The management-oriented subsystem having received this notification updates the metadata held by itself for managing the whole shared disk on the basis of the notified metadata, and records this updated content in the log data within the shared disk. Note that the management-oriented subsystem issues the read-only token with respect to a storage area whose management has been transferred to any one of the access-oriented subsystems, in response to a data reading request given from another access-oriented subsystem.
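The reserve area scheme described above may be sketched as follows. The sketch assumes a simplified model in which a reserve area is a list of block numbers handed over once, and in which a direct method call stands in for the notification; the class names Manager and AccessNode and all method names are hypothetical and do not come from the cited Patent Application.

```python
class Manager:
    """Hypothetical management-oriented subsystem."""

    def __init__(self, total_blocks):
        self.free = list(range(total_blocks))  # blocks not yet handed out
        self.metadata = {}                     # file name -> list of blocks

    def reserve(self, count):
        # Transfer management of `count` blocks to the requesting node.
        granted, self.free = self.free[:count], self.free[count:]
        return granted

    def notify(self, updates):
        # Merge metadata reported by an access-oriented subsystem.
        for name, blocks in updates.items():
            self.metadata.setdefault(name, []).extend(blocks)


class AccessNode:
    """Hypothetical access-oriented subsystem owning a reserve area."""

    def __init__(self, manager, reserve_size):
        self.manager = manager
        self.reserve = manager.reserve(reserve_size)
        self.pending = {}  # locally updated metadata not yet reported

    def write(self, name):
        # Allocate a block locally, without inquiring of the manager.
        if not self.reserve:
            raise RuntimeError("reserve area exhausted")
        block = self.reserve.pop(0)
        self.pending.setdefault(name, []).append(block)
        return block

    def flush(self):
        # Report the locally updated metadata at a proper timing.
        self.manager.notify(self.pending)
        self.pending = {}
```

In this model the manager is contacted only when the reserve area is obtained and when the metadata are flushed, so individual writes involve no inquiry.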
In the distributed processing system adopting the exclusive access control method described above, if the management-oriented subsystem falls into a process-down or if the computer falls into a node-down, the access-oriented subsystem becomes incapable of continuing the process. It follows that the whole distributed processing system comes to a system-down, with an abnormal halt of the higher-order subsystem or application program that requests the data access from the access-oriented subsystem.
The system-down of the whole distributed processing system must be avoided if it is a system in which data are always accessed from all over the world, e.g., that of an Internet service provider.
In the conventional distributed processing system, however, if the management-oriented subsystem fell into the process-down, the system could not be immediately restored. This is because, if the management-oriented subsystem falls into the process-down, the access control data retained by this management-oriented subsystem are lost. Therefore, the management-oriented subsystem cannot know which access-oriented subsystem is writing the data to the shared disk, so that it cannot make any access-oriented subsystem resume the data writing.