The present invention relates to an improvement in performance and reliability of a file system, and particularly to a file data multiplexing method adapted for a high-speed file system in a data processing system to which a large number of disk drive units are connected.
A first conventional technique for file data multiplexing is described in Jim Gray, "Disk Shadowing", Proceedings of the 14th VLDB Conference, 1988, pp. 331-338.
The first conventional technique discloses a method in which mirroring is performed by magnetic disk drive units. In basic mirroring, two disk drive units form a pair that holds the same data. Although multiplexing using three or more disk drive units is specifically called "shadowing" in the aforementioned literature, such multiplexing using three or more disk drive units is now called "mirroring". The operation of mirroring will be described below. At the time of writing data, the same data is written to the same address in each of a plurality of disk drive units, whereby multiplexing is performed. At the time of reading the data, the data is read from one of the disk drive units.
Accordingly, because a plurality of reading operations can be carried out in parallel, unlike the case where data is stored in a single disk drive unit, I/O throughput, which is the number of I/O processings per unit time, is improved. Further, because the disk drive unit with the shortest seek time is selected for reading the data, the I/O time required for "read" can be shortened. The "write" performance as to I/O throughput and I/O time is lowered because of the overhead of starting I/O processings for a plurality of disk drive units and the overhead of waiting for completion of writing in all the disk drive units. The average performance is nevertheless improved compared with the case of a single disk drive unit, even where the number of executed "read" operations is equal to the number of executed "write" operations, because the lowering of the "write" performance is sufficiently small compared with the improvement of the "read" performance.
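The write-to-all, read-from-nearest behavior described above can be illustrated with a minimal sketch. It is not taken from the cited literature: the names `Disk` and `MirrorSet` are hypothetical, the disks are simulated in memory, and seek time is approximated by the distance between the current head position and the requested address, which is an assumption made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Hypothetical in-memory stand-in for one disk drive unit."""
    head: int = 0                                # current head position
    blocks: dict = field(default_factory=dict)   # address -> data
    failed: bool = False

    def write(self, addr, data):
        self.blocks[addr] = data
        self.head = addr

    def read(self, addr):
        self.head = addr
        return self.blocks[addr]


class MirrorSet:
    """Write to every member; read from the member with the shortest seek."""

    def __init__(self, n_disks):
        self.disks = [Disk() for _ in range(n_disks)]

    def write(self, addr, data):
        # The same data goes to the same address on every working disk.
        for disk in self.disks:
            if not disk.failed:
                disk.write(addr, data)

    def read(self, addr):
        live = [d for d in self.disks if not d.failed]
        if not live:
            raise IOError("all mirror members have failed")
        # Assumption: seek time grows with |head - addr|.
        nearest = min(live, key=lambda d: abs(d.head - addr))
        return nearest.read(addr)
```

In this sketch a read still succeeds after one member is marked as failed, which corresponds to the reliability and availability advantage of mirroring.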
Another advantage of mirroring is in an improvement both in reliability and in availability. Even in the case where a failure occurs in one disk drive unit, copied data stored in another disk drive unit can be accessed because a plurality of copies of the same data are provided. Accordingly, data is not lost. The designation of mirroring is decided by a system manager and set through a system command given from a console by the system manager. Although the mirroring is, in most cases, realized by a combination of hardware such as a disk controller and software for controlling the disk controller, there is an example in which the mirroring is realized by an operating system (OS).
A second conventional technique, in which data multiplexing is performed by magnetic disk areas, is described in "Guide to OSF/1: A Technical Synopsis" (1991), pp. 7-1 to 7-10, O'Reilly & Associates, Inc.
In the second conventional technique, mirroring is performed not by disk drive units but by disk areas on the basis of the concept of a "logical volume". One disk is divided into fixed-length areas called "extents", and one logical extent corresponds to one physical extent or to a plurality of physical extents. Any suitable physical disk drive unit may be selected for the physical extent to be related to the logical extent. Further, physical extents of different physical disk drive units may be related to one logical extent. One virtual disk drive unit formed by a collection of logical extents is called a "logical volume". The case where one logical extent corresponds to a plurality of physical extents is mirroring. The operation of mirroring will be described below. At the time of writing data into a logical extent, the same data is written at the same relative address within each of the plurality of physical extents corresponding to that logical extent, whereby data multiplexing is performed. At the time of reading the data, the data is read from one of the physical extents.
Accordingly, partial mirroring of a disk drive unit can be provided by means of "logical volume". The designation of mirroring is decided by a system manager and set through a system command given from a console by the system manager in the same manner as in the first conventional technique. The mirroring using "logical volume" has not only the advantage that improvement in performance, reliability and availability can be attained in the same manner as in the first conventional technique but also another advantage that the disk capacity required for multiplexing can be reduced because only the important area of the disk drive unit can be subjected to multiplexing. In the second conventional technique, mirroring is realized by an OS.
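The relation between logical extents and physical extents can be sketched as follows. This is an illustrative model only, not the OSF/1 implementation: the class `LogicalVolume`, the method `map_extent`, and the extent size of four blocks are all hypothetical, and each disk is represented by an in-memory dictionary.

```python
EXTENT_SIZE = 4  # blocks per extent; an arbitrary value for illustration


class LogicalVolume:
    """Hypothetical logical volume built from extents of several disks."""

    def __init__(self, disks):
        # disks: dict mapping a disk name to a dict {block address: data}
        self.disks = disks
        # logical extent number -> list of (disk name, physical extent number)
        self.extent_map = {}

    def map_extent(self, logical, copies):
        # One copy means no mirroring; two or more copies mean mirroring.
        self.extent_map[logical] = list(copies)

    def _locate(self, block):
        # The relative address inside the extent is the same for every copy.
        logical, offset = divmod(block, EXTENT_SIZE)
        return [(disk, phys * EXTENT_SIZE + offset)
                for disk, phys in self.extent_map[logical]]

    def write(self, block, data):
        for disk, addr in self._locate(block):
            self.disks[disk][addr] = data

    def read(self, block):
        # Any copy would do; this sketch always reads the first one.
        disk, addr = self._locate(block)[0]
        return self.disks[disk][addr]
```

Mapping only the important logical extents to two physical extents, and the remainder to one, gives the partial mirroring described above, so that the additional disk capacity is spent only on the mirrored extents.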
A third conventional technique, in which data multiplexing is performed not by magnetic disk drive units but by files, is described in "Operating System Concepts, Third Edition", Addison-Wesley Publishing Company, 1991, pp. 507-508.
The data multiplexing method of the third conventional technique, in which data multiplexing is performed by files, is called "replication". Replication is used to improve the reliability and availability of a distributed system. A file called a "replica", which holds the same data as a master file, is stored in a magnetic disk drive unit of a data processing system different from the data processing system containing the master file. In the case where a failure occurs in the computer system in which the master file exists, an access-disabled state can be avoided by accessing the replica. At the time of writing data into files, the data is written not only in the master file but also in the corresponding replica or replicas, so that data multiplexing is performed. At the time of reading data, the data is read from the master file or an arbitrary replica. Issuing multiple "write" requests to the replicas causes large overhead in the distributed system. Therefore, as a consistency control method for reflecting updates of a master file in a replica, there is, in addition to the aforementioned high consistency method in which writing is performed simultaneously with the updating of the master file, a low consistency method in which updated data collected at predetermined time intervals are written in the replica. The designation of a replica is made through a system command given by a system manager, but a replica provided as a cache in a data processing system requesting access to a file is automatically generated by the OS at the time of the access request. This is called "cache replication".
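The difference between the high consistency method and the low consistency method can be sketched as follows. The class `ReplicatedFile` and its methods are hypothetical names chosen for illustration, the master file and the replicas are simulated as in-memory dictionaries rather than files on separate data processing systems, and the periodic timer that would call `flush` at the predetermined interval is assumed and not shown.

```python
class ReplicatedFile:
    """Hypothetical master file with replicas under two consistency modes."""

    def __init__(self, n_replicas, high_consistency=True):
        self.master = {}                              # offset -> data
        self.replicas = [{} for _ in range(n_replicas)]
        self.high_consistency = high_consistency
        self.pending = {}   # updates not yet propagated (low consistency)

    def write(self, offset, data):
        self.master[offset] = data
        if self.high_consistency:
            # High consistency: every replica is updated with the master.
            for replica in self.replicas:
                replica[offset] = data
        else:
            # Low consistency: collect the update for a later batch write.
            self.pending[offset] = data

    def flush(self):
        # Assumed to be invoked by a timer at the predetermined interval.
        for replica in self.replicas:
            replica.update(self.pending)
        self.pending.clear()

    def read(self, offset, replica=None):
        # Read from the master by default, or from the chosen replica.
        source = self.master if replica is None else self.replicas[replica]
        return source.get(offset)
```

Under the low consistency mode of this sketch, two writes to the same offset before a flush produce a single replica update, which illustrates how batching reduces the "write" overhead at the cost of replicas that are temporarily out of date.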