This application is based upon and claims the benefit of priority from the prior Japanese Patent Application No. 2000-053814, filed Feb. 29, 2000, the entire contents of which are incorporated herein by reference.
The present invention relates to a disk control mechanism capable of increasing a speed of a random write with respect to a disk apparatus (a disk memory apparatus) represented by a magnetic disk apparatus in a computer system.
In recent years, there is proposed a log structured file system (LFS) as described in, for example, Jpn. Pat. Appln. KOKAI Publication No. 11-53235 as a technique for increasing the speed of the random write with respect to a disk apparatus in a computer system.
The log structured file system (hereinafter referred to as LFS) rests on a property peculiar to the disk apparatus: a large block sequential write to the disk is extremely fast compared with a small block random write. The LFS therefore increases the speed of the disk write by converting small block random writes into a large block sequential write on the side of the disk control mechanism. Specifically, data to be written comprising a plurality of small blocks is collected irrespective of its original write position, and is recorded on the disk as a sequential log of one large block, with the result that the disk-write speed is increased.
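The collection step described above can be illustrated with a minimal sketch. It is not the implementation of the cited publication: the class name `LogWriter`, the parameter `blocks_per_log_write`, and the in-memory list standing in for the log are all hypothetical, chosen only to show small writes at scattered positions being emitted as a few large sequential writes.

```python
class LogWriter:
    """Collects small random writes and flushes them as one sequential log write.

    Hypothetical illustration; the 'disk' is a plain Python list.
    """

    def __init__(self, blocks_per_log_write):
        self.blocks_per_log_write = blocks_per_log_write
        self.pending = []            # (original_position, data) awaiting a flush
        self.log = []                # simulated log: blocks in the order written
        self.sequential_writes = 0   # number of large sequential writes issued

    def write(self, original_position, data):
        # The original position does not influence where the block lands in the log.
        self.pending.append((original_position, data))
        if len(self.pending) >= self.blocks_per_log_write:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        # One large sequential write covers every pending small block.
        self.log.extend(self.pending)
        self.pending = []
        self.sequential_writes += 1


writer = LogWriter(blocks_per_log_write=4)
for position in (90, 3, 57, 12, 41, 8, 77, 30):
    writer.write(position, f"d{position}".encode())
```

In this sketch, eight small writes to scattered original positions result in only two sequential writes to the log.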
When the LFS is applied, two kinds of correspondence information must be held. The first maps the position at which each small block of write data was originally supposed to be written, namely the original write position intended on the side of the computer (hereinafter referred to as the original position), to its position on the log (hereinafter referred to as the log position). The second is the reverse relationship, which maps the log position to the original position. In the following explanation, the former correspondence information is referred to as a forward index and the latter as a reverse index. The two are collectively referred to as the indices.
The indices are generally held in the computer. As a consequence, if the indices are corrupted or lost when the computer comes to a sudden halt because of trouble, the data on the disk becomes unreliable.
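As a minimal sketch of the two indices, the following uses two dictionaries. The function name `record_log_write` is hypothetical, and dropping the reverse entry of a superseded log copy is an assumption of this sketch rather than something stated in the text.

```python
forward_index = {}  # original position -> log position
reverse_index = {}  # log position -> original position


def record_log_write(original_position, log_position):
    """Register that the block for original_position now lives at log_position."""
    # Assumption of this sketch: an older log copy of the same original
    # position becomes stale, so its reverse entry is dropped.
    stale_log_position = forward_index.get(original_position)
    if stale_log_position is not None:
        del reverse_index[stale_log_position]
    forward_index[original_position] = log_position
    reverse_index[log_position] = original_position


# Four blocks are appended to the log; original position 3 is written twice.
for log_position, original_position in enumerate([90, 3, 57, 3]):
    record_log_write(original_position, log_position)
```

After these calls the forward index points original position 3 at log position 3, the newer copy, and the reverse index no longer contains log position 1.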
Therefore, in the conventional computer system to which the LFS is applied, the indices are held on a dedicated non-volatile memory, for example, an NVRAM (non-volatile random access memory) to provide endurance against damage.
As has been described above, the LFS as disclosed in Jpn. Pat. Appln. KOKAI Publication No. 11-53235 is suitable for realizing an increase in the speed on the random write access to the disk apparatus.
However, the LFS has the problems described below, and it is important to resolve them in practice.
A first problem is that performance is very likely to deteriorate severely for a large block sequential read. The cause is that, as the price of converting the small block random write into a large block sequential write, the large block sequential read is converted into small block random reads. In other words, the data may be arranged at random at the log position even in a region where the data is continuous at the original position.
A second problem is generated by the application of the LFS to a shared disk in the fail-over system. The fail-over system is a system in which a plurality of computers share the disk apparatus so that even when any computer is damaged, another computer can inherit the processing from the damaged computer. Such a system is referred to as a high availability (HA) system. In this fail-over system, when the primary computer comes to a sudden halt because of trouble, the secondary computer inherits the processing from the primary computer. At this time, the data is handed over to the secondary computer through the shared disk. However, in the LFS, since the index or the like is provided on the non-volatile memory (NVRAM), the data cannot be handed over through the shared disk. In other words, in order to hand over the data, the shared non-volatile memory (NVRAM) becomes necessary.
The present invention has been made in view of the above circumstances. An object of the present invention is to provide a disk control mechanism which, by rearranging data in consideration of the original position of the data, prevents the deterioration in the performance of the large block sequential read caused by the application of the LFS (log structured file system) for an increase in the speed of the random write.
Another object of the present invention is to provide a disk control mechanism which eliminates the need for a non-volatile memory and facilitates the inheritance of the indices in the fail-over system by effectively making use of the disk region for the preservation of the indices necessary for the increase in the speed of the random write.
According to a first aspect of the present invention, there is provided a disk control mechanism to which an LFS (log structured file system) is applied, wherein data designated by a plurality of disk write requests given from the host device is collected and continuously stored, in data block units having a predetermined size, in a region (a log region) which is secured on the disk apparatus separately from a region (an original region) which can be designated from the host device (namely, which can be seen from the host device), the mechanism being characterized by comprising rearrangement means for repeating an operation of rearranging the oldest effective data block on the log region at the position on the original region where the data block is supposed to be originally written.
In such a structure, the speed of random write access (a small block random write) is increased by the application of the LFS, while a reduction in the access performance of a large block sequential read is prevented. Even if the data blocks are arranged at random at the log position, the rearrangement by the rearrangement means returns them to the original region, so that a read request for a group of data blocks arranged in a continuous region of the original region can be served by reading the data blocks continuously from the original region.
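The rearrangement operation can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed mechanism itself: the regions are simulated with dictionaries, the names `log_write` and `rearrange_oldest` are hypothetical, and a forward index is used here only to tell an effective block from one superseded by a later write.

```python
from collections import OrderedDict

original_region = {}        # original position -> data (area visible to the host)
log_region = OrderedDict()  # log position -> (original position, data), oldest first
forward_index = {}          # original position -> log position of the newest copy


def log_write(log_position, original_position, data):
    """Append a block to the simulated log region."""
    log_region[log_position] = (original_position, data)
    forward_index[original_position] = log_position


def rearrange_oldest():
    """Move the oldest effective log block back to its original position.

    Returns True if a block was moved, False if no effective block remains.
    """
    while log_region:
        log_position, (original_position, data) = log_region.popitem(last=False)
        if forward_index.get(original_position) != log_position:
            continue  # superseded by a newer log write; only reclaim the slot
        original_region[original_position] = data
        del forward_index[original_position]
        return True
    return False


log_write(0, 10, b"A")
log_write(1, 11, b"B")
log_write(2, 10, b"A2")  # overwrites original position 10; log position 0 is stale

while rearrange_oldest():  # repeat the operation until the log is drained
    pass
```

Once the loop finishes, the blocks for original positions 10 and 11 sit side by side in the original region and can be read sequentially.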
According to a second aspect of the present invention, there is provided a disk control mechanism to which the LFS is applied, the mechanism being characterized by comprising:
write processing means for collecting data designated by a plurality of disk write requests given from the host device and continuously storing the data in the log region in data block units, the write processing means being provided with a function of adding, to the data blocks to be stored, a control block including a reverse index showing the position on the original region where each of the continuously stored data blocks is supposed to be originally written;
recovery processing means for reading, at the time of start-up, the reverse index from the control block on the log region and recovering from the reverse index a forward index in a forward index storage region secured on a volatile memory, the forward index indicating a correspondence relationship between the position of a data block on the log region and the position on the original region to which the data block is originally written; and
read processing means for judging, when a read request is given from the host device, in which of the log region and the original region the data block designated by the read request is stored by referring to the forward index storage region, and reading the data block from either the log region or the original region on the basis of the judgment result.
In such a structure, the speed of random write access (a small block random write) can be increased by the application of the LFS. In addition, although a volatile memory is used instead of a non-volatile memory such as an NVRAM to hold the forward index of each data block stored in the log region, the forward index can be recovered at the time of start-up on the basis of the reverse index in the control block, which is stored in the log region together with the sequence of data blocks. In other words, even when the information in the forward index storage region is lost because of a power shut-off, the information can be recovered at the time of start-up. Thus, breakage or loss of the forward index can be prevented, and endurance against trouble can be realized without using a non-volatile memory. Here, when the control block and the data block are set to the same size, the control block can be easily accessed.
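The recovery and read paths can be sketched as below. Several details are assumptions made only for illustration: the log region is modeled as a list of equally sized slots, each control block is assumed to precede the data blocks it describes, and the names `recover_forward_index` and `read_block` are hypothetical.

```python
def recover_forward_index(log_region):
    """Rebuild the forward index by scanning the control blocks in the log region.

    log_region is a list of slots, each ("control", [original positions]) or
    ("data", payload). Assumption: a control block comes first and is followed
    by one data block per reverse-index entry.
    """
    forward_index = {}
    slot = 0
    while slot < len(log_region):
        kind, reverse_index = log_region[slot]
        if kind != "control":
            raise ValueError(f"expected a control block at slot {slot}")
        for offset, original_position in enumerate(reverse_index, start=1):
            # A later log entry for the same original position overrides an earlier one.
            forward_index[original_position] = slot + offset
        slot += 1 + len(reverse_index)
    return forward_index


def read_block(original_position, forward_index, log_region, original_region):
    """Read from the log region if the forward index has an entry, else from the original region."""
    log_position = forward_index.get(original_position)
    if log_position is not None:
        return log_region[log_position][1]
    return original_region[original_position]


log_region = [
    ("control", [90, 3]), ("data", b"d90"), ("data", b"d3-old"),
    ("control", [3, 57]), ("data", b"d3-new"), ("data", b"d57"),
]
original_region = {3: b"stale", 12: b"d12"}

# Simulates start-up after the volatile forward index has been lost.
forward_index = recover_forward_index(log_region)
```

Here `read_block(3, ...)` is served from the log region, while `read_block(12, ...)` falls through to the original region because no forward index entry exists for it.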
According to a third aspect of the present invention, there is provided a disk control mechanism which is characterized by adding to the disk control mechanism according to a second aspect of the invention:
log region control means for controlling a log control region secured on the disk apparatus for preserving the reverse index of each of the data blocks stored in the log region, the means memorizing on a volatile memory the reverse index included in each control block stored in the log region after the previous check point, and, at each predetermined check point, preserving the memorized reverse index in the log control region from the position following the reverse index already preserved at the previous check point; and
recovery processing means which replaces the recovery processing means applied in the disk control mechanism according to the second aspect of the invention, the means having the following function:
the function of reading from the log control region the reverse index preserved up to the most recent check point and, at the same time, reading the reverse index from each control block stored in the log region after the check point, to recover the forward index in the forward index storage region on the basis of these reverse indices. Here, when the mechanism is constituted in such a manner that, in correspondence to the arrangement of the control block and the data blocks on the log region, a dummy of the reverse index (data showing the control block and having the same size as the reverse index) is preserved at the corresponding position on the log control region, the reverse index can be read at a high speed from the log control region.
In such a structure, the reverse index of each data block stored in the log region up to the most recent check point can be obtained directly from the log control region, and only the reverse index of each data block stored in the log region after the most recent check point needs to be recovered from the control blocks in the log region. The time required for the recovery of the forward index at the time of start-up can therefore be further shortened. Incidentally, it goes without saying that random write access can be increased in speed and, at the same time, endurance against trouble can be realized without using a non-volatile memory.
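The two-phase recovery can be sketched as follows. The layout is assumed for illustration only: the log control region is a list whose entry i holds the original position of the block at log slot i, with `None` standing in for the dummy entry of a control block slot, and each control block in the log region is assumed to precede its data blocks. The function name is hypothetical.

```python
def recover_forward_index(log_control_region, log_region):
    """Recover the forward index using the check point data first.

    Returns (forward_index, control_blocks_read) so the saving is visible.
    """
    forward_index = {}

    # Phase 1: entries preserved up to the most recent check point are read
    # directly from the log control region; dummies (None) are skipped.
    for slot, original_position in enumerate(log_control_region):
        if original_position is not None:
            forward_index[original_position] = slot

    # Phase 2: only the part of the log written after the check point is scanned.
    control_blocks_read = 0
    slot = len(log_control_region)
    while slot < len(log_region):
        kind, reverse_index = log_region[slot]
        if kind != "control":
            raise ValueError(f"expected a control block at slot {slot}")
        control_blocks_read += 1
        for offset, original_position in enumerate(reverse_index, start=1):
            forward_index[original_position] = slot + offset
        slot += 1 + len(reverse_index)

    return forward_index, control_blocks_read


log_region = [
    ("control", [90, 3]), ("data", b"d90"), ("data", b"d3-old"),
    ("control", [3, 57]), ("data", b"d3-new"), ("data", b"d57"),
]
# The check point was taken after the first three slots were written.
log_control_region = [None, 90, 3]

forward_index, control_blocks_read = recover_forward_index(log_control_region, log_region)
```

In this example only one of the two control blocks has to be read from the log region; the other is covered by the log control region.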
According to a fourth aspect of the present invention, there is provided a disk control mechanism which is characterized by adding to the disk control mechanism according to the first aspect of the present invention the write processing means, the recovery processing means, and the read processing means which are applied in the disk control mechanism according to the second aspect of the present invention, and, at the same time, by allowing the rearrangement means to be provided with the function of eliminating the forward index of the data block from the forward index storage region at the time of the rearrangement of the data block.
In such a structure, it becomes possible to obtain two effects: an effect which is obtained in the disk control mechanism according to the first aspect of the present invention, and an effect which is obtained with the disk control mechanism according to the second aspect of the present invention.
According to a fifth aspect of the present invention, there is provided a disk control mechanism which is characterized by adding to the disk control mechanism according to the first aspect of the present invention the write processing means, the log region control means, the recovery processing means, and the read processing means which are applied in the disk control mechanism according to the third aspect of the present invention, and, at the same time, by allowing the rearrangement means to be provided with the function of eliminating the forward index of the data block from the forward index storage region at the time of the rearrangement of the data block.
In such a structure, it becomes possible to obtain two effects: the effect which can be obtained with the disk control mechanism according to the first aspect of the present invention and the effect which can be obtained with the disk control mechanism according to the third aspect of the present invention.
According to a sixth aspect of the present invention, there is provided a disk control mechanism which is characterized by adding to the disk control mechanism according to any of the first, the fourth, or the fifth aspect of the invention mode setting means for setting the disk apparatus either to a log mode or to a non-log mode upon receipt of a transition setting instruction either to the log mode or to the non-log mode from the host device, the means setting the mode in such a manner that the data designated by the disk write request given from the host device is written in the log region with the addition of the control block in the log mode and the data is written as it is in the original region in the non-log mode.
In such a structure, the transition of the mode from the log mode to the non-log mode, or the transition from the non-log mode to the log mode, can be conducted while the data is preserved as it is. The non-log mode is suitable for a backup processing or the like in which the large block access is made. Accordingly, the log mode is set in the period (for example, business hours) when transaction processing involving the small block access (in particular, the small block random write) is frequently generated, and the non-log mode is set (the log mode is released) in the period (for example, outside business hours) when the backup processing is made, so that the disk access speed can be increased at all times.
Here, when the rearrangement means is provided with batch rearrangement means for rearranging all the effective data blocks on the log region at the positions on the original region where the data blocks are supposed to be originally written, the data reading by the read processing means is always conducted on the original region immediately after the transition to the non-log mode, with the result that the large block sequential read can be conducted at a high speed from the beginning.
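The mode transition combined with batch rearrangement can be sketched as follows. This is a simplified illustration, not the claimed structure: the class `DiskController` and its method names are hypothetical, the regions are simulated in memory, the control block is omitted, and triggering the batch rearrangement on the transition to the non-log mode is an assumption of the sketch.

```python
class DiskController:
    """Simulated controller with a log mode and a non-log mode (hypothetical names)."""

    def __init__(self):
        self.log_mode = True
        self.original_region = {}  # original position -> data
        self.log_region = []       # (original position, data), oldest first
        self.forward_index = {}    # original position -> index into log_region

    def write(self, original_position, data):
        if self.log_mode:
            # Log mode: append to the log region (control block omitted here).
            self.log_region.append((original_position, data))
            self.forward_index[original_position] = len(self.log_region) - 1
        else:
            # Non-log mode: write the data as it is to the original region.
            self.original_region[original_position] = data

    def read(self, original_position):
        log_position = self.forward_index.get(original_position)
        if log_position is not None:
            return self.log_region[log_position][1]
        return self.original_region[original_position]

    def rearrange_all(self):
        """Batch rearrangement: move every effective log block to its original position."""
        for log_position, (original_position, data) in enumerate(self.log_region):
            if self.forward_index.get(original_position) == log_position:
                self.original_region[original_position] = data
        self.log_region.clear()
        self.forward_index.clear()

    def set_mode(self, log_mode):
        if self.log_mode and not log_mode:
            self.rearrange_all()  # assumption: drain the log when leaving log mode
        self.log_mode = log_mode


controller = DiskController()
controller.write(5, b"a")
controller.write(6, b"b")
controller.write(5, b"c")   # supersedes the first write to position 5
controller.set_mode(False)  # e.g. before a backup outside business hours
controller.write(7, b"d")   # non-log mode: goes straight to the original region
```

After the transition every read is served from the original region, because the forward index is empty.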
According to a seventh aspect of the present invention, there is provided a disk control mechanism which is characterized by adding to the disk control mechanism according to any of the first, the fourth, or the fifth aspect of the invention mode setting means for setting the disk apparatus to the log mode or to the non-log mode upon receipt of the transition setting instruction to the log mode or the non-log mode from the host device, the means allocating, at the time of the transition to the non-log mode, at least the original region out of the original region, the log region and the log control region allocated in the log mode, and, at the time of the transition to the log mode, dividing the memory region including the original region allocated in the non-log mode to allocate a new original region for the log mode, the log region and the log control region, and, at the same time, by allowing the rearrangement means to be provided with batch rearrangement means.
In such a structure, the memory region of the disk apparatus can be effectively used, and the mechanism can also cope with an increase or a decrease in (the memory region of) the disk apparatus. However, since there may arise a case in which data must be discarded with the transition of the mode, such a structure is suitable for a system which is fixed to, and operated in, the mode after the transition, but is not suitable for a system in which the mode is frequently changed over.
Furthermore, a computer system can be constituted by using a plurality of computers each having a disk control mechanism according to either the third or the fifth aspect of the invention, wherein the disk apparatus is shared by the plurality of computers. In such a case, even when the computer which is being operated (the primary computer) is damaged or comes to a halt, the forward index is recovered by the recovery processing means of the disk control mechanism at the time of start-up in a different computer (a secondary computer) which inherits the processing of the primary computer, with the result that the indices can be easily inherited without using a non-volatile memory and a high availability system (a fail-over system) can be realized.
Incidentally, the present invention which is concerned with the disk control mechanism can be established as an invention which is concerned with a disk control method.
Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out hereinafter.