1. Field of the Invention
The present invention relates to a buffer management method making use of database reference characteristics in response to a database query request and, more particularly, to a buffer management method and system of a program having a function to manage the data input/output of an external storage device operating on a virtual memory system, to execute the data input/output efficiently by making use of a main memory and an expanded memory.
2. Description of the Prior Art
In a database management system, input/output of the database stored in the external memory is generally managed by a buffer management function that uses an input/output (I/O) buffer holding the data on the main memory.
In the prior art, the buffers of this I/O buffer are chained and managed by the LRU (i.e., Least Recently Used) method or the like. According to this method, however, deciding whether or not a newly required data page exists in the I/O buffer requires a serial search of the whole chain of the I/O buffer, which takes a long time.
In this connection, the buffer searching time can be minimized by using the hashing method, as disclosed in Japanese Patent Laid-Open No. 57-169983, for example.
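The difference between a serial chain search and a hashed search can be illustrated by the following minimal sketch. It is not the method of the cited publication; the class name `LRUBufferPool` and the `read_page` callback are hypothetical names introduced only for illustration. The sketch keeps the pages in LRU order but locates a page through a hash table, so that no walk along the chain is needed.

```python
from collections import OrderedDict


class LRUBufferPool:
    """Illustrative LRU buffer pool whose page lookup is hashed (assumed design)."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page    # hypothetical callback: page id -> page data
        self.frames = OrderedDict()   # hash table keyed by page id, kept in LRU order
        self.reads = 0                # number of inputs from external storage

    def get(self, page_id):
        if page_id in self.frames:            # hashed lookup, no serial chain search
            self.frames.move_to_end(page_id)  # mark the page as most recently used
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)   # replace the least recently used page
        self.reads += 1
        data = self.read_page(page_id)
        self.frames[page_id] = data
        return data


pool = LRUBufferPool(2, lambda page_id: f"page-{page_id}")
for page_id in (1, 2, 1, 3):
    pool.get(page_id)
```

After the four references above, page 2 has been replaced, pages 1 and 3 remain in the pool, and three inputs have been counted.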
In a relational database management system, which is one kind of the aforementioned database management system, on the other hand, the database is constructed of relations, each of which is seen by the user as a two-dimensional table and is composed of a plurality of tuples (or records). Each relation is stored in a plurality of data pages, a data page being the I/O unit of the database management system and having a physically fixed length. In the relational database management system, generally speaking, the data are stored in the external memory at random locations having no relation to this table expression, so that the data in the external storage device have to be transferred efficiently to the main memory. In database processing, moreover, the number of I/O operations is increased if the aforementioned LRU method is used to manage the I/O buffer in case processings of the sequentially accessing type and the randomly accessing type are mixed. More specifically, in case the data are to be accessed sequentially, block inputting (i.e., inputting of plural pages at a time) is customary for the data input from the database. In case, on the other hand, all the data in the relation are updated, block outputting (i.e., outputting of plural pages at a time) is more efficient than inputting/outputting in units of one page, as is well known in the art. In the database systems of the prior art, however, inputting/outputting in units of one page and in units of plural pages are not yet managed in a mixed manner in one I/O buffer pool.
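The advantage of block inputting for sequential access can be made concrete by a small count of I/O requests. The following sketch is an illustration only; the function name `count_requests` and the page counts are assumptions, and the cost of a request is taken to be independent of the number of pages it transfers.

```python
def count_requests(num_pages, pages_per_request):
    """Count the I/O requests needed to scan num_pages pages sequentially."""
    requests = 0
    next_page = 0
    while next_page < num_pages:
        next_page += pages_per_request  # one request transfers a block of pages
        requests += 1
    return requests


single_page = count_requests(1000, 1)  # inputting in units of one page
block = count_requests(1000, 8)        # block inputting of eight pages
```

Under these assumptions a scan of 1000 pages needs 1000 requests in units of one page, but only 125 requests when eight pages are inputted per request.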
In the relational database management system thus far described, the user need not determine the procedures for accessing the database, because the inputted queries are analysed to generate the internal processing procedures. The access characteristics to the database can therefore be grasped explicitly within the database management system.
The I/O buffer management method making use of the aforementioned access characteristics is exemplified in the prior art by the buffer management method based upon the QLSM (i.e., Query Locality Set Model) method. According to this method, the shared buffer is divided for each query request, and each division is managed in accordance with a replacement algorithm matching the access characteristics of said query.
In the buffer management method based on this QLSM method, when the internal processing procedures are determined by the optimizing processing module, which determines the method of accessing the database by analysing the query, the shared buffer is divided into divided buffers (each of which will be called a "locality set") of a proper size (which will be called the "locality set size"), so that each locality set may be managed by the replacement algorithm matching the page reference characteristics of the query to the database.
Incidentally, for the buffer management method based on the aforementioned QLSM, reference should be made to "An Evaluation of Buffer Management Strategies for Relational Database Systems" by Hong-Tai Chou et al., presented at the 11th International Conference on VLDB.
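The effect of dividing the buffer for each query can be illustrated by the following simplified sketch. It is not the algorithm of the cited paper: each locality set is simply managed by LRU, and the class `LocalitySet`, the two query names, the reference trace and the sizes 3 and 1 are all assumptions chosen for illustration. One query repeatedly refers to three pages while another scans a relation sequentially, and four buffers are available in total.

```python
from collections import OrderedDict


class LocalitySet:
    """Illustrative group of buffers managed by LRU (assumed, simplified)."""

    def __init__(self, size):
        self.size = size
        self.pages = OrderedDict()  # resident page ids in LRU order
        self.reads = 0              # inputs from external storage

    def reference(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)
            return
        if len(self.pages) >= self.size:
            self.pages.popitem(last=False)  # replace the least recently used page
        self.pages[page_id] = None
        self.reads += 1


def make_trace(steps):
    """Interleave a query re-referencing pages 0..2 with a sequential scan."""
    trace = []
    for i in range(steps):
        trace.append(("lookup", i % 3))
        trace.append(("scan", 100 + i))
    return trace


def total_reads(trace, buffers_by_query):
    for query, page_id in trace:
        buffers_by_query[query].reference(page_id)
    unique = {id(b): b for b in buffers_by_query.values()}  # count a shared set once
    return sum(b.reads for b in unique.values())


trace = make_trace(12)
shared = LocalitySet(4)  # one shared LRU buffer of four pages
shared_reads = total_reads(trace, {"lookup": shared, "scan": shared})
divided_reads = total_reads(
    trace, {"lookup": LocalitySet(3), "scan": LocalitySet(1)}
)
```

With this trace the shared LRU buffer inputs a page on every one of the 24 references, because the scanned pages keep sweeping the re-referenced pages out, whereas the divided buffers need only 15 inputs with the same total of four buffers.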
Next, the aforementioned database management system operates on a virtual space according to the virtual memory method, i.e., the memory management method of the operating system. In short, the I/O buffer of the database management system is secured on the virtual memory space, so that it is subject to paging by the operating system at all times. For the I/O buffer of the database management system, therefore, the paging management module of the operating system may execute I/O processings to the paging file on the external storage device in response to a page fault interruption. In other words, while the database management system is executing the buffer management of the I/O buffer, the operating system is also executing the paging management for the I/O buffer of the database management system. In short, double buffering is executed.
Even though the I/O buffer management module of the database management system takes care to keep the necessary data in the I/O buffer, the paging management module of the operating system may page the data of the I/O buffer out of the main memory into the paging file on the external storage device. In this case, when the data of the I/O buffer are referred to, the operating system assigns a region of the main memory and inputs (i.e., pages in) the paged-out page from the paging file on the external storage device into the assigned region. Two I/O operations are thus executed, which adversely affects the performance of the database management system.
The I/O buffer can be excluded from the target of the paging if the database management system performs page fixing for the region of the I/O buffer, because the operating system has a function to fix a region of the virtual memory space on the main memory (which will be called the "page fixing"). However, this presupposes that the size of the region of the I/O buffer does not exceed the capacity of the main memory.
On the other hand, one method of alleviating the paging problems is to provide an expanded memory. This expanded memory is directly connected to the main memory and is composed of the same semiconductor memory elements as those of the main memory, so that it can logically expand the main memory by transferring data in units of a physically fixed page, thereby compensating for the shortage of the capacity.
(a) In the prior art, as has been described hereinbefore, if the I/O buffer of the database management system runs out of empty buffers while a plurality of users are issuing I/O requests to the I/O buffer, a data I/O request from another user encounters a buffer shortage if the page to be accessed is not in the I/O buffer. As a result, that I/O request is left waiting, or a page in a buffer that should be retained is forcibly replaced by the replacement algorithm.
(b) In the buffer management method based upon the QLSM method, on the other hand, the locality set size of each locality set is computed by an optimizing processing module or the like, and buffers for the locality set size are secured from the I/O buffer. If, at this time, the buffers for the locality set size cannot be secured because of a buffer shortage, either the query execution is delayed by establishing a standby status till the securing becomes possible, or the number of I/O operations is increased by reducing the locality set size to the securable buffer size. Especially in the relational database management system, in the case of a join operation (i.e., the joining operation of two or more relations) in a query, a locality set is generated for each relation and its locality set size is computed. Since, however, a plurality of data pages of the relation to be joined may be referred to repeatedly, a locality set of a size suited to those references has to be secured. It is difficult, however, for the optimizing processing module or the like to compute the locality set size accurately, and it is not practical for the computed locality set to require a large number of buffers. Thus, the I/O processings cannot be made sufficiently efficient.
(c) For the buffer management method based on the QLSM method, on the other hand, the following countermeasure is used. If it is grasped that some query has the access characteristic of accessing the relation sequentially, the management is accomplished by setting the locality set size at 1 and by using the SB (i.e., Single Buffer) method as the replacement algorithm of the locality set. In other words, no consideration has been given to a pre-read processing method in which a plurality of buffers are prepared in advance and inputting is accomplished in units of as many pages as there are prepared buffers.
(d) In case block updating is accomplished with sequential reference to the relation, as has been described hereinbefore, when a newly requested page is absent from the buffer, the updated pages may be written out to the external storage device by the replacement algorithm before the requested page is read in. In order to input the new page in this case, an I/O processing that would otherwise be unnecessary may take place and degrade the performance.
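The extra output described in item (d) can be counted with the following minimal sketch. It assumes an LRU-managed buffer in which every referenced page is updated; the function name `sequential_update` and the sizes used are hypothetical and serve only as an illustration.

```python
from collections import OrderedDict


def sequential_update(num_pages, capacity):
    """Update pages 0..num_pages-1 in order through an LRU buffer.

    Returns (reads, forced_writes), where forced_writes counts the updated
    pages that had to be written out before a requested page could be read.
    """
    buffer = OrderedDict()  # page id -> updated (dirty) flag, in LRU order
    reads = 0
    forced_writes = 0
    for page_id in range(num_pages):
        if len(buffer) >= capacity:
            _, dirty = buffer.popitem(last=False)  # page chosen for replacement
            if dirty:
                forced_writes += 1  # output precedes the input of the new page
        reads += 1
        buffer[page_id] = True      # the page is updated after being read
    return reads, forced_writes


reads, forced_writes = sequential_update(100, 10)
```

For 100 pages and 10 buffers, every input after the first 10 is preceded by the output of one updated page, which gives 100 inputs and 90 such outputs, each in units of one page.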
(e) In the database management system, moreover, there are cases in which database processing is executed in the course of transferring the data from the external storage device to the main memory. When a database machine having a procedure to transfer the processed result to the main memory is utilized, the aforementioned block inputting is performed by the database machine, but no consideration is given to the control of inputting the processed result into the I/O buffer. Nor is any consideration given to the control of stopping the use of the database machine and switching to the data I/O method of the database management system of the prior art.
(f) Next, the database management system has, as an internal processing, a processing for generating an intermediate file by the sort merging method. This intermediate file generating processing is accomplished on the I/O buffer. First of all, the following procedures are repeated till all the data have been read out: the data are read from the database on the external storage device into the I/O buffer and are processed; and one page of the resulting data is written into a buffer of the I/O buffer. The data of the intermediate file thus generated may be outputted to the intermediate result storing file on the external storage device in accordance with the replacement algorithm of the I/O buffer. Next, this intermediate file is re-read, so that many I/O processings take place between the I/O buffer and the external storage device till the sort merging processing is ended. Although this intermediate file is not shared, it occupies the I/O buffer, so that a page inputted by the query processing of another user is swept away by the page replacement algorithm. When such a page is referred to again and a buffer holding the intermediate result is selected as the inputting buffer by the page replacement, an unnecessary I/O processing occurs in that the intermediate result in the buffer has to be outputted to the external storage device before the page can be inputted. Thus, there arises a problem of degraded performance.
(g) Moreover, in managing the hysteresis information (i.e., the log used to recover the database), the database management system outputs it to a hysteresis information outputting buffer and manages it there. When the buffer becomes fully occupied by the hysteresis information, the information is outputted to the hysteresis information file on the external storage device. If the database management system then fails during its operation, the database is recovered up to a certain point in time on the basis of the hysteresis information. At this time, an input processing from the hysteresis information file on the external storage device is needed to acquire the hysteresis information. Thus, there arises a problem that the time period for recovering the database is prolonged.
(h) On the other hand, the database management system operates on the virtual memory space, and the I/O buffer to be managed by the database management system also exists on the virtual memory space. What maps this virtual memory space to the main memory is the virtual memory management of the operating system, i.e., the paging management. If the data are not on the main memory when the database management system accesses the data in the I/O buffer, the data saved in the paging file on the external storage device have to be inputted to the main memory and restored. Conversely, a page of the main memory holding data of the I/O buffer may be paged out by the paging management, that is, outputted to and saved in the paging file on the external storage device. Since the database management system gives no consideration to this storage hierarchy, there arises a problem that unnecessary I/O processings are caused by the paging management of the operating system. This problem can be solved by page-fixing the whole I/O buffer of the database management system on the main memory. If, however, the I/O buffer is large, the pageable region of the main memory is reduced. Thus, there arises a problem that paging and swapping take place frequently and degrade the performance of the whole computer system.