1. Field of the Invention
Apparatuses and methods consistent with the present invention relate to managing log information in a database management system, and more specifically, to managing log information to minimize the amount of log information on a data page that is newly allocated.
2. Description of the Related Art
The term “log” or “logging” in a database management system generally refers to separately storing data as it was before being changed, so that data changed by a process executed by an application program is prevented from being left with an abnormal value due to a system error or the carelessness of a user.
Logging enables the database management system to perform a cancellation or recovery process to return data processed by an application program that has been abnormally shut down to its original state.
FIG. 1 is a block diagram illustrating the configuration of a log information management system according to the related art.
Referring to FIG. 1, a transaction manager 110 records log information using a log manager 130 when transactions start, end, or are cancelled, and maintains information on the transactions currently being executed. A recovery manager 120 verifies log information through the log manager 130 when a specific transaction is cancelled and then returns changed data to its original state. In particular, when the log information management system is abnormally shut down, the recovery manager 120 removes the effects of unfinished transactions from a database and applies the modifications of finished transactions to the database, thereby keeping the data accurate. The log manager 130 merges log information on changes made to the database by other modules and sends the merged log information to a buffer 160 via a buffer manager 150. In addition, the log manager 130 reads out the log information required by the recovery manager 120 from a disk 180 and provides the read log information to the recovery manager 120.
An index/record/catalog manager 140 is a module for managing index, record, and catalog information items, which are the main data forming the log information management system. The index/record/catalog manager 140 requests the buffer manager 150 to load necessary data (index, record, and catalog information items) onto the buffer 160, and to read or change a necessary value. In log information management systems, most of the log information is generated by the index/record/catalog manager 140.
The buffer manager 150 manages the buffer 160 and loads pages having log information on the index, record, or catalog stored in the disk 180 onto the buffer 160 or stores the page loaded onto the buffer 160 in the disk 180, at the request of other modules. A storage manager 170 may perform a process of reading pages from the disk 180 or recording data pages on the disk 180.
The buffer 160 is a part of a memory, and is an exclusive space ensured by the log information management system. Log information is stored in the buffer 160, and FIG. 2 shows a log record, which is an example of the log information, according to the related art.
A log record 200 includes a log header 210, a previous data image 220, and an updated data image 230.
The log header 210 includes a log sequence number (LSN), transaction identification information, previous LSN information, page identification information, offset information, and data length information.
The LSN is information for identifying a corresponding log record, and the transaction identification information is identification information of the transaction that causes the change indicated by the corresponding log record. The previous LSN information is identification information of the log record that is generated, in the transaction indicated by the transaction identification information, immediately before the corresponding log record. The page identification information indicates the page on which the change described by the corresponding log record is performed. The offset information indicates the position, within the page identified by the page identification information, at which the change in data occurs. The data length information indicates the size of the changed data.
The previous data image 220 indicates a data value before the data is changed, and the updated data image 230 indicates a data value after the data is changed.
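The layout of the log record 200 described above can be illustrated with a minimal sketch. The class names, field names, and the helper log_update are hypothetical and chosen only for illustration; the related art does not prescribe any particular encoding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogHeader:
    lsn: int                 # identifies this log record
    transaction_id: int      # transaction that caused the change
    prev_lsn: Optional[int]  # previous record of the same transaction, if any
    page_id: int             # page on which the change was performed
    offset: int              # position of the change within the page
    length: int              # size of the changed data


@dataclass(frozen=True)
class LogRecord:
    header: LogHeader
    before_image: bytes      # data value before the change
    after_image: bytes       # data value after the change


def log_update(lsn, transaction_id, prev_lsn, page_id, page, offset, new_data):
    """Apply new_data to the page (a bytearray) and return the log record."""
    end = offset + len(new_data)
    before = bytes(page[offset:end])   # capture the previous data image
    page[offset:end] = new_data        # perform the change in place
    header = LogHeader(lsn, transaction_id, prev_lsn, page_id, offset, len(new_data))
    return LogRecord(header, before, bytes(new_data))
```

In this sketch a single update always carries both images, which is the property that later drives the log volume for newly allocated pages.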
Since the buffer manager 150 reads data from the disk 180 in units of pages, the buffer 160 is also divided and managed in units of pages.
The storage manager 170 reads a specific page from the disk 180 and loads the specific page onto the buffer 160, or records the specific page of the buffer 160 on the disk 180, at the request of the buffer manager 150. When a new page is requested, the storage manager 170 allocates a disk page that is not used at that time as a data storage space. When the existing disk page is not used as a data storage space any longer, the storage manager 170 manages the disk page as one of the empty disk pages. The storage manager 170 may further include a page usage management module that manages the usage of pages.
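The page usage management described above can be sketched as follows. This is only an illustration under assumptions: the class name PageUsage, the use of a queue for empty pages, and the allocation order are hypothetical and are not taken from the related art.

```python
from collections import deque


class PageUsage:
    """Tracks which disk pages are in use and which are empty."""

    def __init__(self, total_pages):
        self.empty = deque(range(total_pages))  # ids of unused disk pages
        self.used = set()                       # ids of pages holding data

    def allocate(self):
        """Hand out a disk page that is not currently used."""
        if not self.empty:
            raise RuntimeError("no empty disk page available")
        page_id = self.empty.popleft()
        self.used.add(page_id)
        return page_id

    def release(self, page_id):
        """Return a page that is no longer used to the empty pages."""
        if page_id not in self.used:
            raise ValueError(f"page {page_id} is not allocated")
        self.used.remove(page_id)
        self.empty.append(page_id)
```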
An application 190 may serve as a query processor or a query engine. The application 190 can notify the transaction manager 110 of the start, cancellation, or end of a transaction, and can read or change log information within the boundaries of a transaction, that is, between the start of the transaction and the cancellation or end thereof, through the index/record/catalog manager 140.
Next, the operation of the components shown in FIG. 1 will be described.
First, when the application 190 requests the transaction manager 110 to start a transaction, the transaction manager 110 generates a new transaction and keeps information on the generated transaction until the transaction is finished.
Then, when the application 190 requests the index/record/catalog manager 140 to update the data, the index/record/catalog manager 140 requests the buffer manager 150 to transmit a necessary page, stores the page in the buffer 160, and performs a necessary update process. Whenever each update process is performed, the index/record/catalog manager 140 creates log information on the data before the change and log information on the data after the change and transmits the log information to the log manager 130. The data update process will be described in detail below.
When it is determined that the data update process has been performed without any errors, the application 190 requests the transaction manager 110 to end a transaction. On the other hand, when it is determined that an error occurs in the data update process, the application 190 requests the transaction manager 110 to stop the transaction.
When the transaction ends, the transaction manager 110 instructs the log manager 130 to create log information (hereinafter referred to as “Commit_Log”) indicating the end of the transaction. Then, the transaction manager 110 requests the log manager 130 to record all log information items, including the Commit_Log information, on the disk 180. When all of the log information items having LSNs smaller than the LSN of Commit_Log are stored in the buffer 160, the log manager 130 requests the buffer manager 150 to record the log information items on a log file of the disk 180.
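The commit sequence described above can be sketched as follows, assuming a simplified log manager in which the buffer and the log file are plain lists. The names LogManager, append, flush_up_to, and commit are hypothetical; only the label "Commit_Log" comes from the description.

```python
class LogManager:
    def __init__(self):
        self.next_lsn = 1
        self.buffer = []    # log items not yet recorded on the disk
        self.disk_log = []  # stands in for the log file on the disk

    def append(self, transaction_id, kind):
        """Create a log item in the buffer and return its LSN."""
        lsn = self.next_lsn
        self.next_lsn += 1
        self.buffer.append((lsn, transaction_id, kind))
        return lsn

    def flush_up_to(self, lsn):
        """Record every buffered item whose LSN does not exceed lsn."""
        self.disk_log.extend(r for r in self.buffer if r[0] <= lsn)
        self.buffer = [r for r in self.buffer if r[0] > lsn]


def commit(log, transaction_id):
    """End a transaction: create Commit_Log, then force the log to disk."""
    commit_lsn = log.append(transaction_id, "Commit_Log")
    log.flush_up_to(commit_lsn)
    return commit_lsn
```

Note that the flush covers items of other transactions with smaller LSNs as well, since the log file is written in LSN order in this sketch.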
If the application 190 determines to stop the transaction, the transaction manager 110 requests the recovery manager 120 to cancel the transaction. The recovery manager 120 requests the log manager 130 to transmit log information having a previous data value in order to recover the data values that have been changed in the corresponding transaction. The request is sequentially transmitted to the buffer manager 150 and the storage manager 170 and is then processed.
Then, the recovery manager 120 requests the buffer manager 150 to transmit the data pages to be recovered to the previous values, and the buffer manager 150 reads the requested pages from the buffer 160 by using the storage manager 170. Subsequently, the recovery manager 120 finds the changed portions on the basis of the log information received from the log manager 130 and recovers the previous data values (before image).
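The cancellation process described above can be illustrated with a minimal sketch that walks the chain of previous LSNs and restores each before image. The function name undo_transaction and the dictionary-based log and page stores are assumptions made for illustration only.

```python
def undo_transaction(log, pages, last_lsn):
    """Restore the before images of one transaction, newest change first.

    log      -- maps an LSN to a dict with keys prev_lsn, page_id, offset, before
    pages    -- maps a page id to a bytearray holding the page contents
    last_lsn -- LSN of the most recent log record of the transaction
    """
    lsn = last_lsn
    while lsn is not None:
        record = log[lsn]
        page = pages[record["page_id"]]
        start = record["offset"]
        before = record["before"]
        page[start:start + len(before)] = before  # put the previous value back
        lsn = record["prev_lsn"]                  # step to the earlier record
```

For example, with a page whose original contents were b"AAAA" and which was changed twice by one transaction:

```python
pages = {0: bytearray(b"BCCA")}
log = {
    1: {"prev_lsn": None, "page_id": 0, "offset": 0, "before": b"AA"},
    2: {"prev_lsn": 1, "page_id": 0, "offset": 1, "before": b"BA"},
}
undo_transaction(log, pages, 2)
# pages[0] now holds the original contents b"AAAA"
```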
After the transaction ends or is cancelled, the transaction manager 110 removes the information of the transaction from a transaction table.
Next, a process performed when the application 190 requests the index/record/catalog manager 140 to update data will be described in detail below.
The index/record/catalog manager 140 identifies the data update on the disk 180 and determines a disk page of the disk 180 to be corrected.
The index/record/catalog manager 140 determines which of the update modes the application 190 is requesting. The update modes include a mode of releasing a disk page (a “delete” mode), a mode of recording data on a new page (an “insert” mode), and a mode of changing a data value recorded on the existing page (a “modify” mode).
In the delete mode, when a designated disk page exists in the buffer 160, the buffer manager 150 removes the page from the buffer 160. Then, the buffer manager 150 requests the storage manager 170 to release the designated disk page, and the storage manager 170 adds the corresponding page to a list of unused pages. At that time, since the disk page is changed, the storage manager 170 creates log information on the change and transmits the log information to the log manager 130.
In the insert mode, the buffer manager 150 requests the storage manager 170 to allocate a new page. The storage manager 170 allocates a new page, creates log information on the change of the disk page, and transmits the log information to the log manager 130. Then, the storage manager 170 duplicates the requested disk page on a space of the buffer 160 that is designated by the buffer manager 150. The index/record/catalog manager 140 inserts the data value on the new page, creates log information on the data value, and transmits the log information to the log manager 130.
In the modify mode, the index/record/catalog manager 140 requests the buffer manager 150 to transmit a disk page to be changed. Then, the buffer manager 150 checks whether the disk page is loaded onto the buffer 160. When the page whose data value should be updated exists in the buffer 160, the index/record/catalog manager 140 changes the data value of the page.
However, when the page whose data value should be updated does not exist in the buffer 160, the requested disk page must be duplicated by the storage manager 170 onto an empty space of the buffer 160 designated by the buffer manager 150. If there is no available empty space, the buffer manager 150 selects a suitable page and writes the selected page out to the disk 180, thereby securing an empty space in the buffer 160. Then, the storage manager 170 duplicates the requested disk page onto the empty space of the buffer 160 designated by the buffer manager 150. The index/record/catalog manager 140 changes the data value of the corresponding page, creates log information on the changed data value, and transmits the log information to the log manager 130.
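The page lookup performed in the modify mode can be sketched as follows. The description only states that a "suitable page" is selected when the buffer is full; the least-recently-used choice below is an assumption, as are the class name BufferManager and the dictionary standing in for the disk.

```python
from collections import OrderedDict


class BufferManager:
    def __init__(self, disk, capacity):
        self.disk = disk             # page id -> bytes, stands in for the disk
        self.capacity = capacity     # number of pages the buffer can hold
        self.frames = OrderedDict()  # page id -> bytearray, oldest use first

    def fetch(self, page_id):
        """Return the buffered copy of a page, loading it if necessary."""
        if page_id in self.frames:
            self.frames.move_to_end(page_id)  # mark as most recently used
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            # Assumed policy: write out the least recently used page
            # to secure an empty space in the buffer.
            victim_id, victim = self.frames.popitem(last=False)
            self.disk[victim_id] = bytes(victim)
        page = bytearray(self.disk[page_id])  # duplicate the disk page
        self.frames[page_id] = page
        return page
```

A caller would change the returned bytearray in place and then hand the corresponding log information to the log manager.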
Meanwhile, in the database, when a large number of records are inserted or indexes are created, a large number of pages are newly allocated, and the data values of the allocated pages are changed. However, in the log information management system according to the related art, whenever a data value is changed, both a previous data image (before image) and an updated data image (after image) are kept as log information. Therefore, when a certain page is newly allocated as a page for storing index, record, or catalog information and then new data values are written on the entire page, the above-mentioned characteristic causes log information that is twice the size of the newly allocated page to be created. That is, when N data pages are newly allocated and then values are written on the pages, log information corresponding to 2N pages is created. Since all of the created log information items must be recorded on a disk before transactions are completed, the performance of the log information management system may deteriorate due to disk input/output. In particular, this may cause serious problems in a system that must insert a large number of records at high speed.
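The 2N relationship can be checked with a short calculation. The function name log_volume and the page size of 8192 bytes used in the example are assumptions for illustration; log header overhead is ignored.

```python
def log_volume(num_pages, page_size):
    """Bytes of log created when num_pages new pages are fully written.

    Each page contributes one before image and one after image,
    each as large as the page itself (log headers are ignored).
    """
    before_images = num_pages * page_size
    after_images = num_pages * page_size
    return before_images + after_images


# Assumed example: 1000 newly allocated pages of 8192 bytes each.
new_pages = 1000
page_size = 8192
total = log_volume(new_pages, page_size)
pages_of_log = total // page_size  # log volume expressed in pages
```

Here pages_of_log evaluates to 2000, that is, 2N pages of log information for N = 1000 newly allocated pages.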
Accordingly, it is necessary to minimize the amount of log information on data pages that are newly allocated and prevent the performance of a system from deteriorating even when a large number of records are inserted or a new index is created.