1. Field of the Invention
The present invention relates to a method of logging in a database management system (DBMS), and more particularly, to a logging method and apparatus that use log entries to reduce the log size when a DB is updated and log records are generated.
2. Description of the Related Art
The purpose of using a database management system (DBMS) is to manage data systematically, which makes application programs easier to develop, and to maintain and manage data safely under any circumstances. In general, the DBMS logs data changes in order to guarantee this stability (durability) of a database (DB).
Logging is a basic function of a DBMS in which each insertion, deletion, or update of data is recorded in a stable storage device, such as a disk drive, so that a previous DB state can be restored from the logged information in an exceptional situation.
In the DBMS, the concept of durability is one of the important characteristics of transaction processing, which include atomicity, consistency, isolation, and durability (ACID). Durability means that if a transaction is successfully completed, it should be guaranteed that the result of processing the transaction is reflected in a DB, even if a system error occurs. Generally, the DBMS records the contents of state changes occurring in the DB when transactions are processed as a log, and stores this log in a stable storage medium such as a disk. Accordingly, the state changes of the DB are recorded in the log, and the log supports consistent maintenance of the states of the DB. In its simplest form, all contents changed by transactions are recorded in log files on a disk; however, when each operation of the transactions is accompanied by disk input and output operations, the performance of the DBMS is greatly reduced.
FIG. 1 is a schematic diagram illustrating a logging process of a DBMS according to conventional technology.
A DB 121 includes both a data file 121A and a log file 121B in predetermined areas of a disk drive 120, which is a permanent storage medium. When a transaction performs an update as a result of the execution of an application program, the related data file is loaded in units of pages into a buffer 113 of a memory 110. When an area 111A of a data page corresponding to the transaction is updated in the memory 110, a log record 112B corresponding to the update is written in a log page 112. The updated data page and the generated log page are stored in the data file 121A and the log file 121B, respectively, of the disk drive 120, in accordance with a write ahead log (WAL) protocol.
The WAL protocol is a procedure in which a log page is stored on the disk first and the data page is stored afterwards, in order to remove errors that occur when the changed contents of a transaction that has not completed are stored on the disk. According to the WAL protocol, when a system is restarted, it can be safely restored to the state before an exception occurred.
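The ordering imposed by the WAL protocol can be illustrated with a minimal sketch. The class names `Disk` and `BufferManager`, the `write_order` list, and the tuple layout of a log record are all hypothetical and chosen only for illustration; they are not taken from the conventional system of FIG. 1.

```python
class Disk:
    """Hypothetical stand-in for the disk drive: a log file and a data file."""

    def __init__(self):
        self.log_file = []     # log records already on disk, in write order
        self.data_file = {}    # page_id -> page contents on disk
        self.write_order = []  # records which file was written, for inspection


class BufferManager:
    """Hypothetical in-memory buffer holding data pages and pending log records."""

    def __init__(self, disk):
        self.disk = disk
        self.pages = {}        # page_id -> bytearray (data page in memory)
        self.log_buffer = []   # log records not yet stored on disk

    def update(self, page_id, offset, new_bytes):
        page = self.pages[page_id]
        before = bytes(page[offset:offset + len(new_bytes)])
        # The log record is created in memory together with the update.
        self.log_buffer.append((page_id, offset, before, bytes(new_bytes)))
        page[offset:offset + len(new_bytes)] = new_bytes

    def flush_page(self, page_id):
        # WAL rule: store the pending log records first ...
        self.disk.log_file.extend(self.log_buffer)
        self.log_buffer.clear()
        self.disk.write_order.append("log")
        # ... and only then store the data page.
        self.disk.data_file[page_id] = bytes(self.pages[page_id])
        self.disk.write_order.append("data")


disk = Disk()
bm = BufferManager(disk)
bm.pages[1] = bytearray(b"." * 40)  # assumed 40-byte page, for illustration
bm.update(1, 10, b"kate")
bm.flush_page(1)
```

Because the log reaches the disk before the data page, a before-image is always available on disk for any change that has been written to the data file.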
FIG. 2 is a diagram illustrating a data structure of a log record format according to conventional technology.
The log record, including update information, is formed with a plurality of fields as illustrated in FIG. 2.
A previous log sequence number (LSN) is the LSN of the previous log record generated by a predetermined transaction. Here, the LSN is the identification number (ID) of a log record and indicates the location at which the log record is recorded in a log page. Accordingly, the LSN comprises the number of a log page and an offset into the log page.
Besides the LSN, the log record includes a transaction ID, a type field indicating the type of the log record, a page ID indicating the number of an updated data page, the length of updated data, an offset into the updated data page, and a before-image and an after-image, corresponding to images from before and after an update, respectively.
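The fields listed above can be written down as a small data structure. This is only a sketch: the names `LSN` and `LogRecord`, the Python types, the example transaction ID, the example LSN value, and the check that both images have the size given by the length field are assumptions made for illustration, not details of FIG. 2.

```python
from dataclasses import dataclass
from typing import NamedTuple, Optional


class LSN(NamedTuple):
    log_page: int  # number of the log page holding the record
    offset: int    # offset of the record within that log page


@dataclass(frozen=True)
class LogRecord:
    prev_lsn: Optional[LSN]  # LSN of the transaction's previous record, if any
    transaction_id: int
    record_type: str         # type of the log record, e.g. "UPDATE" (assumed)
    page_id: int             # number of the updated data page
    length: int              # length of the updated data
    offset: int              # offset into the updated data page
    before_image: bytes      # data before the update
    after_image: bytes       # data after the update

    def __post_init__(self):
        # Assumption: both images cover exactly `length` bytes.
        if len(self.before_image) != self.length or len(self.after_image) != self.length:
            raise ValueError("image sizes must equal the length field")


# Hypothetical records for a transaction with assumed ID 1.
first = LogRecord(None, 1, "UPDATE", 1, 4, 10, b"name", b"kate")
second = LogRecord(LSN(0, 0), 1, "UPDATE", 1, 4, 31, b"0000", b"0021")
```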
FIGS. 3A and 3B illustrate log records generated when updates are performed according to conventional technology.
Referring to FIG. 3A, it can be determined that two update operations have occurred in a data page 1 310. First, data “name” 311 positioned at offset 10 of the data page 1 310 is changed to “kate” 311A. Second, data “0000” 312 positioned at offset 31 is changed to “0021” 312A.
Here, from a log record 1 (LR1) 320, it can be determined that the value “name”, which is the image before the update, was changed to “kate”, which is the image after the update. From a log record 2 (LR2) 330, it can be determined that the value “0000” was changed to “0021”. Each of the log records 320 and 330 is recorded in a log page as soon as it is generated.
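The two updates of FIG. 3A can be replayed in a short sketch that produces one log record per update and then uses the before-images to restore the earlier page state. The function names `update` and `undo`, the dictionary layout of a record, and the 40-byte page size are hypothetical; only the offsets and values come from the description above.

```python
def update(page, log, page_id, offset, after):
    """Apply an update to the in-memory page and append its log record."""
    before = bytes(page[offset:offset + len(after)])
    log.append({
        "page_id": page_id,
        "offset": offset,
        "length": len(after),
        "before": before,
        "after": after,
    })
    page[offset:offset + len(after)] = after


def undo(page, log):
    """Restore the page by applying before-images in reverse order."""
    for rec in reversed(log):
        page[rec["offset"]:rec["offset"] + rec["length"]] = rec["before"]


# Data page 1 with "name" at offset 10 and "0000" at offset 31.
page1 = bytearray(b" " * 40)  # assumed page size, for illustration only
page1[10:14] = b"name"
page1[31:35] = b"0000"

log = []
update(page1, log, 1, 10, b"kate")  # corresponds to LR1
update(page1, log, 1, 31, b"0021")  # corresponds to LR2
updated = bytes(page1)

undo(page1, log)
restored = bytes(page1)
```

Each update yields exactly one record carrying both images, which is what allows the page to be rolled back but also what makes the log grow with every update.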
If updates are continuously performed in this manner, the number of log records to be written in the log page increases in proportion to the frequency of updates.
FIG. 3B illustrates this. Referring to FIG. 3B, following the log records 340 and 350 described above with reference to FIG. 3A, log records 360 and 370 for changes from “kate” to “john” and from “0021” to “0701” are generated, and it can be determined that further log records 380 are then generated continuously by other update operations.
In this way, the log size arising from the logging process may become several times to hundreds of times the size of the data actually updated. The increase in the log size causes additional disk input and/or output operations, thereby lowering the speed of update operations and reducing the remaining space available on the disk. Also, when a recovery operation is performed because an exceptional situation has occurred, the large amount of log data to be read from the disk and processed increases the time required for recovery.
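The overhead can be made concrete with a rough calculation for the four updates of FIG. 3B. The byte widths of the header fields below are assumptions chosen purely for illustration; the source does not specify field sizes, so the resulting ratio is only indicative.

```python
import struct

# Assumed fixed-width header, little-endian without padding:
#   prev LSN page (4) + prev LSN offset (2) + transaction ID (4)
#   + type (1) + page ID (4) + length (2) + offset (2)
HEADER = struct.Struct("<IHIBIHH")


def record_size(length):
    """Header plus a before-image and an after-image of `length` bytes each."""
    return HEADER.size + 2 * length


# (offset, after-image) for the four updates described for FIG. 3B.
updates = [(10, b"kate"), (31, b"0021"), (10, b"john"), (31, b"0701")]

log_bytes = sum(record_size(len(after)) for _, after in updates)
updated_bytes = sum(len(after) for _, after in updates)
ratio = log_bytes / updated_bytes
```

Under these assumed widths, the fixed header and the duplicated images dominate each record, so the smaller the updated field, the larger the ratio of log size to updated data.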