A database management system (DBMS) is a software system that facilitates the creation, maintenance, and use of an electronic database. The software system is a suite of programs that typically manage large structured sets of persistent data, offering ad hoc query facilities to many users. The DBMS controls the organization, storage and retrieval of data (fields, records and files) in the database. The DBMS also controls the security and integrity of the database. The DBMS accepts requests for data from an application program and instructs the operating system to transfer appropriate data as requested.
When a DBMS is used, information systems can be changed much more easily as the organization's information requirements change. New categories of data can be added to the database without disruption to the existing system. Data security can prevent unauthorized users from viewing or updating the database. Through passwords, users can be granted access to the entire database or to a series of database subsets, called sub-schemas or tablespaces. For example, an employee database can contain all the data about an individual employee, but one group of users may be authorized to view only payroll data, while others are allowed access only to the work history and medical data in the employee database. The DBMS can maintain the integrity of the database through locks, by not allowing more than one user to update the same record at the same time. The DBMS can also keep duplicate records out of the database; for example, no two customers with the same customer number (key field) can be entered into the database.
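The duplicate-key protection described above can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the table name `customers`, its columns, and the helper `insert_customer` are hypothetical names chosen only for this illustration.

```python
import sqlite3


def insert_customer(conn, customer_number, name):
    """Try to insert a customer row.

    Returns True on success and False if the DBMS rejects the row because
    the customer number (the key field) is already present.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO customers (customer_number, name) VALUES (?, ?)",
                (customer_number, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical schema: the customer number is declared as the primary key,
# so the DBMS itself refuses a second row with the same number.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_number INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

first = insert_customer(conn, 1001, "Acme Supply")
duplicate = insert_customer(conn, 1001, "Another Company")
row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

In this sketch the uniqueness check is enforced by the DBMS rather than by the application program, which is the division of responsibility the paragraph describes.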
Query languages and report writers allow users to interactively interrogate the database and analyze its data. If the DBMS provides a way to interactively enter and update the database as well as interrogate it, this capability allows for managing personal databases. However, the DBMS may not leave an audit trail of actions or provide the kinds of controls necessary in a multi-user organization. These controls may be available only when a set of application programs is customized for each data entry and updating function. For example, a business information system can be made up of subjects (customers, employees, vendors, etc.) and activities (orders, payments, purchases, etc.). Database design is the process of deciding how to organize this data into record types and how the record types will relate to each other.
The DBMS should mirror the organization's data structure and process transactions efficiently. Organizations may use one kind of DBMS for daily transaction processing and then move the detail onto another computer that uses another DBMS better suited for random inquiries and analysis. Overall system design decisions can be made by data administrators and systems analysts. Detailed database design can be performed by database administrators. Three common organizations are hierarchical databases, network databases, and relational databases. A database management system may provide one, two or all three methods. Inverted lists and other methods can also be used. The most suitable database structure can depend on the application, on the transaction rate, and on the number of inquiries made.
Known DBMSs may organize the database into multiple tablespaces in which the tables of the database are stored. To recover selected tablespaces in the event of a system crash, a backup image of the database or of the tablespace is restored, and the log files that were created since the backup was taken are then rolled forward. Log files contain log records that describe the changes made to the data stored in the database. Each log file contains one or more log records that apply to one or more tablespaces. Current recovery protocols either process or preprocess each log file during an operation for recovering the tablespace. One disadvantage of these protocols is that they handle every log file, even though only those log records that apply to the tablespace being recovered need be processed. Processing all potential log files can therefore result in inefficiencies in log file access and use. For example, if only one transaction affected the tablespace being recovered, and that transaction existed in the life span of only one log file, all the log files will still be processed. Regardless of whether a log file contains transactions that are relevant to the tablespace being recovered, that log file will be processed as part of the recovery if it was created between the start of the backup being restored and the point in time to which the recovery is made. Consequently, much time can be wasted in the current recovery protocols.
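The inefficiency described above can be pictured with a small sketch of a roll-forward loop. It is a simplified model under stated assumptions, not the implementation of any particular DBMS: log files are modeled as in-memory lists, and the names `LogRecord`, `roll_forward_naive`, `TS_ORDERS`, and `TS_PAYROLL` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical log record: one change to one key in one tablespace."""
    tablespace: str
    key: str
    value: str


def roll_forward_naive(backup, log_files, tablespace):
    """Restore a tablespace from its backup image, then replay the logs.

    Every log file created since the backup is opened and scanned, even if
    it holds no record for the tablespace being recovered.
    Returns (recovered_data, files_opened, records_scanned).
    """
    data = dict(backup)  # start from the restored backup image
    files_opened = 0
    records_scanned = 0
    for log_file in log_files:
        files_opened += 1
        for record in log_file:
            records_scanned += 1
            if record.tablespace == tablespace:
                data[record.key] = record.value  # replay the change
    return data, files_opened, records_scanned


# Three log files; only the second one touches TS_PAYROLL.
log_files = [
    [LogRecord("TS_ORDERS", "o1", "new")],
    [LogRecord("TS_PAYROLL", "emp7", "5200"), LogRecord("TS_ORDERS", "o2", "new")],
    [LogRecord("TS_ORDERS", "o1", "shipped")],
]
backup = {"emp7": "5000"}

recovered, files_opened, records_scanned = roll_forward_naive(
    backup, log_files, "TS_PAYROLL"
)
```

In this example a single relevant record exists in a single log file, yet all three files are opened and all four records are examined to recover the one tablespace.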
For example, European Patent Application No. 2002/0007363 A1 describes a system and a method that process through all log files but filter which ones are actually played. This system is required to review all the log files in order to select specific objects to recover. This system can be inefficient and inconvenient; processing time can be wasted because the system cannot skip the processing of log files that do not contain records of interest for the tablespace being recovered.
U.S. Pat. No. 6,185,577 describes a system and a method for determining whether a rollback record has already been played. However, this system does not determine whether the record needs to be played but assumes that it does. A function is described for storing multiple actions to be played within a single log record. Disadvantageously, this system cannot selectively process log files, which can result in wasting processing time on correlation operations. Furthermore, the system cannot ascertain whether a log file contains anything that needs to be played.
U.S. Pat. No. 6,182,241 describes a method for recovering a system that terminated unexpectedly. The recovery operation includes partial processing and postponing the full processing of some non-terminated transactions to a later stage. One disadvantage is that all non-terminated transactions, and therefore their log records, have to be processed eventually. Inconveniently, there is no way to skip processing of any log files or log records of non-terminated transactions. This system can also be inconvenient because it does not recover the subsystems in the database (i.e., tablespaces).
U.S. Pat. No. 6,178,427 describes a system and a method for dealing with mirroring log files and then extracting relevant log records from the log files so that only the tablespaces being recovered are processed. However, the log files require processing prior to actual recovery in order to make it possible to skip log records by determining those specific files that may not be needed. This system can be inconvenient because it requires preprocessing of the log files.
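The cost of a preprocessing step of this general kind can be pictured as follows. This is a generic illustration, not the method of the cited patent: log records are modeled as hypothetical `(tablespace, key, value)` tuples, and `preprocess` and `recover` are names invented for this sketch.

```python
def preprocess(log_files):
    """Scan every log file once and note which tablespaces each one touches.

    This pass must read every record of every file before recovery starts.
    """
    return [{tablespace for tablespace, _key, _value in log_file}
            for log_file in log_files]


def recover(backup, log_files, index, tablespace):
    """Replay only the log files that the index marks as relevant.

    Returns (recovered_data, numbers_of_the_files_replayed).
    """
    data = dict(backup)  # start from the restored backup image
    replayed = []
    for file_no, (log_file, touched) in enumerate(zip(log_files, index)):
        if tablespace not in touched:
            continue  # skipped, but only because it was already scanned
        replayed.append(file_no)
        for ts, key, value in log_file:
            if ts == tablespace:
                data[key] = value
    return data, replayed


log_files = [
    [("TS_ORDERS", "o1", "new")],
    [("TS_PAYROLL", "emp7", "5200"), ("TS_ORDERS", "o2", "new")],
    [("TS_ORDERS", "o1", "shipped")],
]
index = preprocess(log_files)
recovered, replayed = recover({"emp7": "5000"}, log_files, index, "TS_PAYROLL")
```

Recovery itself replays a single file here, but the saving is offset by the preprocessing pass, which still had to read all three log files.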
U.S. Pat. No. 6,052,695 describes a recovery mechanism for a distributed system. All the log files that contain transactions after the failure must be processed. This arrangement can be inconvenient because irrelevant files are not skipped, which adds processing time.
Thus, there is a need for a system and associated method that identify and selectively replay only those log files needed for database recovery. The need for such a system and method has heretofore remained unsatisfied.