In the operation of computer systems and networks, the computer data is often “backed-up”, that is to say, it is copied to a storage medium other than the central computer's storage disk in order to permit the recovery of the data as the data existed at some point in time. This is done for purposes of diagnosis in the event of system failure or inadvertent loss of data.
It is often a standard practice to automatically back-up data on a daily or other periodic basis and store this data on tape or disk.
There are several ways to back-up data for diagnostic and recovery purposes. One way is considered as (i) physical level back-up. The physical level back-up refers to the data as it is stored at specific locations on some physical media, such as a host computer disk.
Another way is (ii) logical level back-up. This refers to the data as seen by the user application programs in files or in database tables. Normally, the operating system of the computer will include a file system that maps between the physical level and the logical level. A physical level back-up involves making a raw copy from a computer disk to some other storage medium without going through the file system or any other physical-to-logical interpreter module. A logical level back-up, on the other hand, involves using an interpreter module or a file system to perform the physical-to-logical mapping while the back-up is made.
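The difference between the two levels can be illustrated with a minimal sketch. The toy disk, the file table and the function names below are all hypothetical and are not part of any DMSII interface; the sketch only assumes that a file system maps each logical file to the physical blocks that hold it.

```python
# Toy model: a "disk" is a list of fixed-size blocks, and a hypothetical
# file system table maps each logical file name to the blocks that hold it.
disk_blocks = [b"EMPL", b"OYEE", b"0042"]
file_table = {"names.dat": [0, 1], "ids.dat": [2]}


def physical_backup(blocks):
    """Raw copy: every block in disk order, with no file system involved."""
    return list(blocks)


def logical_backup(blocks, table):
    """Copy through the file system: one entry per file, as applications see it."""
    return {name: b"".join(blocks[i] for i in indexes)
            for name, indexes in table.items()}


raw_copy = physical_backup(disk_blocks)
file_copy = logical_backup(disk_blocks, file_table)
```

The physical copy preserves block locations but knows nothing about files, while the logical copy yields whole files but depends on the mapping being available.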
A Database Management System consists of a set of tools used to develop and manage a database. The present system utilizes a DMSII which is a Database Management System available on the Unisys Corporation's ClearPath HMP NX, and the Unisys A-Series systems. A background for the Unisys DMSII systems is available in a publication of the Unisys Corporation, Document 8807 6625 000, entitled “Getting Started With DMSII” and published in September, 1997 by the Unisys Corporation. The DMSII Utilities provide database back-up and recovery capability for the entire database or for partial databases. The background operations of the DMSII utility enhancements are published in a Unisys Corporation publication Document 98037/4 and entitled “DMSII Utility Enhancements” published on Mar. 31, 1999.
Database back-ups can be accomplished on either an “on-line” or an “off-line” basis. The on-line back-up allows users to update data in the database while the back-up proceeds, whereas the off-line back-up disallows all updates to the database. The back-ups can be done to tapes, to disks, or to any combination of both types of media.
Database Management Systems are used by many large and small businesses such as airline reservation systems, financial institutions, retail chains, insurance companies, utility companies and government agencies. The present Database Management System (DMS) in its form as DMSII is used to build database structures for items of data according to some appropriate logical model, such as relational, hierarchical, or network. Further, the Database Management System is used to manage the database structures and keep the structures in some stable order while various application programs may be retrieving or changing the data. The present embodiment of DMSII has a data definition language designated as Data And Structure Definition Language (DASDL).
There are various tasks that are performed in database management and these involve (i) monitoring and optimizing database performance; (ii) the use of database control for monitoring multi-program database access; (iii) the function of the data integrity and safety done by integrity checking and preventing access to the same data by multiple applications occurring at the same time; (iv) the function of defining data structures and the data fields within them, including the function of modifying data structures; (v) data access operations and developing an application program to retrieve data or to change data; (vi) the function of data shareability to provide multi-program access without conflicts and provide database definitions to the application program; (vii) in database and data security, to prevent unauthorized database access; (viii) ensuring independence of application programs from certain data changes and preventing the revision of application programs every time a structure changes; (ix) in database and data recovery, performing the resumption of database operations after an interruption; (x) tracking data changes by keeping a record of every change made to the data; (xi) for data change integrity, ensuring that update changes are applied to, or removed from, the database in their entirety; (xii) providing a recent copy of the database as a reserve by backing-up the database and storing copies of audit files and all other database files; (xiii) providing for database scalability by growing or shrinking the database according to the ongoing needs at the time.
The DMSII provides standard software files that perform services and operations for all the databases connected to the system's Enterprise Server. This enables a viewing of a list of all these files on the user terminal.
In the ordinary course of operations, the application program user will submit changes to data or retrieve data while running a particular application program. Then, changes can be made which add, modify and delete data. A Database Administrator (DBA) keeps the database running smoothly and enforces the rules for data integrity and security. Users access the database through a given application program which itself does not access the data directly. Instead, the program interacts with the DMSII software and the database tailored software, which is directed by the access routines of the Data Management System to provide accesses, retrievals and the storage of data in the physical database file.
In regard to access, an application user will access the data in order to (i) make an inquiry to get a Read of data in the database, or (ii) provide an update by doing a Write to the database, thus adding, deleting or changing data. The access for either purpose contributes to an operation on the database which is called a “transaction”.
A transaction is a sequence of operations grouped by a user program because the operations constitute a single logical change to the database. At the end of the transaction, when the transaction is complete and without error, it is considered as being committed to the database.
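The all-or-nothing character of a transaction can be sketched as follows. This is an illustration only, assuming an in-memory dictionary as the database and staging the changes on a copy; the function names are hypothetical, and the sketch does not represent how DMSII implements commitment.

```python
import copy


def run_transaction(database, operations):
    """Apply every operation or none of them; return True when committed."""
    working = copy.deepcopy(database)      # changes are staged on a copy
    try:
        for operation in operations:
            operation(working)
    except Exception:
        return False                       # error: the database is left untouched
    database.clear()
    database.update(working)               # commit: all changes become visible
    return True


def add_employee(db):
    db["employees"].append("Jones")
    db["count"] += 1


def failing_step(db):
    raise ValueError("simulated error in mid-transaction")


database = {"employees": ["Smith"], "count": 1}
committed = run_transaction(database, [add_employee])
aborted = run_transaction(database, [add_employee, failing_step])
```

The second call fails part-way through, so none of its changes reach the database.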
Actual real world data goes into special logical structures that are used by the Data Management System to store data. The database is designed to map categories of data into suitable structures. For example, real world data of a given category is held in a structure called a “data set”. An example of this would be a particular person's name. Then, real world data that can serve as an index of a whole data set has a structure name called a “set”. This, for example, might be the social security number of any employee. Then there is data that can serve as an index of a data set under a certain condition, and this is called a “subset”. This might be an employee's work number, for example. Then, there is data about each instance of a particular category. The structure name for this is “data item”. An example of this might be the name and address of the category (person). Then, there is data related to the database as a whole, and this involves a structure called “global data item”. An example of this might be the total number of employees in a company. Once there has been identification of the real-world data which is to be stored in the database, it is then necessary to define that data in relationship to the data structures of the data management system that holds data. When this data is defined within “structures”, the data management system, the system software programs and the application programs can then understand how to make this data accessible for various inquiries and/or changes. This is done with the Data and Structure Definition Language (DASDL).
The Data Management System structures are the building blocks of the Data Management System database. Here, the “data set” has the purpose of storing data pertaining to a data category in a collection of records. A “set” has the purpose of indexing all records in a data set. A “subset” serves the purpose to index some records in a data set according to some given criteria. The “data item” is a structured name which defines a unit of information about a category in a given field (column) of a data set record. A “global data item” serves the purpose of storing a unit of information about the entire database or any of its involved structures. In general discussion about the types of data and the names of data structures, it is often seen that in a relational database, a “data set” is called a “table”. A “set” or “subset” is frequently called an “index”. A “data item” is often called a “field” or a “column”, or is often called by its data name, for example, a project number. “Structures” are made of common file components designated as records and fields.
A record is a group of logically-related data items in a file. Often, a record is called a row. Data items reside in different fields in the records. For example, a record might involve a series of data such as an employee's name, the employee's I.D., the employee's social security number and years of employment. A group of such records would constitute a file.
The operating system which uses the data management system will treat the record as a unit. The system makes data available to users in records and not in individual single items of data. In programming languages, the record is the unit of data that the system reads from or writes to a file in one execution cycle of a Read or Write statement in a program.
If the application program wants to change a data item in a given record, the Data Management System brings a copy of the record from the physical storage over to memory, then enables that data item to be changed, and then writes the changed record back to the file.
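The read, modify and write-back cycle described above can be sketched in a few lines. The record layout, the list standing in for physical storage and the function name are assumptions made for illustration only.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EmployeeRecord:
    name: str
    employee_id: int
    years_employed: int


def update_item(storage, index, **changes):
    """Change one or more data items by rewriting the whole record."""
    in_memory = storage[index]               # bring a copy of the record into memory
    changed = replace(in_memory, **changes)  # change the data item(s) in that copy
    storage[index] = changed                 # write the changed record back
    return changed


# The list stands in for the physical file holding the data set.
storage = [EmployeeRecord("A. Smith", 1001, 4)]
update_item(storage, 0, years_employed=5)
```

Even though only one data item changes, the unit that is read and written is the record.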
A “field” is a consecutive group of bits or bytes within a particular component of a record which will represent a logical piece of data. A field or column is defined by the description of the data item it is to hold. For example, if one field carries the name of an employee, this field in the record could be called the name field.
The “data set” is a physical file, that is to say, a collection of related data records stored on a random-access storage device, such as a disk in which the data resides.
A data set is kept up-to-date in several ways: (i) here, application programs add, change, or delete individual pieces of data or records stored in the data set; (ii) the Database Administrator (DBA) maintains the structure of the data set by keeping the data set within certain maximized limits, by adding, deleting or changing the definition of a data item, creating new sets or subsets, monitoring automatic processes that guard data integrity and creating guard files to enhance the security of the data.
A “set” is a separate stored file that indexes all the records of a single data set. The Data Management System uses sets in order to locate records in a data set. A set has no meaning apart from its related data set. The set structure enables an application program to access all records of a data set in some logical sequence.
A “subset” can be considered identical to a set, except that the subset need not contain a record for every record of the data set. A subset is a file that indexes none, one, several, or all of the records in a data set. The subset structure enables an application program to access only records of a data set that meet a particularly required condition.
For example, an application program may compile a list of people who are “managers”. Thus, it is seen that the database designer created the “manager” subset. Thus, in order to retrieve a record of managers, the data management system can use the smaller file, that is, the subset, to quickly point to the corresponding records in the larger file which is the data set. As with the set, the subset must also be kept up-to-date.
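The relationship between a data set, a set and a subset can be shown with a small sketch. The records, field names and helper function are hypothetical; the indexes are modeled simply as lists of record positions.

```python
# Data set: the records themselves, in storage order.
data_set = [
    {"name": "Ames", "id": 30, "role": "manager"},
    {"name": "Baker", "id": 10, "role": "clerk"},
    {"name": "Cole", "id": 20, "role": "manager"},
]

# Set: indexes every record of the data set, here ordered by id.
id_set = sorted(range(len(data_set)), key=lambda i: data_set[i]["id"])

# Subset: indexes only the records that meet a condition (managers).
manager_subset = [i for i in id_set if data_set[i]["role"] == "manager"]


def records_via(index):
    """Follow an index to the corresponding records in the data set."""
    return [data_set[i] for i in index]


all_names = [r["name"] for r in records_via(id_set)]
manager_names = [r["name"] for r in records_via(manager_subset)]
```

In this sketch both indexes would have to be rebuilt or adjusted whenever the data set changes, which reflects the requirement that sets and subsets be kept up-to-date.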
A “data item” is an element of data. In the Data Management System, a data item can also be the field (column) in the database record. For example, the social security number could be considered as a data item in the sample data set designated “person”. The purpose of the data item is to describe the data to be stored. The data item provides the identity—type, size, location, and attributes—of one element of data for a database entity. When an application submits an update to a data item, the Data Management System will accept the update if it corresponds to the definition of a data item. Otherwise, the change is rejected and reported as an exception. The Database Administrator will add, delete or change the data item definitions. There are a number of data items that are used by the Data Management System. These include the type called “alpha-numeric” which include words and characters, names, addresses, dates and titles. Then, there are data items designated as “numeric” which involve integers and decimals with or without signs. Then, there are data items designated as “real” which involve single precision floating point numbers that occupy one word. An example of this would be, for example, an employee's salary. Then, there are data items which are called “Boolean” which involve TRUE and FALSE values.
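The acceptance or rejection of an update against a data item definition can be sketched as below. The class names, the kind labels and the size rule are assumptions chosen for illustration; they are not DASDL syntax and do not reproduce the actual checks performed by the Data Management System.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItemDefinition:
    name: str
    kind: str      # "alpha", "numeric", or "boolean" (hypothetical labels)
    size: int      # maximum characters or digits; ignored for boolean


class DataItemException(Exception):
    """Raised when an update does not match the data item definition."""


def check_update(definition, value):
    """Return the value if it fits the definition, otherwise raise."""
    if definition.kind == "alpha":
        ok = isinstance(value, str) and len(value) <= definition.size
    elif definition.kind == "numeric":
        # bool is a subclass of int in Python, so exclude it explicitly
        ok = (isinstance(value, int) and not isinstance(value, bool)
              and len(str(abs(value))) <= definition.size)
    elif definition.kind == "boolean":
        ok = isinstance(value, bool)
    else:
        ok = False
    if not ok:
        raise DataItemException(f"{value!r} does not fit item {definition.name}")
    return value


ssn_item = DataItemDefinition(name="SSN", kind="numeric", size=9)
accepted = check_update(ssn_item, 123456789)
```

An update such as `check_update(ssn_item, "not a number")` raises `DataItemException`, mirroring a change that is rejected and reported as an exception.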
The “global data item” is a data item, a group item, or a population item that is not part of any data set but still pertains to the database as a whole. Such global data items are stored in one special record called the “global record” in the DASDL declaration which is outside the structured definitions. Sometimes the global record is placed just before the structured definitions in the DASDL file. The global data item has the purpose of holding permanent information about the database as a whole or about a particular data set. It also acts as a place holder for information that can be derived from the database.
One of the most significant options in DASDL (Data And Structure Definition Language) is that it is possible to define the database as to whether the database is to be audited. The data management system supports both logging changes to a database (auditing the database) or not logging changes (maintaining an unaudited database). There are advantages in auditing a database since this assures the user that if a database failure occurs, there will be a record of database changes with which one can restore the database to a completely integral state and thus avoid loss of information and corruption of information.
The “audit trail” is a log of changes made to the database. This type of audit trail is somewhat similar to the SUMLOG in the host system, which is the history of all system activity, except that the audit trail records only the database update activity and consists of separate numbered files. The data management system software can use an audit trail to recover the database from an unusable state, provide restart information to user programs, reconstruct portions of the database that have been lost because of hardware errors, back out aborted transactions, and roll back or rebuild the entire database to a user-specified point.
The “audit file” provides a chronological history of all update database transactions. The audit file is a numbered segment of the database audit trail where the data management system assigns each audit file to have an audit file number (AFN) in the range of 1 to 9999.
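An audit trail made of numbered audit files can be sketched as follows. The class and method names are hypothetical. The only fact taken from the text is the AFN range of 1 to 9999; what happens after file 9999 is not described above, so the sketch deliberately does not model it.

```python
from dataclasses import dataclass, field

AFN_MIN, AFN_MAX = 1, 9999  # range stated in the text


@dataclass
class AuditFile:
    afn: int
    changes: list = field(default_factory=list)

    def __post_init__(self):
        if not AFN_MIN <= self.afn <= AFN_MAX:
            raise ValueError(f"AFN {self.afn} outside {AFN_MIN}..{AFN_MAX}")


class AuditTrail:
    """A chronological log of updates, split into numbered audit files."""

    def __init__(self):
        self.files = [AuditFile(AFN_MIN)]

    def log(self, change):
        self.files[-1].changes.append(change)

    def switch_file(self):
        # Start the next numbered file; numbering past AFN_MAX is not modeled.
        self.files.append(AuditFile(self.files[-1].afn + 1))

    def history(self):
        return [(f.afn, c) for f in self.files for c in f.changes]


trail = AuditTrail()
trail.log("add employee Jones")
trail.switch_file()
trail.log("delete employee Smith")
```

Reading `trail.history()` returns the changes in the order they were logged, each tagged with the number of the audit file that holds it.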
Access Routines Program: The data management system controls access to database data with a software program called Access Routines, which is a collection of specialized routines that enables many users to access the database at the same time and ensures that the access is controlled so that accesses do not conflict with one another.
Control File: Each active data management system database has a control file. The control file contains the time stamps for the database software and files, since the Access Routines use time stamps to check the validity of data. A control file also contains the update levels of the database and the structures, since the Access Routines use update levels to check the validity of data. Further, the control file functions to store audit control information, dynamic database parameters and other information. It further guards the database from interruption while a process that needs exclusive access to the database, such as a halt/load recovery or a reorganization, completes its task. The control file assures that a database that has been interrupted for any reason is not accessed until the integrity of the database is guaranteed by the successful completion of the recovery process.
I/O Operation: An I/O (Input/Output) operation is one in which the system reads data from or writes data to a file on a peripheral device, such as a disk drive. A failure of a read or a write operation is considered to be an I/O error, which must be handled.
Backup: The most important preventive maintenance task which can be performed for a database is to back up the database frequently and to keep the backups for some period of time. To “back up” the database means to use the data management system DMUTILITY program to make a copy of all or part of the database. It should be noted that “back-up” is not the same as recovery.
Recovery is the process that uses the backups to restore the database after some sort of failure. The backup will include a check of the physical integrity of all the database structures being backed up. A complete database backup includes providing a reserve copy of all the files pertaining to the database. These files include not only the database files and the control files (which may change from time to time) but also the DASDL source file, the description file, various tailored files, application programs, and audit files. This enables a user to put the database back in operation quickly in case the current database files should become unavailable or damaged.
Here there is involved the concept of “DUMP.” A DUMP involves either a copy of stored data in which a change has been made since the previous DUMP of that data, or a transfer of all or part of the contents of one section of computer storage to another section or to some other output device. The processes used to make a database backup are called “backing up” and “Dumping.” A backup to tape is called a “Tape DUMP” while a backup to disk is called a “Disk DUMP.”
Often the backing up operation for the database is done by increments. An increment is one of a series of regular consecutive additions. For example, if a database is too large to back up on a daily basis, the operator could create a schedule that backs up a certain number of database files (an increment) each day until the entire database is backed up.
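A schedule of this kind can be sketched as a simple partition of the file list. The function name and file titles are hypothetical, and the sketch assumes that the same number of files is backed up each day, with any remainder on the last day.

```python
def plan_increments(files, files_per_day):
    """Split the database files into consecutive daily increments."""
    if files_per_day < 1:
        raise ValueError("files_per_day must be at least 1")
    return [files[i:i + files_per_day]
            for i in range(0, len(files), files_per_day)]


database_files = ["EMPLOYEE", "PAYROLL", "PROJECT", "DEPT", "CONTROL"]
schedule = plan_increments(database_files, 2)
# schedule[0] is backed up on the first day, schedule[1] on the second, and so on.
```

With five files and two files per day, the entire database is covered after three days.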
The dump of a database is done to tape or disk depending on what type of storage resources are available. Tapes are most frequently used since they are a less expensive resource than disk. When dumping is done to tape, it is necessary to furnish information common to any disk-to-tape process, and this information would include the tape name, the cycle number, the version number, workers, the serial number, compression and non-compression, the density, and the SCRATCHPOOL option.
However, when dumping to disk, it is only necessary to specify the file title for the entire dump and the number of DUMP files into which the system should place the DUMP.
One related art method to which the method of the present invention generally relates is described in U.S. Pat. No. 6,411,969 entitled “Enhanced System And Method For Management Of System Database Utilities”. This related art method is an enhanced method for developing back-up copies of a source database by providing incremental and accumulate dump commands from various multiple-Users which enable a selection of certain files which are identified independently of time-factor for dumping selectively either onto a separate destination medium of disk or tape. A User can determine the block size of words for blocks of data files to be dumped onto the destination medium, thus significantly reducing the number of I/O operations required.
The present invention differs from the above prior cited art in that the prior invention focuses on methods that create the incremental and accumulated backups. The method of the present invention differs in that it teaches methods to identify and provide full, incremental and accumulated backups to the database recovery process. The methods taught by the present invention provide optimization of the recovery process that is not provided by the prior related art patent.
Another related art method to which the method of the present invention generally relates is described in U.S. Pat. No. 5,974,425 entitled “Method And Apparatus For Reapplying Changes To A Database”. This related art method is a method and apparatus for reapplying changes to a database which uses a recovery log for recording all changes applied to the database so that the changes may be reapplied to the database during database recovery. Whenever a change is written to a storage device, the recovery log is updated to indicate that the particular change has been written to the storage device. During recovery, the data in the recovery log is sorted by file ID, data block ID, record type and version ID.
The present invention differs from the above prior cited art in that the prior invention focuses on methods to reapply changes to a database by employing a recovery log. The prior related art method makes no reference to database backups as part of recovery. The method of the present invention differs in that it teaches methods to identify and provide the full, incremental and accumulated backups to the database recovery process in the correct order. The methods taught by the present invention provide optimization of the recovery process that is not provided by the prior related art method.