This disclosure relates to network communications where logical audit blocks are created at a source host and transferred to a remote host where the audit trail is used to create and maintain a continuously synchronized remote database backup.
A database such as the Unisys Data Management System II, Extended, is a centralized collection of data placed into one or more files. Multiple application programs can access this data concurrently. Consequently, redundant files are not required for each individual application. Application programs running in batch, time sharing, and remote job entry environments can all access the database concurrently. A database of the present configuration consists of the following major components:
(a) Data sets;
(b) Sets;
(c) Subsets;
(d) Data items;
(e) Global data.
A data set, a set, or a subset that is not an item of another data set is termed disjoint. Structures need not be disjoint; that is to say, a hierarchy can exist between the various data sets, sets, and subsets. A data set, a set, or a subset that is an item in another data set is said to be embedded. When a database contains embedded structures, a hierarchical file structure results.
A data set is a collection of related data records stored in a file in a random access storage device. A data set is similar to a conventional file. It contains data items and has logical and physical properties similar to files. However, unlike conventional files, data sets can contain other data sets, sets, and subsets.
A set is a structure that allows access to all records of a data set in some logical sequence. The set contains one entry for each record in the data set. Each set entry is an index that locates a data set record. If key items are specified for the set, records in the data set are accessed based upon these keys. Otherwise the records are accessed sequentially. Multiple sets can be declared for a single data set, thereby enabling the data in a data set to be accessed in several different sequences. A subset is similar to a set. Unlike a set, a subset need only refer to selected records in the data set. A data item is a field in a database record used to contain an individual piece of information.
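The relationship between a data set, a set, and a subset can be sketched in Python as follows. This is an illustrative model only, not DMSII syntax: the record layout and the names `employees`, `build_set`, and `build_subset` are assumptions introduced for the example.

```python
# Illustrative model only: a data set is a list of records, and a set or
# subset is a list of index entries (record positions) into that data set.

# Hypothetical data set; each record is a dict of data items.
employees = [
    {"id": 30, "name": "Ng", "dept": "ops"},
    {"id": 10, "name": "Ruiz", "dept": "dev"},
    {"id": 20, "name": "Okoye", "dept": "ops"},
]


def build_set(data_set, key=None):
    """Return one entry per record, ordered by the key item if one is
    given, otherwise in sequential (stored) order."""
    positions = list(range(len(data_set)))
    if key is not None:
        positions.sort(key=lambda pos: data_set[pos][key])
    return positions


def build_subset(data_set, condition, key=None):
    """Like a set, but refer only to records that satisfy the condition."""
    return [pos for pos in build_set(data_set, key) if condition(data_set[pos])]


by_id = build_set(employees, key="id")      # one access sequence
by_name = build_set(employees, key="name")  # a second sequence, same data
ops_only = build_subset(employees, lambda rec: rec["dept"] == "ops", key="id")

ids_in_id_order = [employees[pos]["id"] for pos in by_id]
ops_ids = [employees[pos]["id"] for pos in ops_only]
```

Here `by_id` and `by_name` give two different access sequences over the same records, while `ops_only` refers only to the selected records.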
Data items that are not a part of any data set are then called global data items. Global data items generally consist of information such as control totals, hash totals, and populations, which apply to the entire database. All global data items are stored in a single record.
The audit trail is a record of changes made to the database. The audit trail is used to automatically recover the database following a hardware or software failure. The audit trail specification clause describes the physical attributes of the audit trail.
The audit trail, as mentioned, consists of a record of changes to the database. It is only created for audited databases and is used in the various forms of database recovery.
An audit trail specification describes the attributes of the audit trail. The specification is optional. If no specification appears, attributes are assigned by default.
All audited databases must include a "restart" data set definition. There is a specialized syntax for specifying the audit trail attributes. These attributes include area size, area length, block size, buffers, checksum, and sections, in addition to whether disk or tape is involved and the types of tape being used.
Disk or pack files are divided into areas, which are described by the AREAS, AREASIZE, and AREALENGTH options. Areas are allocated only as they are needed. Thus, a potentially large file can be small initially and then grow as needed. The user can control the maximum amount of disk space allocated to a file by using the AREAS and AREASIZE (or the AREALENGTH) options.
AREAS specifies the maximum number of areas to be assigned to the file. The maximum value allowed for this is 1,000.
The user can specify the length of an area using the AREASIZE (or AREALENGTH) option. The default option for AREASIZE is BLOCKS. The default value is 100 blocks.
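The effect of AREAS and AREASIZE on disk usage can be illustrated with a short Python sketch. The limit of 1,000 areas and the default of 100 blocks come from the text above; the function names, the lower bound of 1, and the rounding rule for on-demand allocation are assumptions made for illustration.

```python
import math

MAX_AREAS = 1000         # maximum AREAS value stated above
DEFAULT_AREASIZE = 100   # default AREASIZE, in blocks, stated above


def max_blocks(areas, areasize=DEFAULT_AREASIZE):
    """Upper bound on file size, in blocks, for the given options."""
    # Assumption: at least one area and a positive area size are required.
    if not 1 <= areas <= MAX_AREAS:
        raise ValueError("AREAS must be between 1 and 1000")
    if areasize < 1:
        raise ValueError("AREASIZE must be positive")
    return areas * areasize


def areas_allocated(blocks_written, areas, areasize=DEFAULT_AREASIZE):
    """Areas in use after writing blocks_written blocks.

    Areas are allocated only as needed, so a file that is allowed to
    become large can still start out small.
    """
    if blocks_written > max_blocks(areas, areasize):
        raise ValueError("file would exceed its AREAS limit")
    return math.ceil(blocks_written / areasize)
```

For example, with the default area size, 250 blocks occupy three areas, even if the file is permitted to grow to many more.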
BLOCKSIZE: The records in the audit trail are normally blocked. The user can control the size of a block using the BLOCKSIZE option. BLOCKSIZE can be specified as one of the following items:
(i) SEGMENTS: The maximum value is 2,184 segments. SEGMENTS can define an audit buffer size that is larger than that defined by either the BYTES or WORDS option.
(ii) WORDS: This is the default option. If the user does not define a BLOCKSIZE, the audit trail uses a default BLOCKSIZE of 900 words. The maximum value is 4,095 words.
(iii) BYTES: The maximum value allowed here is 24,570 bytes.
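The BLOCKSIZE rules listed above can be summarized as a small validation sketch in Python. The limits and the default are taken from the text; the function name `resolve_blocksize`, the lower bound of 1, and the tuple return format are hypothetical.

```python
# Maximum BLOCKSIZE per unit and the default, as stated in the text above.
BLOCKSIZE_MAX = {"SEGMENTS": 2184, "WORDS": 4095, "BYTES": 24570}
DEFAULT_BLOCKSIZE = (900, "WORDS")


def resolve_blocksize(value=None, unit="WORDS"):
    """Return (value, unit) for an audit trail BLOCKSIZE specification.

    With no value given, the default of 900 words applies. Otherwise the
    value is checked against the maximum for its unit.
    """
    if value is None:
        return DEFAULT_BLOCKSIZE
    unit = unit.upper()
    if unit not in BLOCKSIZE_MAX:
        raise ValueError("unit must be SEGMENTS, WORDS, or BYTES")
    # Assumption: a block size must be at least 1.
    if not 1 <= value <= BLOCKSIZE_MAX[unit]:
        raise ValueError(f"BLOCKSIZE in {unit} must be 1..{BLOCKSIZE_MAX[unit]}")
    return (value, unit)
```

The WORDS and BYTES maxima are arithmetically consistent with a 6-byte word (4,095 x 6 = 24,570), although the text above does not state the word size.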
A Remote Database Backup (RDB) is a database recovery system that can be a key component of a disaster recovery plan, since it minimizes the amount of time needed to recover from a loss of database access. The RDB system also minimizes the loss of productivity, revenue, and business that could occur because of interruptions in the ability to access one's database. The RDB works in conjunction with Data Management System II (DMSII) databases, Structured Query Language Database (SQLDB) databases, Semantic Information Manager (SIM) databases, and Logic and Information Network Compiler II (LINCII) databases.
The components of the RDB system consist of a database and also a copy of the database. One database is update capable and the other database can be used only for inquiry purposes. The update-capable database is called the primary database. The host on which this database resides is called the primary host. The "current on-line" remote database copy, which is called the secondary database, is "inquiry-capable" only. The host on which this database resides is called the secondary host. The configuration of the primary and the secondary databases on their separate hosts is called the RDB System. A single host can participate in multiple RDB systems.
The RDB or remote database backup system enables users to maintain a current on-line inquiry-only copy of a database on an enterprise server, which is separate from the enterprise server on which the update-capable database resides. The host locations can be at the same site or at two geographically distant sites. The remote database backup keeps the database copy up-to-date by applying the audit images from the audited database to the database copy. There is a choice of four audit transmission modes which enables one to choose the means of audit transfer between hosts.
In the RDB system, the terms "primary" and "secondary" indicate the intended function of each copy of the database and the host on which it resides.
The primary database supports both database inquiry and update, while the secondary database supports database inquiry only.
The secondary database cannot be updated by any application programs and the secondary database is modified only by the application of audit images of transactions performed on the primary database.
One complete RDB system thus consists of one database, the primary database on one host, plus one copy of that database, the secondary database, which resides on another host.
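The division of roles between the primary and secondary databases can be sketched in Python as follows. This is a minimal model under stated assumptions: the `Database` class, its method names, and the representation of an audit image as a `(key, value)` pair are hypothetical and are not taken from the RDB product.

```python
class Database:
    """Toy model of one copy of a database in an RDB system."""

    def __init__(self, update_capable):
        self.update_capable = update_capable  # True only for the primary
        self.records = {}
        self.audit_trail = []  # audit images produced by local updates

    def update(self, key, value):
        """Application update; permitted on the primary database only."""
        if not self.update_capable:
            raise PermissionError("secondary database is inquiry-only")
        self.records[key] = value
        # Hypothetical audit image: the key and its new value.
        self.audit_trail.append((key, value))

    def apply_audit_image(self, image):
        """The only way the secondary database is ever modified."""
        key, value = image
        self.records[key] = value

    def inquire(self, key):
        """Inquiry is available on both primary and secondary."""
        return self.records.get(key)


primary = Database(update_capable=True)
secondary = Database(update_capable=False)

primary.update("acct-1", 100)
for image in primary.audit_trail:  # stands in for the audit transfer
    secondary.apply_audit_image(image)
```

After the loop, an inquiry against the secondary returns the same value as the primary, while any direct update attempted on the secondary is rejected.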
A host is the system on which a primary or a secondary database resides. A host can function as a primary host in one RDB system and then also concurrently function as a secondary host for another RDB system. Additionally, one host can function as a secondary host (or a primary host) for multiple RDB systems.
When an RDB system is first initialized for a database, the primary host is, by default, the host upon which the database resides. The other host defined for that database is designated as the secondary host, and it remains the secondary host until a takeover is performed or until the RDB capability is disabled. Both the primary and secondary hosts must have sufficient resources to support the RDB system and its application environment.
As an illustration, consider how a primary database on a system called Host One and a secondary database on a system called Host Two can work together in response to, or in anticipation of, an interruption on the primary host. In this example, the application normally runs against the primary database on Host One, with the RDB transferring audit images to the secondary database. Under normal operation, that is, when the audit images are transferred from the primary database to the secondary database without loss of data during transmission due to network or system failure, this arrangement works well. However, if a network or system failure results in the loss of data during transmission from the primary database to the secondary database, the secondary database is said to be out of synchronization with the primary database. Hence there is a need for a mechanism by which the lost data can be re-transmitted so that the secondary database can be re-synchronized with the primary database.
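The out-of-synchronization condition and its repair can be sketched in Python. The sketch assumes that each audit block carries a unique, increasing serial number, which the text above does not state; the function name `resynchronize` and the dictionary representation of the audit trails are likewise hypothetical.

```python
def resynchronize(primary_blocks, secondary_blocks):
    """Re-send every audit block the secondary is missing.

    Both arguments map an assumed block serial number to block contents.
    secondary_blocks is updated in place; the serial numbers of the
    re-transmitted blocks are returned in ascending order.
    """
    lost = [serial for serial in sorted(primary_blocks)
            if serial not in secondary_blocks]
    for serial in lost:
        # A plain assignment stands in for re-transmission over the network.
        secondary_blocks[serial] = primary_blocks[serial]
    return lost


# Blocks 2 and 3 were lost in transmission from Host One to Host Two.
host_one = {1: "a", 2: "b", 3: "c", 4: "d"}
host_two = {1: "a", 4: "d"}
retransmitted = resynchronize(host_one, host_two)
```

After the call, the two audit trails hold the same blocks again and `retransmitted` lists the blocks that had to be sent a second time.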
The object of the instant invention is to provide a sensing and regulation mechanism between a primary host and a secondary host wherein sectioned audit files established as audit blocks are organized for transfer from a primary host through a network communications bus over to a secondary host with the object of eventually using the received audit block files to update a remote database to keep it in synchronization with a database in the primary host.
In order to accomplish this, there is provided a tracker program and mechanism which is made sensitive to the number of audit blocks in the primary waiting to be transferred to the secondary compared with the number of audit blocks actually received in the secondary which will be used to update the secondary database. Due to transmission delays or broken network communication lines, there can develop a very undesirable out of synchronization situation between the audit block data in the primary and the audit block data in the secondary host. Thus the present sensing and regulation mechanism is devoted to sensing this difference gap and regulating it in order to expeditiously provide for a greater synchronization of audit block data between the primary host and the secondary host.
AUDIT TRAIL SYNCHRONIZATION: It is important to decide on the level of audit trail synchronization desired for the remote database backup system. This involves the question: "How closely must the backup database match its source database?" Or, to express it another way: "How closely should the secondary database audit trail replicate the primary database audit trail?"
MODES OF AUDIT TRANSMISSION: The remote database backup (RDB) system provides four specific audit transmission modes that enable the user to regulate whether the transmission of the audit images is to be automatic or manual; whether the audit images are to be transmitted as individual audit blocks or as whole audit files; whether the transmission of audit images can be interrupted, that is to say, suspended or not; and what is to be the degree of audit trail synchronization between the primary host and the secondary host. The focus of the present invention involves the use of one mode designated as the ABW or Audit Block Write mode.
AUDIT BLOCK WRITE (ABW): The secondary audit trail is to be constantly and automatically kept synchronized with the primary database audit trail on a block-by-block basis. The ABW mode enables this type of close synchronization level to occur by (i) handling interruptions to audit transmissions through one of two error handling options; or (ii) initiating a Catch-up process for the audit block transfer whenever the usual synchronization level is disrupted. This invention is devoted to the Catch-up process.
In the RDB utility, the user can specify the time interval between the detection of a need for the Catch-up process and the beginning of that process.
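The user-specified interval can be pictured as a simple delay gate, sketched here in Python. The class name `CatchupTimer`, its methods, and the use of seconds as the time unit are assumptions for illustration; the RDB utility's actual syntax for setting the interval is not shown in the text.

```python
class CatchupTimer:
    """Delay between detecting the need for Catch-up and starting it."""

    def __init__(self, interval_seconds):
        if interval_seconds < 0:
            raise ValueError("interval must not be negative")
        self.interval = interval_seconds
        self.detected_at = None  # time at which the need was first detected

    def need_detected(self, now):
        """Record the first detection; later detections do not restart it."""
        if self.detected_at is None:
            self.detected_at = now

    def should_start(self, now):
        """True once the configured interval has elapsed since detection."""
        return (self.detected_at is not None
                and now - self.detected_at >= self.interval)

    def reset(self):
        """Call when synchronization has been restored."""
        self.detected_at = None
```

Passing the current time in as a parameter, rather than reading a clock inside the class, keeps the sketch deterministic and easy to test.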
In a system wherein audit files are transferred from a primary source host through a network connection over to a secondary target backup host, it is essential to sense and regulate any disparity between the audit data in the primary host and the secondary host so that the secondary host does not lag too far behind duplicating the audit data that resides in the primary host.
To this end, the present invention provides a sensing and regulating method that maintains a comparative view of the sectioned audit blocks residing in the primary host and the number of those audit blocks that have arrived at the secondary host. It is necessary to sense how far the audit blocks in the secondary host lag behind the audit blocks accumulated in the primary host. In the present method, a tracker program senses any disparity between the audit blocks of the primary host and those of the secondary host. The tracker can be set to regulate how much disparity is allowed before it initiates a speed-up program that expedites the transfer of audit blocks from the primary host to the secondary host. Thus, when the disparity reaches a settable number K that represents an undesirable amount, the tracker initiates another program to expedite the transfer, reducing or eliminating the disparity so that the audit blocks in the secondary host duplicate, as nearly as possible in the present moment, the audit blocks in the primary host.
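The tracker behavior summarized above can be sketched in Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the class and method names are hypothetical, the disparity is modeled as a simple difference of block counts, and the trigger condition is assumed to be "disparity greater than or equal to K".

```python
class Tracker:
    """Senses the audit block disparity and starts Catch-up at threshold K."""

    def __init__(self, k, start_catchup):
        self.k = k                          # disparity regarded as undesirable
        self.start_catchup = start_catchup  # callback that expedites transfer
        self.primary_blocks = 0             # blocks written on the primary
        self.secondary_blocks = 0           # blocks received on the secondary
        self.catchup_active = False

    def disparity(self):
        """Number of audit blocks by which the secondary lags the primary."""
        return self.primary_blocks - self.secondary_blocks

    def block_written_on_primary(self):
        self.primary_blocks += 1
        self._check()

    def block_received_on_secondary(self):
        self.secondary_blocks += 1
        self._check()

    def _check(self):
        gap = self.disparity()
        if gap >= self.k and not self.catchup_active:
            # Assumed trigger rule: start Catch-up once, when the gap reaches K.
            self.catchup_active = True
            self.start_catchup(gap)
        elif gap == 0:
            # Secondary has fully caught up; allow a future Catch-up.
            self.catchup_active = False


calls = []  # records the disparity each time Catch-up is started
tracker = Tracker(k=3, start_catchup=calls.append)
for _ in range(3):
    tracker.block_written_on_primary()
```

In this usage, the third unacknowledged block brings the disparity to K = 3 and the callback is invoked once. Further blocks written while Catch-up is active do not start it again until the disparity has returned to zero.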