1. Field of the Invention
This invention generally relates to digital data processing systems adapted for simultaneous, diverse uses such as on-line transaction application or other priority processing applications and decision support system, backup and other applications that characterize data base management system operations.
2. Description of Related Art
Computer implemented data base management systems are exemplary of systems that operate with what can become two antithetical considerations, namely: (1) maintaining the integrity of the data on the system and (2) maintaining maximum availability of the data on the system. That is, in prior art systems backup operations to preserve data integrity and normal operations for using the data base were mutually exclusive operations. The considerations of data integrity and availability become antithetical when a backup operation interferes with normal operations or when normal operations, due to their priority, prevent a timely backup. These conflicts become more prevalent because, as the size of data bases increases, the time required to complete a conventional backup operation increases. Yet it remains an ultimate goal to have continuous availability of the data base for normal operations.
The maintenance of data integrity in such systems originally involved making copies of the data on the same or other storage devices such as disk drives or on other media such as magnetic tape to provide an historical backup. Typically, however, these systems required all other operations in the data processing system to terminate while the backup was underway. More recently disk redundancy has evolved as an alternative or complement to historical backups. Generally speaking, in a redundant system two storage devices, such as disk storage devices, store data in a form that enables the data to be recovered if one storage device becomes disabled. In a basic approach, a first disk storage device stores the data and a second disk storage device stores a mirror image of that data. Whenever a transfer is made to the first disk storage device, the data transfers to the second disk storage device essentially simultaneously. Typically separate controllers and paths interconnect the two disk storage devices to the remainder of the computer system.
While mirroring provides one type of redundancy, the procedures for obtaining historical backups still involve the transfer of data to a backup medium, such as magnetic tape. As previously indicated, in the past the backup operation has excluded the operation of other applications or programs. However, several systems have been proposed for providing concurrent backups. For example, U.S. Pat. No. 5,212,784 to Sparks discloses an automated concurrent data backup system in which a Central Processing Unit (CPU) transfers data to and from storage devices through a primary controller. The primary controller connects through first and second independent buses to first and second mirrored storage devices respectively (i.e., a primary, or mirrored, data storage device and a secondary, or mirroring, data storage device). A backup controller and device connect to the secondary storage device through its bus. Normally the primary controller writes data to both the primary and secondary data storage devices. The CPU initiates a backup through the primary controller. In response the primary controller then writes only to the primary data storage device and enables the backup controller to take control of the second bus and transfer data from the secondary data storage device to the backup media. After a backup operation is completed, the primary controller resynchronizes the storage devices by updating any changes that occurred to the primary data storage device while the backup operation was underway. Examples are also disclosed in which the primary controller connects to three and four storage devices that enable the system to operate with redundancy by mirroring two storage devices while the backup occurs with a third storage device.
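By way of illustration only, the split-and-resynchronize behavior described for the Sparks system can be sketched as a toy model. The class name, method names and the use of a set of changed block numbers to drive resynchronization are assumptions made for this sketch; the reference as summarized above does not state how the primary controller tracks the changes.

```python
class MirroredPair:
    """Toy model of a primary/secondary mirrored pair (all names hypothetical)."""

    def __init__(self, num_blocks):
        self.primary = [None] * num_blocks
        self.secondary = [None] * num_blocks
        self.backup_active = False
        # Assumption: changed blocks are remembered by number for the resync.
        self.dirty = set()

    def write(self, block, data):
        self.primary[block] = data
        if self.backup_active:
            # The secondary is frozen while the backup controller reads it.
            self.dirty.add(block)
        else:
            # Normal operation: both devices receive every write.
            self.secondary[block] = data

    def start_backup(self):
        self.backup_active = True
        self.dirty.clear()

    def backup(self):
        # The backup controller reads only the frozen secondary device.
        return list(self.secondary)

    def finish_backup(self):
        # Resynchronize: copy blocks changed on the primary during the backup.
        for block in sorted(self.dirty):
            self.secondary[block] = self.primary[block]
        self.dirty.clear()
        self.backup_active = False


pair = MirroredPair(4)
pair.write(0, "a")           # written to both devices
pair.start_backup()
pair.write(1, "b")           # written to the primary only
tape = pair.backup()         # image of the secondary as of start_backup()
pair.finish_backup()         # secondary catches up with the primary
```

In this sketch the backup image never sees the write made after the backup began, while the mirror is restored once the backup completes. Redundancy is lost for the duration of the backup, which is the situation the three and four device examples address.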
U.S. Pat. Nos. 5,241,668 and 5,241,670 to Eastridge et al. disclose different aspects of concurrent backup procedures. In both systems a request for a backup copy designates a portion of the stored data called a data set. For example, if the data storage devices contain a plurality of discrete data bases, a data set could include files associated with a corresponding data base. In a normal operation, the application is suspended to allow the generation of an address concordance for the designated data sets. Execution of the application then resumes. A resource manager is established to manage all input and output functions between the storage sub-systems and associated memory and temporary memory. The backup copy is formed on a scheduled and opportunistic basis by copying the designated data sets from the storage sub-systems and updating the address concordance in response to the copying. Application updates are processed during formation of the backup copy by buffering the updates, copying the affected uncopied designated data sets to a storage sub-system memory, updating the address concordance in response to the copying, and processing the updates. The designated data sets can also copy to the temporary storage memory if the number of designated data sets exceeds some threshold. The designated sets are also copied to an alternate memory from the storage sub-system, storage sub-system memory and temporary host memory utilizing the resource manager and the altered address concordance to create a specified order backup copy of the designated data sub-sets from the copied portions of the designated sub-sets without user intervention.
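The copy-before-update behavior described for the Eastridge et al. systems can be sketched, purely as an illustration, as follows. The dictionary standing in for the address concordance, the "side file" standing in for storage sub-system memory, and all class and method names are assumptions of this sketch and are not taken from the references.

```python
class ConcurrentBackupSession:
    """Toy copy-before-update backup of designated tracks (names hypothetical)."""

    def __init__(self, volume, tracks):
        self.volume = volume                    # live data: track -> contents
        # Stand-in for the address concordance: where the point-in-time
        # image of each designated track currently resides.
        self.concordance = {t: "volume" for t in tracks}
        self.side_file = {}                     # preserved original contents
        self.backup = {}                        # the backup copy being formed

    def application_write(self, track, data):
        # An update to a designated track that is not yet copied first
        # preserves the original contents, then the update is processed.
        if self.concordance.get(track) == "volume":
            self.side_file[track] = self.volume[track]
            self.concordance[track] = "sidefile"
        self.volume[track] = data

    def copy_next(self):
        # Copy one uncopied designated track to the backup; return its
        # number, or None when every designated track has been copied.
        for track, where in self.concordance.items():
            if where == "copied":
                continue
            if where == "volume":
                self.backup[track] = self.volume[track]
            else:
                self.backup[track] = self.side_file.pop(track)
            self.concordance[track] = "copied"
            return track
        return None


volume = {0: "a", 1: "b", 2: "c"}
session = ConcurrentBackupSession(volume, [0, 1, 2])
session.copy_next()                  # track 0 is copied first
session.application_write(1, "B")    # original "b" is preserved first
session.application_write(0, "A")    # track 0 already copied; no preservation
while session.copy_next() is not None:
    pass
```

The backup formed in this sketch reflects the data as of the start of the session even though the application continued to update the volume throughout.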
If an abnormal event occurs requiring termination of the backup, a status indication is entered into activity tables associated with the plurality of storage sub-systems and devices in response to the initiation of the backup session. If an external condition exists that requires the backup to be interrupted, the backup copy session terminates and indications within the activity tables are reviewed to determine the status of the backup if a reset notification is raised by a storage sub-system. This enables the track extents which are active for a volume associated with a particular session to be determined. A comparison is then made between the track extents which are active and the volume and track extent information associated with a physical session identification. If a match exists between the track extents which are active and the volume and track extent information associated with a physical session identification, the backup session resumes. If the match does not exist, the backup terminates.
U.S. Pat. No. 5,473,776 to Nosaki et al. discloses a concurrent backup operation in a computer system having a central processing unit and a multiple memory constituted by a plurality of memory devices for on-line storing data processed by tasks of the central processing unit. A data backup memory is provided for saving data of the multiple memory. The central processing unit performs parallel processing of user tasks and a maintenance task. The user tasks include those that write currently processed data into the multiple memory. The maintenance task stops any updating of memory devices as a part of the multiple memory and saves the data to a data backup memory.
Each of the foregoing references does disclose an approach for performing backup operations concurrently with the execution of applications programs in a computer system. However, in each, the system operates in the environment of a single computer system under common control. For example, in the Sparks patent the CPU connects through a primary controller to the first and second memories and to the backup controller. The Eastridge et al. and the Nosaki et al. patent references disclose systems in which the execution of applications programs is also involved in the backup operation. Further, while these references disclose systems for concurrent backup operations, they do not disclose or suggest any procedures for enabling the simultaneous processing of common data by different applications, such as On Line Transaction Processing (OLTP) applications and Decision Support System (DSS) applications.
More recently the concept of redundancy has come to include remote data facilities. A computer system with a remote data facility will include a first data processing system with disk storage as a local site facility and one or more duplicate data processing systems at one or more physically remote locations that operate as one or more mirrors of the data collection in the first system. The physical separation can be measured in any range between meters and hundreds or even thousands of kilometers. In whatever form, the remote data facility provides data integrity with respect to any system errors produced by power failures, equipment failures and the like.
Storage facilities using redundancy including remote data facilities have become repositories for large data bases that also are dynamic entities. They are subject to rapid change as for example in banking systems by bank teller and automatic teller machine (ATM) entries or by requests for passenger tickets in airline reservation systems. In many data base systems OLTP applications maintain the data base in a current state while DSS or query applications enable individuals to obtain reports based upon the contents of the data base.
In early systems the OLTP and DSS applications ran on a mutually exclusive basis. That is, no DSS applications could run while OLTP applications were being processed. Conversely no OLTP application processing could occur while the DSS applications were in use. Certain levels of data integrity were provided to assure the validity of entry data in such systems. For example, U.S. Pat. No. 5,450,577 to Lai et al. discloses a high capacity transaction system in which integrity is assured while transaction processing is underway. In this particular approach, a system receives events from an event generator and stores the raw events to disk, the raw events corresponding, for example, to different data entries for a particular record. Structural information relating events to transactions is not stored on disk. This provides data integrity during the construction of raw events to form a transaction or record to be posted to the data base.
Referring to the issue of availability, the increase in the number of transactions posted to such data bases and the need for twenty-four hour transaction processing, particularly introduced by the sheer number of transactions being processed and worldwide access, has led to an ultimate goal of continuous availability for processing OLTP applications. It is no longer acceptable to interrupt the process of OLTP applications for purposes of processing DSS applications. Yet, if this requirement were strictly construed, it would never be possible to obtain queries, so the data base would, in effect, be useless. Consequently steps have been taken to maximize the availability of a system for processing OLTP or other priority applications while still permitting the processing of DSS applications on a timely basis.
U.S. Pat. No. 5,317,731 to Dias et al. discloses one approach for providing separate processes for on-line transaction application and decision support system application processing. In this patent on-line transaction and decision support system application processing are referred to as transaction and query processing respectively. Dias et al. utilize an intelligent page store for providing concurrent and consistent access by a functionally separate transaction entity and a query entity to a shared data base while maintaining a single physical copy of most of the data. The intelligent page store contains shared disk storage. An intelligent versioning mechanism allows simultaneous access by a transaction processor and a query processor. The transaction processor is presented current data while the query processor is presented a recent and consistent version of the data. In this particular approach both the transaction and query processors operate independently of each other and are separately optimized. However, the query processor apparently can only read data from the intelligent page store.
U.S. Pat. No. 5,495,601 to Narang et al. discloses an alternative approach for separating on-line transaction and decision support system application processing. In this particular embodiment transactions directly affect data at a series of disks through a controller. When a decision support application is processed, a host produces a series of parameters that pass to the controller and represent the selection criteria for records in a data base. The controller then operates on the data base independently of the host to identify those records satisfying the criteria. While this occurs, the host temporarily stores any updates due to transactions in a buffer pool. The decision support system seems to be limited to read-only operations.
U.S. Pat. No. 5,504,888 (1996) to Iwamoto et al. discloses a file updating system employing the temporary connection and disconnection of buffer storage to extended storage. Extended storage becomes available for dedicated use by a batch process that updates data, which eliminates contention for resources with an on-line process, that is, a normally run application that accesses the data on a file disk. During normal operations, during which the batch processing is inactive, read and write transfers requested by the on-line process establish a data path from an on-line process buffer through an extended storage unit to a file disk. When batch processing is to occur this path is terminated, and the on-line process thereafter can only read data from the file disk. The batch process can receive data as needed from the file disk through the extended storage unit but writes data or transfers data updates only to the extended storage unit. When batch processing has been completed, a data path is established from the extended storage unit to the on-line process buffer, and the updated data stored in the extended storage unit transfers to the file disk. This particular approach is adapted for data processing systems particularly involving data bases which are relatively static in content, such that periodic, batch-processed updates are satisfactory. The fact that the on-line process can only perform reading operations while the batch process is active limits the use of this methodology. Such an approach is not readily adapted for use in a data processing system as used in banking, reservations or other systems in which the data base changes dynamically.
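The limitation noted for the Iwamoto et al. approach can be seen in a minimal sketch. Everything named below is hypothetical; in particular, having the batch process read back its own staged updates before falling through to the file disk is an assumption of this sketch and is not stated in the summary above.

```python
class FileUpdatingSystem:
    """Toy model of batch updates staged in extended storage (names hypothetical)."""

    def __init__(self):
        self.file_disk = {}
        self.extended_storage = {}   # updates staged by the batch process
        self.batch_active = False

    def online_read(self, key):
        return self.file_disk.get(key)

    def online_write(self, key, value):
        # While the batch process is active the on-line process is read-only.
        if self.batch_active:
            raise PermissionError("on-line writes are blocked during batch")
        self.file_disk[key] = value

    def begin_batch(self):
        self.batch_active = True

    def batch_read(self, key):
        # Assumption: staged updates are visible to the batch process itself.
        if key in self.extended_storage:
            return self.extended_storage[key]
        return self.file_disk.get(key)

    def batch_write(self, key, value):
        # Batch updates go only to extended storage, never to the file disk.
        self.extended_storage[key] = value

    def end_batch(self):
        # On completion the staged updates transfer to the file disk.
        self.file_disk.update(self.extended_storage)
        self.extended_storage.clear()
        self.batch_active = False


system = FileUpdatingSystem()
system.online_write("balance", 100)
system.begin_batch()
system.batch_write("balance", 110)
seen_online = system.online_read("balance")    # still the old value
try:
    system.online_write("balance", 90)
    blocked = False
except PermissionError:
    blocked = True
system.end_batch()
```

The rejected on-line write in this sketch corresponds to the restriction that makes the approach unsuitable for data bases that change dynamically.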
U.S. Pat. No. 5,592,660 to Yokota et al. discloses a data base management system that performs retrieval process and updating process operations alternately. The data processing system in this patent is disclosed in terms of a transaction data base system processing device with a data base storage device and a decision support data base system that includes two decision data base storage devices. Each interval during which the transaction data base system updates a record in the transaction data base is a predetermined time interval. A delayed updating device in the decision support data base system receives a log created by the change to the transaction data base during each predetermined time interval. At each predetermined time interval, the delayed updating device alternately supplies both the log received at a current predetermined time interval and the log received immediately preceding the current predetermined time interval to a first data base storage device and to a second data base storage device. A retrieving device executes a retrieving process for the second decision data base stored in the second data base storage device when the delayed updating device supplies both logs to the first data base storage device. The retrieving device also executes a retrieving process for the first decision data base stored in the first data base storage device when the delayed updating device supplies both logs to the second data base storage device. In essence, the retrieval job processing accesses one or the other of the two data base storage devices associated with the decision support data base system while the delayed updating part operates with the other of those storage devices.
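The alternation described for the Yokota et al. system can be illustrated with a simplified, single-threaded sketch. The class and method names are hypothetical, and the logs are modeled as lists of key and value pairs, which is an assumption of this sketch. Each copy receives two logs at a time because it missed the preceding interval's log while it was serving retrievals.

```python
class AlternatingDecisionStore:
    """Toy model of two decision data base copies updated in alternation."""

    def __init__(self):
        self.copies = [{}, {}]       # first and second decision data bases
        self.update_target = 0       # index of the copy to be updated next
        self.previous_log = []       # log from the immediately preceding interval

    def read_copy(self):
        # Retrievals use the copy that is not being updated.
        return self.copies[1 - self.update_target]

    def end_interval(self, current_log):
        # Apply the preceding log and the current log to the target copy,
        # then swap the roles of the two copies.
        target = self.copies[self.update_target]
        for key, value in self.previous_log + current_log:
            target[key] = value
        self.previous_log = current_log
        self.update_target = 1 - self.update_target


store = AlternatingDecisionStore()
store.end_interval([("x", 1)])
first_view = dict(store.read_copy())
store.end_interval([("y", 2)])
second_view = dict(store.read_copy())
store.end_interval([("x", 3)])
third_view = dict(store.read_copy())
```

In this sketch each retrieval sees every change logged up to the end of the most recently completed interval, at the cost of maintaining two full decision copies in addition to the transaction data base.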
Most of the foregoing references do not provide alternatives for maximizing the availability of a system for processing OLTP or like priority applications, nor do they effect a complete segregation of those processes. Most of the last four cited references fail to provide any suggestions for procedures that will provide data redundancy. Moreover the processing of decision support system or equivalent applications is limited to read only operations. This can limit the range of procedures that decision support system applications can perform.
While the Yokota et al. patent discloses separate data processing systems for the transaction job, or OLTP, processing and for the decision support system processes or applications, a data processing system operating in accordance with the disclosure seems to require disk storage capacity of three times the capacity required for storing one copy of the data base. That is, it appears that the primary copy of the data base is stored in one disk for access by the transaction job processing part (i.e., the OLTP processing application). Two additional copies are required for the decision support database system. Still additional storage may be required for maintaining update logs in the transaction job database system. Provisions must be made to transfer the update log information from the transaction job database system to the decision support database system. These transfers will require data processor resources. In many applications, the allocation of such resources from the OLTP processing computer system can introduce intolerable delays in the rate of transaction processing. In addition all data seems to transfer only to the decision support database system. There appears to be no way to transfer data from the decision support database system to the transaction job database system.