This invention relates to operating a data processing system and more particularly to recovery and restart of a batch application in a data processing system.
In complex data processing systems such as global banking networks running on mainframe systems, a transaction processing system (such as the CICS® transaction server from IBM) manages the interface between application programs and database records. If an application wishes to access a record (such as a customer's bank account balance) stored in a database, then the transaction processing system mediates the transaction. The transaction processing system retrieves the record from the database and places a lock on that record so that no other application can access or update it while it is locked. Read or write requests originating from the accessing application program and relating to the record are then processed. Once the application program has finished with the record, a syncpoint is issued to the transaction processing system, which results in the lock being removed from the record.
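The lock-then-syncpoint cycle described above can be illustrated with a minimal, single-threaded sketch. It is not the CICS interface: the class `TransactionManager`, its methods `read`, `write` and `syncpoint`, and the exception `RecordLockedError` are hypothetical names chosen for illustration, and the sketch models only lock ownership, not commit or logging behavior.

```python
class RecordLockedError(Exception):
    """Raised when a record is locked by another application."""


class TransactionManager:
    """Hypothetical mediator between applications and database records."""

    def __init__(self, records):
        self.records = dict(records)  # record key -> stored value
        self.lock_owner = {}          # record key -> id of the application holding the lock

    def _lock(self, app_id, key):
        # Refuse access if a different application already holds the lock.
        owner = self.lock_owner.get(key)
        if owner is not None and owner != app_id:
            raise RecordLockedError(f"{key} is locked by {owner}")
        self.lock_owner[key] = app_id

    def read(self, app_id, key):
        # Accessing a record locks it for the accessing application.
        self._lock(app_id, key)
        return self.records[key]

    def write(self, app_id, key, value):
        self._lock(app_id, key)
        self.records[key] = value

    def syncpoint(self, app_id):
        # Remove every lock held by this application and report which records were freed.
        released = [k for k, owner in self.lock_owner.items() if owner == app_id]
        for k in released:
            del self.lock_owner[k]
        return released
```

In this sketch, a second application that touches a locked record is rejected with `RecordLockedError` until the first application calls `syncpoint`; a real system might instead make the second application wait.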
In addition to application programs accessing and updating records via a transaction processing system, batch applications are also used to update records. A batch application effectively accesses the data stored in the records off-line. Typically, a single batch application processes a very large number of updates, often numbering in the thousands. In the context of a banking system, a batch application may relate to a series of over-the-counter transactions that need to be applied to computerized records representing the various bank account details of the customers.
A batch application is an automated procedure that processes all of the updates in the batch. Once a batch application is started, each update is processed in turn. This involves accessing each record to be updated, locking the record and then performing the necessary update. Once all of the records referred to in a batch application have been processed, then a syncpoint can be issued and all of the locked records can be unlocked and therefore made available to other applications.
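The batch behavior described above, in which every touched record stays locked until a single syncpoint at the end, can be sketched as follows. The function `run_batch`, its parameters, and the identifier `batch-1` are hypothetical and serve only as an illustration; the updates are modeled as simple amounts added to a balance.

```python
def run_batch(records, updates, locks, batch_id="batch-1"):
    """Apply each update in turn, holding every lock until one final syncpoint.

    records: dict mapping record key -> balance (mutated in place)
    updates: list of (record key, amount) pairs
    locks:   dict mapping record key -> owner id, shared with other applications
    Returns the number of records the batch held locked just before the syncpoint.
    """
    held = set()  # records locked by this batch so far
    for key, amount in updates:
        if key not in held:
            if key in locks:
                # Another application owns the record; the batch cannot proceed.
                raise RuntimeError(f"{key} is locked by {locks[key]}")
            locks[key] = batch_id
            held.add(key)
        records[key] += amount

    # Locks are never released mid-batch, so the peak equals the final count.
    peak = len(held)

    # Single syncpoint: unlock everything so other applications can proceed.
    for key in held:
        del locks[key]
    return peak
```

Because no lock is released before the end, the number of simultaneously locked records grows with the number of distinct records in the batch, which is the source of the contention discussed in the following paragraphs.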
In the past, the operation of transaction processing systems and batch applications did not interfere with each other, as the transaction processing system could be taken offline, for example, at night, and any and all batch applications could be executed at this time. However, the globalization of markets and organizations has meant that systems such as CICS are expected to be online twenty-four hours a day, as access to data records by application programs is constantly required by many data processing systems. This conflicts with the requirements of a batch application, which needs to lock a large number of records when it is processing a data batch.
In relation to file sharing between CICS and batch applications, at present batch applications running on operating systems such as IBM's z/OS® operating system cannot share records for update with a CICS region running in the same image. Traditionally CICS is taken down, the batch applications are run to update the records and then CICS is restarted. However, there is a need for businesses to be able to run their batch jobs while transaction processing systems, such as CICS, are running so that the transaction processing system is available 24 hours a day. Businesses wish to achieve this without any changes to their batch applications.
The reliability of modern hardware and software, combined with online management tooling and other software enhancements, has made system outages rare events. However, there remains one very significant reason for planned CICS system outages: the need for processes other than CICS to operate on data that is “owned” by CICS. Such processes, traditionally batch applications but potentially, in the future, web-related Java™ applications, cannot currently operate without temporarily making data unavailable to CICS. This makes entire CICS systems, or major applications, temporarily unavailable to online users. With the transaction rates achievable in modern Internet-driven systems, any outage can result in significant loss of service to key customers and loss of revenue to enterprises.
Today, the time available to run batch applications has become constrained for many customers because of factors including data center consolidation, globalization of call centers, changes in working practices, and other social and economic changes such as Sunday trading.
Users have found partial solutions to this problem, mainly based on tools that minimize the impact of the unavailability of data, either by careful scheduling or by limiting either the scope (number of data sets) of outages or their duration. To date, no solution has attempted to eliminate outages entirely.