The concept of dynamic workload management of user applications has been known in the art for many years. Fundamentally, it consists of selecting, from a candidate set of servers, the server to which each individual piece of work should be dispatched, based upon server state, an appropriate routing algorithm and any relevant affinity data. Candidate sets are usually defined via administrative dialogs related to user applications. Routing algorithms are typically based on throughput or response time. An affinity arises when the application references data in the target server that must be accessed again by subsequent requests; whilst the affinity is active, those requests must return to the same server and cannot be routed dynamically.
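By way of illustration only, the routing decision described above may be sketched as follows. This is a minimal sketch under stated assumptions and does not represent the implementation of any particular product: the names `Server`, `Router` and `route` are hypothetical, server state is reduced to an activity flag and a task count, and a simple least-loaded rule stands in for a throughput-based or response-time-based routing algorithm.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Server:
    """Hypothetical candidate server; 'load' is a stand-in for server state."""
    name: str
    active: bool = True
    load: int = 0


@dataclass
class Router:
    """Hypothetical router over an administratively defined candidate set."""
    candidates: List[Server]
    # Maps an affinity key (for example, a user identifier) to a server name.
    affinities: Dict[str, str] = field(default_factory=dict)

    def route(self, affinity_key: Optional[str] = None) -> Server:
        if affinity_key is not None and affinity_key in self.affinities:
            # An active affinity overrides dynamic routing entirely.
            bound = self.affinities[affinity_key]
            target = next(s for s in self.candidates if s.name == bound)
            if not target.active:
                raise RuntimeError(f"affinity target {bound} is unavailable")
        else:
            # Assumed routing algorithm: least-loaded active candidate.
            live = [s for s in self.candidates if s.active]
            if not live:
                raise RuntimeError("no active candidate server")
            target = min(live, key=lambda s: s.load)
            if affinity_key is not None:
                # Bind the affinity so that later requests return here.
                self.affinities[affinity_key] = target.name
        target.load += 1
        return target
```

In this sketch, a request carrying a previously bound affinity key is returned to the same server even when another candidate is less loaded, which is the restriction on dynamic routing noted above.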
Distributing applications has the advantage that, should an individual server fail, only those tasks currently executing on that server are affected; user tasks on other servers are unaffected, thus reducing the impact of the failure on application availability. Users whose work was executing on the failed server must manually restart their transactions, and those transactions must be routed to another (active) candidate server for the application to execute.
Such solutions provide increased availability of applications to end users even though the availability of the underlying individual servers may change. They can also improve throughput and application response times by smoothing the effects of the workload across multiple address spaces.
One implementation of a dynamic workload management system, for IBM's CICS® Transaction Server, is provided by the CICSPlex SM Workload Management (WLM) component. This component is invoked by CICS at suitable points in processing requests to identify the target region for the request, and to manage application affinities. (CICS is a trademark of IBM Corporation, registered in many jurisdictions).
Dynamic workload distribution schemes can further be complicated by the effects of parts of the work being included in what is known in the art of transaction processing as a Unit Of Work (UOW). The concept of transactional units of work is well known to those of ordinary skill in the transaction processing art, and thus needs no further description here. Examples considered in the present description relate to IBM's DB2® database records and IBM's VSAM records accessed in the target regions, but it will be clear to one of ordinary skill in the art that the same considerations apply to records or other data controlled using other resource management systems that are capable of interacting with a transaction processing system. (DB2 is a trademark of IBM Corporation, registered in many jurisdictions).
When a unit of work is distributed across multiple servers, a recovery manager must make a note of every server within the unit of work that has performed recoverable work on its behalf. This information is used at successful completion (syncpoint commit) to commit those pieces of work in the servers, or to back out the changes (syncpoint rollback) should the application request it, should the task terminate abnormally (abend), or should the region terminate abnormally (in which case the backout occurs at region restart). As is known to those of skill in the art, the more distributed the unit of work, the more processing is required and the greater the risk of a component not being available. It is further well known that multiple requests for the same record within a unit of work must be within the same server region, otherwise a deadlock occurs, and that applications that request the same record outside that unit of work, when a retained lock exists (caused by either a task abend or a server abend), will not be able to execute.
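The bookkeeping described above may be illustrated, by way of example only, with the following minimal sketch. The names `Region` and `UnitOfWork` are hypothetical, each region's recoverable data is modelled as an in-memory dictionary, and syncpoint processing is simplified to a single pass over the noted participants; locking, logging and restart recovery are deliberately omitted.

```python
class Region:
    """Hypothetical server region holding recoverable data."""

    def __init__(self, name):
        self.name = name
        self.committed = {}  # data made permanent by a syncpoint commit
        self.pending = {}    # uncommitted changes of the current unit of work

    def update(self, key, value):
        self.pending[key] = value

    def commit(self):
        self.committed.update(self.pending)
        self.pending.clear()

    def rollback(self):
        # Back out: discard every uncommitted change.
        self.pending.clear()


class UnitOfWork:
    """Hypothetical recovery manager for one distributed unit of work."""

    def __init__(self):
        # Regions that have performed recoverable work in this unit of work.
        self.participants = []

    def update(self, region, key, value):
        region.update(key, value)
        if region not in self.participants:
            self.participants.append(region)  # note the participant once

    def commit(self):
        # Syncpoint commit: commit the work in every noted region.
        for region in self.participants:
            region.commit()
        self.participants = []

    def rollback(self):
        # Syncpoint rollback: back out the work in every noted region.
        for region in self.participants:
            region.rollback()
        self.participants = []
```

Each additional region touched by the unit of work adds one more participant that must be reachable at syncpoint, which is why a more widely distributed unit of work entails more processing and a greater exposure to component unavailability.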
Also, once a target region has been selected for a piece of work with a declared affinity, the affinity is bound and subsequent requests will be routed to that region, regardless of whether any recoverable work has been done in the target region. This is particularly disadvantageous to processing efficiency when the target region rejects the request, unnecessarily reducing application availability, and causing unnecessary backouts when a chain of distributed program links (DPLs) is involved and one program in the chain abends.
It is desirable to address these shortcomings of known transaction processing systems wherein the conflicting needs for transactional control of dynamically-routed work requests and affinity management of the work requests cannot be reconciled other than partially, and by means of expensive and potentially error-prone application and system redesign.