The present disclosure relates generally to a method and system for providing a message-time-ordering facility, and particularly to a method and system for executing messages in a PARALLEL SYSPLEX, a registered trademark of International Business Machines Corporation (hereafter IBM), environment in a correct time order even when the clocks of the participating systems are not in perfect synchronization.
Large computer systems have evolved over the years from single system uniprocessors, to tightly-coupled multiprocessors, to loosely-coupled configurations, and finally to sysplex configurations (e.g., IBM Sysplex and IBM PARALLEL SYSPLEX). A single system uniprocessor includes a single central processor complex (CPC) made up of a single central processor (CP) and all associated system hardware and software, controlled by a single copy of the operating system. Tightly coupled multiprocessors include a number of CPs added to a CPC that share central storage and a single copy of the operating system. Work is assigned to an available CP by the operating system and can be rerouted to another CP if the first CP fails. A loosely coupled configuration has multiple CPCs (which may be tightly coupled multiprocessors) with separate storage areas, managed by more than one copy of the operating system and connected by channel-to-channel communications.
A sysplex is similar to a loosely coupled configuration, but differs in that it has a standard communication mechanism (e.g., a coupling facility) for communication between application programs located on one or multiple computers. The sysplex is made up of a number of CPCs that collaborate, through specialized hardware and software, to process a workload. PARALLEL SYSPLEX environments may also include workload manager functions to manage the resources through dynamic workload balancing and prioritization according to user-defined criteria. In addition, PARALLEL SYSPLEX environments may include data-sharing capabilities that support simultaneous, multiple-system access to data. An example PARALLEL SYSPLEX environment includes two or more systems connected to a coupling facility by either intersystem-channel (ISC) links or integrated-cluster-bus (ICB) links. In addition, an external time reference (ETR) is connected to the two or more systems via ETR links.
Many PARALLEL SYSPLEX computer applications (e.g., a stock trading application) depend on transactions being performed in the correct sequence by multiple systems. The correct sequencing of transactions between systems requires that the time-of-day (TOD) clocks on the respective systems be synchronized to within the signalling time (i.e., the time required for the communication between the loosely-coupled systems). Electro-mechanical variations in the systems' clocks allow them to drift apart from each other. The ETR ensures that the clocks are resynchronized approximately once every second, but during the intervening time a clock may drift by as much as five parts per million (i.e., 5 microseconds over the one-second interval). The drift can be caused by a variety of factors including temperature, electricity and age. Thus, the clock offset of any two systems in a PARALLEL SYSPLEX configuration may be as large as ten microseconds when the clocks are drifting in opposite directions. When PARALLEL SYSPLEX was first introduced in the early 1990s, the signalling time needed to communicate a message through the coupling facility was significantly greater than the maximum-possible clock offset, and thus the synchronization requirements described above were met.
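The arithmetic behind these figures can be illustrated with a minimal sketch. It assumes the drift rate and resynchronization interval stated above (five parts per million, once per second). The function names and the example signalling times are hypothetical and are chosen only for illustration; they are not part of any IBM interface. The sketch reads the synchronization requirement as "the worst-case clock offset must not exceed the signalling time."

```python
def max_drift_us(drift_ppm: float, resync_interval_s: float) -> float:
    """Worst-case drift of a single clock between resynchronizations, in microseconds.

    One part per million of one second is one microsecond, so
    ppm multiplied by seconds yields microseconds directly.
    """
    return drift_ppm * resync_interval_s


def max_offset_us(drift_ppm: float, resync_interval_s: float) -> float:
    """Worst-case offset between two clocks drifting in opposite directions."""
    return 2 * max_drift_us(drift_ppm, resync_interval_s)


def ordering_requirement_met(signalling_time_us: float,
                             drift_ppm: float = 5.0,
                             resync_interval_s: float = 1.0) -> bool:
    """True when the worst-case clock offset does not exceed the signalling time.

    This is one reading of the requirement described in the text, not a
    definitive statement of how any system checks it.
    """
    return max_offset_us(drift_ppm, resync_interval_s) <= signalling_time_us


if __name__ == "__main__":
    print(max_offset_us(5.0, 1.0))  # worst-case offset for the stated figures
    # Hypothetical signalling times, for illustration only.
    for signalling_time_us in (50.0, 8.0):
        print(signalling_time_us, ordering_requirement_met(signalling_time_us))
```

With the stated figures the worst-case offset is ten microseconds, so a hypothetical signalling time of 50 microseconds satisfies the requirement while a hypothetical signalling time of 8 microseconds does not.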
Since the introduction of PARALLEL SYSPLEX, the speed of the processors upon which the coupling facility and its attaching systems run has steadily increased, as has the speed of the ISC and ICB communication links. However, the precision of the ETR has not changed, and a clock drift of several microseconds between resynchronizations is still possible. As processor speeds continue to increase, the signalling time will eventually be less than the maximum-possible clock offset, and the requirements for clock synchronization will no longer be met by resynchronizing the clocks once every second as described above.