The sophistication of today's data processing systems permits events such as dynamic microcode installation and the reassignment of storage to occur. However, such events require stopping any central processors affected by the events, which often results in outages or in the termination of software programs that are timed with the time-of-day clock or an external time reference. In addition, in a coupled system environment, a central processor stop can prevent the system from maintaining a heartbeat or freeing a critical global resource, which could, in turn, result in another system taking a disruptive recovery action against the system experiencing the stop.
The advent of larger and larger machines and the implementation of new, more complex hardware functions in these machines have increased both the number and the duration of disruptive hardware actions, specifically central processor stops. In addition, software systems, such as operating systems, subsystems and applications executing on the machines, have become more sensitive to the disruptions.
A stop of a central processor may cause a disruptive recovery action, such as termination of a unit of work, alternate central processing unit recovery (ACR), or partitioning of a sysplex member, which requires a subsequent re-IPL. The interdependency between the software systems and a cross-system coupling facility environment makes this problem worse. A cross-system coupling facility requires constant communication between the separate operating system images, which run either on separate central processing complexes (CPCs) or in separate logical partitions of one central processing complex operating in logically partitioned mode. A hardware event on one such central processing complex, or a hardware action taken in response to a request from one of the operating system images, can adversely affect many or all of the other operating systems. There is presently no way for operating systems in such an environment to be notified that a disruptive hardware action is to occur, and there is no way for the affected operating systems to delay the hardware event until they are ready.
Presently, the hardware and software indicate changed states or capabilities only after the function causing the change has been completed, and not prior to initiating the change. Often, such a change produces an unanticipated disruption to the processing on the system. If the operating system received advance notification that a disruptive event was to occur, the operating system could take action to minimize the effects of that disruption.
There is no precedent for advance notification of a pending hardware-initiated stop of the central processor. In one instance, when International Business Machines' Multiple Virtual Storage (MVS) System performs a storage reconfiguration, MVS knows that a stop will follow and, therefore, sets a bit for MVS recovery (excessive spin loop detection) so that the problem is ignored. However, this is not an advance notification mechanism. In addition, operating systems that are running on other machines or in other logical partitions of the same machine remain unaware of the stop. For them, the central processor stop is still unexpected and disruptive.
Therefore, a need exists for an advance notification system in which all of the affected operating systems are notified prior to the performance of a disruptive hardware action, such as a central processor stop. Further, a need exists for the operating systems to be able to indicate whether they are ready for such a stop or whether they need additional time before the stop may occur.
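
By way of illustration only, the kind of exchange called for above might be sketched as follows. The sketch is not drawn from any existing system: the names `OsImage`, `StopDecision` and `negotiate_stop` are hypothetical, time is modeled as abstract ticks, and the upper bound on how long the hardware waits (`max_delay`) is an assumption introduced here so that the example terminates. Each notified operating system image reports whether it is ready, and the stop is deferred while any image still needs additional time.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OsImage:
    """Hypothetical operating system image that is notified of a pending stop."""
    name: str
    ready_at: int  # assumed tick at which this image has finished preparing

    def is_ready(self, now: int) -> bool:
        # The image answers "ready" once its preparation time has passed.
        return now >= self.ready_at


@dataclass
class StopDecision:
    """Outcome of the notification exchange (illustrative only)."""
    proceed_at: int                # tick at which the stop is performed
    forced: bool                   # True if the delay bound expired first
    not_ready: List[str] = field(default_factory=list)


def negotiate_stop(images: List[OsImage], max_delay: int) -> StopDecision:
    """Notify every affected image, then defer the stop until all report
    ready or until max_delay ticks have passed (an assumed policy)."""
    if max_delay < 0:
        raise ValueError("max_delay must be non-negative")
    not_ready: List[str] = []
    for now in range(max_delay + 1):
        # Ask each image whether it is ready or needs additional time.
        not_ready = [img.name for img in images if not img.is_ready(now)]
        if not not_ready:
            return StopDecision(proceed_at=now, forced=False)
    # The bound expired while some images were still asking for more time.
    return StopDecision(proceed_at=max_delay, forced=True, not_ready=not_ready)
```

For example, `negotiate_stop([OsImage("A", 0), OsImage("B", 2)], max_delay=5)` defers the stop until tick 2, when both images report ready. Whether a stop may proceed once such a bound expires is a policy choice that the discussion above leaves open.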