1. Field of the Invention
The present invention relates to a data processing system having a facility for handling words which represent the occurrence of events (hereinafter called "event words"), wherein an event word, as it occurs, is first placed in a queue and then reported to a processing destination, and more particularly to a data processing system having an event word handling facility whereby an event word of higher significance can be reported without fail.
2. Description of the Related Art
In a large general-purpose computer or a supercomputer to which a large number of peripherals such as disk devices, magnetic tape (MT) devices, printers, etc. are connected, a channel subsystem is provided in order to improve data transfer speeds to and from the peripheral devices. The channel subsystem includes a plurality of channel processors for controlling the individual peripheral devices connected to the respective channels, and an I/O processor responsible for overall control of the plurality of channel processors.
In a computer system of the above configuration, when a failure occurs in a peripheral device, the I/O processor is notified of the failure event via the channel processor responsible for controlling that peripheral device. When alerted to the occurrence of the failure event, the I/O processor places an event word associated therewith into a queue buffer, which is also accessible from the main CPU, and at the same time sends a machine check interrupt to the main CPU. The main CPU then handles the interrupt and performs the necessary processing by sequentially reading out the event words queued in the buffer.
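The reporting flow described above can be sketched as follows. This is only a minimal illustration in Python, not the actual hardware mechanism: the names `EventQueue`, `report_event`, and `handle_machine_check` are hypothetical, and a plain callback is assumed to stand in for the machine check interrupt.

```python
from collections import deque


class EventQueue:
    """Hypothetical stand-in for the queue buffer shared by the
    I/O processor and the main CPU."""

    def __init__(self, raise_interrupt):
        self._buffer = deque()
        # Assumed callback modelling a machine check interrupt to the main CPU.
        self._raise_interrupt = raise_interrupt

    def report_event(self, event_word):
        """I/O processor side: queue the event word, then interrupt the main CPU."""
        self._buffer.append(event_word)
        self._raise_interrupt()

    def handle_machine_check(self):
        """Main CPU side: read out the queued event words in arrival order."""
        handled = []
        while self._buffer:
            handled.append(self._buffer.popleft())
        return handled


# Usage: two failure events are queued, then drained by the main CPU.
interrupts = []
queue = EventQueue(raise_interrupt=lambda: interrupts.append("machine check"))
queue.report_event("disk failure")
queue.report_event("printer failure")
drained = queue.handle_machine_check()
```

In this sketch every queued event word raises its own interrupt; the source does not state whether interrupts are coalesced, so that detail is an assumption.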
Failures that may occur in peripheral devices can be classified into two types: a minor failure that can be recovered by a self-recovery process at the subsystem side, and a major failure from which such recovery is not possible. In either case, the failure occurrence is queued as an event word in the buffer. When alerted to the occurrence of a major failure, the main CPU issues, for example, a system reset command for an I/O interface for recovery by software. When the system reset by the I/O interface is complete, this event is also reported to the main CPU. If the completion of the system reset is not reported within a predetermined period, the main CPU performs the necessary processing, such as isolating the failed peripheral device from the system.
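The wait for the reset-completion report can be illustrated roughly as below. The sketch assumes a hypothetical `poll_event` callable that returns the next reported event word (or `None` when nothing is pending), counts the predetermined period in polling ticks rather than real time, and uses the made-up event word `"SYSTEM_RESET_COMPLETE"`; none of these names come from the system described.

```python
def await_reset_completion(poll_event, timeout_ticks):
    """Wait up to timeout_ticks polls for the reset-completion event word.

    Returns "reset complete" if the completion is reported in time,
    otherwise "isolate device" (the fallback action of the main CPU).
    """
    for _ in range(timeout_ticks):
        if poll_event() == "SYSTEM_RESET_COMPLETE":
            return "reset complete"
    return "isolate device"


# Usage: the completion report arrives on the third poll.
reported = iter([None, None, "SYSTEM_RESET_COMPLETE"])
in_time = await_reset_completion(lambda: next(reported, None), timeout_ticks=5)

# Usage: the completion report never arrives (for example, it was discarded).
never = await_reset_completion(lambda: None, timeout_ticks=5)
```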
In the above-described system, the number of event words that can be held in the buffer is finite, and when the buffer overflows, newly occurring event words will be discarded.
As described above, an event word is generated, for example, when a failure occurs in a channel device or an input/output device, or when a system reset by the I/O interface is completed. For a minor failure for which hardware recovery has been successfully carried out, the same kind of event word may recur, and no serious problem will be caused even if such an event word is discarded and no report is made. On the other hand, in the case of a major failure for which hardware recovery has failed, or an event such as the completion of a system reset by the I/O interface, the same kind of event word is less likely to recur; therefore, if such an event cannot be reported, the problem arises that software recovery processing by the main CPU cannot be performed.
However, in the prior art, the system is such that, when the buffer becomes full of event words, any subsequent event word is discarded regardless of the kind of event word. The resulting problem is that when minor failures occur repeatedly and the buffer becomes full, any major failure occurring after that cannot be reported.
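The prior-art behavior can be illustrated with a short sketch. The capacity of four entries, the `("minor", n)` / `("major", n)` tuples used as event words, and the function name `enqueue_prior_art` are all assumptions chosen for the example.

```python
from collections import deque

CAPACITY = 4  # hypothetical buffer size, chosen only for illustration


def enqueue_prior_art(buffer, event_word, capacity=CAPACITY):
    """Append the event word unless the buffer is full.

    A full buffer discards the new event word regardless of its kind.
    Returns True if the event word was queued, False if it was discarded.
    """
    if len(buffer) >= capacity:
        return False
    buffer.append(event_word)
    return True


# Repeated minor failures fill the buffer...
buffer = deque()
results = [enqueue_prior_art(buffer, ("minor", n)) for n in range(5)]

# ...so a major failure occurring afterwards is discarded and never reported.
major_accepted = enqueue_prior_art(buffer, ("major", 0))
```

Here the fifth minor event word and the subsequent major event word are both discarded, because the check looks only at the buffer occupancy and never at the kind of event word.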