1. Field of Use
This invention pertains to data processing systems and, more particularly, to apparatus for transmitting and receiving requests over a common bus.
2. Prior Art
There are a variety of methods and apparatuses for interconnecting the different unit controllers of a data processing system for transmitting and receiving requests over a common bus. The transfer of requests proceeds over either synchronously or asynchronously generated bus transfer cycles of operation. U.S. Pat. Nos. 3,676,860 and 3,866,181 are illustrative of such systems.
In some systems, it has been the practice to include integrity bits in the data portion of a request. These bits are used to verify the correctness of the data following acceptance of the request by a receiving unit.
U.S. Pat. Nos. 3,993,981 and 4,371,928, assigned to the same assignee as named herein, are illustrative of an asynchronous bus system. These systems have units which are coupled in a priority network which is distributed along the system bus. Each unit has response apparatus which provides up to three different types of signal responses to a request for a transfer of information from another unit. Also, each unit, except memory, has comparator circuits for ensuring the integrity of the information being transferred over the bus. The master unit compares the channel number portion of each request sent by it to a slave unit during a previous bus cycle with the address channel number received back from the slave unit during a subsequent cycle of operation.
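The comparator check described above can be sketched roughly as follows. This is only an illustrative software model of what the cited systems do in hardware; the names `Request`, `Response`, `channel_number` and `response_matches`, and the field layout, are assumptions made for the example and do not come from the cited patents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Request sent by a master during one bus cycle (hypothetical layout)."""
    channel_number: int  # channel number portion of the request
    address: int         # remaining address information (assumed field)


@dataclass(frozen=True)
class Response:
    """Information returned by a slave during a later bus cycle (hypothetical layout)."""
    channel_number: int  # address channel number received back from the slave
    data: int            # information being transferred (assumed field)


def response_matches(request: Request, response: Response) -> bool:
    """Model of the master's comparator: the channel number sent with the
    earlier request must equal the channel number received back."""
    return response.channel_number == request.channel_number


# The master remembers its outstanding request and checks the later cycle.
pending = Request(channel_number=0x12, address=0x1000)
good = Response(channel_number=0x12, data=0xBEEF)
bad = Response(channel_number=0x13, data=0xBEEF)
```

In this model the check can only be made once the slave has answered, that is, during the subsequent cycle rather than at the time the request is sent.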
This arrangement only provides a subsequent check for ensuring that information was transferred to the unit originating the request. It only indirectly verifies that a request was received by the correct unit. Further, the arrangement contemplates an operating environment in which the units attached to the system bus are not assigned similar channel number addresses and normally only a single memory request is being processed at any given interval of time. However, with the introduction of more efficient techniques of using memory, resulting in simultaneous processing of requests, and an increase in the number of units (e.g., memory controllers, I/O controllers and central processing units) attachable to the system bus, the chance for undetected errors has increased substantially.
The systems disclosed in U.S. Pat. Nos. 3,993,981 and 4,371,928 have provided some additional integrity in addressing a memory controller and its different memory boards (i.e., modules). When the memory controller detects that it has received its address with correct parity, together with an indication that the module board being addressed has been installed in the system, the controller generates one of three specified responses. If any one of these conditions is not met, the controller does not respond. After a certain period of time, this gives rise to a time-out condition within the system, causing the central processing unit to detect an interrupt or trap. Again, the integrity of the system is only ensured to the point of correctly addressing the memory controller and preventing the acceptance of a memory request.
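By way of illustration only, the acceptance conditions just described might be modeled as below. The sketch assumes odd parity over the address, and the function names, parameters and the strings `"response"` and `"time-out"` are hypothetical; the cited patents are not the source of any of these details, and the three specific response types are deliberately not modeled.

```python
from typing import AbstractSet


def odd_parity_ok(address: int, parity_bit: int) -> bool:
    """Assumed parity scheme: address bits plus parity bit hold an odd number of ones."""
    return (bin(address).count("1") + parity_bit) % 2 == 1


def controller_responds(address: int, parity_bit: int, own_address: int,
                        module: int, installed_modules: AbstractSet[int]) -> bool:
    """The controller answers only if every condition holds: it sees its own
    address, the parity is correct, and the addressed module is installed."""
    return (address == own_address
            and odd_parity_ok(address, parity_bit)
            and module in installed_modules)


def bus_cycle_outcome(responded: bool) -> str:
    """With no response, the requester eventually sees a time-out, which the
    central processing unit detects as an interrupt or trap."""
    return "response" if responded else "time-out"


installed = {0, 1}
ok = controller_responds(0b0110, 1, 0b0110, module=1, installed_modules=installed)
bad_parity = controller_responds(0b0110, 0, 0b0110, module=1, installed_modules=installed)
missing = controller_responds(0b0110, 1, 0b0110, module=3, installed_modules=installed)
```

Note that a failed condition in this model produces no explicit error indication at all; the only visible symptom is the later time-out.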
This still leaves open the possibility of having good memory data destroyed or incorrect data written into memory. Moreover, by the time the error is detected by the central processing unit, system operation will have progressed to a point where the actual source of the problem cannot be accurately determined. Thus, considerable system processing time has to be expended in processing such error conditions at the operating system software level without any realistic chance for success. The reason for this is that errors caused by the system bus and associated circuits have been observed to manifest themselves as intermittent conditions rather than as solid failures. That is, certain operating conditions often create metastable, oscillatory or partial failure modes of operation within the different bistable devices which form a part of the system bus priority networks and control circuits. Also, a part or component in the process of failing will operate unreliably, thus introducing intermittent errors. Further, unique conditions can arise, such as several units simultaneously requesting system bus access, which cause still another kind of intermittent error condition.
Thus, there is a definite need for a resilient bus arrangement. This is in contrast to trying to increase the reliability of a system bus through the introduction of redundant circuits or special hardware checking facilities.
Additionally, the resilient bus arrangement must be compatible with normal testing procedures. Such testing procedures frequently involve introducing bad data into system units to verify their operation. While it is possible to place each system element in a special test mode, this can require additional hardware and software as well as added complexity. Further, this may not be possible in cases where the system is required to operate with a number of different units including units of older designs. When older design units are made attachable to a resilient bus arrangement, exception conditions can occur which are inconsistent with a given set of rules required for enforcing system integrity. In the case of a memory system, exception conditions would include situations in which the memory system contains bad data.
Since the exception conditions can vary with each unit, the implementation of each interface unit could differ substantially, adding to system complexity. Moreover, this could affect overall system reliability and interfere with the consistent maintenance of system integrity. Accordingly, there is a need for a resilient bus arrangement which is compatible with normal testing procedures and with a number of different units, including units of older designs.
Accordingly, it is a primary object of the present invention to provide a system which is resilient to errors occurring during bus transfers made during both normal and test operations.
It is a further object of the present invention to provide a resilient system which prevents damage to the integrity of a system's data and operation notwithstanding the number of different units it contains.