Modern computer systems utilize various technologies and architectural features to achieve high performance operation. These technologies and architectural features include reduced instruction set computers, high speed cache memories and multiprocessor systems. Innovative arrangements of high performance components embodying one or more of the above can often result in significant improvements in the capabilities and processing power of a computer system.
A reduced instruction set computer (RISC technology) represents a "back to basics" approach to semiconductor chip design. An instruction set comprises a set of basic commands for fundamental computer operations, such as the addition of two data values to obtain a result. The instructions of an instruction set are typically embedded or hard wired into the circuitry of the chip embodying the central processing unit of the computer, and the various statements and commands of an application program running on the computer are each decoded into a relevant instruction or set of instructions of the instruction set for execution.
LOAD, ADD and STORE are examples of basic instructions that can be included in a computer's instruction set. Such instructions may be used to control, for example, the movement of data from memory to general purpose registers, addition of the data in the registers by the arithmetic and logic unit of the central processing unit, and return of the result to the memory for storing. In recent years, with significant advances in the miniaturization of silicon chips, chip designers began to etch more and more circuits into the chip circuitry so that instruction sets grew to include hundreds of instructions capable of executing, via hard wired circuitry, sophisticated and complex mathematical and logical operations.
A problem with the proliferation of instructions included in an instruction set is that the increasing complexity of the circuitry required to implement a large number of instructions resulted in a slow down in the processing speed of the computer. Moreover, it was determined that a relatively small percentage of the instructions of the instruction set were performing a large percentage of the processing tasks of the computer. Thus, many of the instructions have become "expensive" options, whose relatively infrequent use does not make up for the slow down caused by large instruction sets.
The objective of a RISC design is to identify the most frequently used instructions of the instruction set and delete the remaining instructions from the set. A chip can then be implemented with a reduced, but optimal number of instructions to simplify the circuitry of the chip for increased speed of execution for each instruction. While a complex operation previously performed by a single instruction may now have to be executed via several more basic instructions, each of those basic instructions can be executed at a higher speed than was possible before reduction of the instruction set. More significantly, when the instructions retained in the instruction set are carefully selected from among those instructions performing the bulk of the processing within the computer, the RISC system will achieve a significant increase in its overall speed of operation since that entire bulk of processing will be performed at increased speed.
By way of example, in some "large" instruction set systems, twenty percent of the instructions were performing eighty percent of the processing work. Thus a RISC system comprising the twenty percent of the instructions would achieve significantly higher speeds of operation during the performance of eighty percent of the workload.
The high performance capabilities achieved in a RISC computer are further enhanced when a plurality of such RISC computers is arranged in a multiprocessor system utilizing cache memories. A multiprocessor system can comprise, e.g., a plurality of RISC computers, an I/O device and a main memory module or modules, all coupled to one another by a high performance backplane bus. The RISC computers can be utilized to perform co-operative or parallel processing as well as multi-tasking among them for execution of several applications running simultaneously, to thereby achieve dramatically improved processing power. The capabilities of the system can be further enhanced by providing a cache memory at each one of the RISC computers in the system.
A cache memory comprises a relatively small, yet relatively fast memory device arranged in close physical proximity to a processor. The utilization of cache memories is based upon the principle of locality. It has been found, for example, that when a processor accesses a location in memory, there is a high probability that the processor will continue to access memory locations surrounding the accessed location for at least a certain period of time. Thus, a preselected data block of a large, relatively slow access time memory, such as a main memory module coupled to the processor via a bus, is fetched from the main memory and stored in the relatively fast access cache memory. Accordingly, as long as the processor continues to access data from the cache memory, the overall speed of operation of the processor is maintained at a level significantly higher than would be possible if the processor had to arbitrate for control of the bus and then perform a memory read or write operation, with the main memory module, for each data access.
While the above described cached, multi-processor RISC computer system represents a state-of-the-art model for a high performance computer system, the art has yet to achieve an optimal level of performance efficiency.
For example, bus read transactions may involve errors of various types including, e.g., data parity errors and hard errors. Often read transactions involving errors take longer to complete than error free read transactions. This frequently creates timing problems requiring the bus interface control flow for the error free transfer of data to be different from the control flow for transactions involving errors.
Further timing problems may arise during read type bus transactions if the bus interface, on the module initiating the read transaction, was to check the validity of the data involved in every data transfer. In such a case, there may not be sufficient time for the bus interface to complete the data transfer in time to be ready for a subsequent bus data transfer.
Known approaches used to solve the above timing problems, have been to slow down the next bus transfer following every processor initiated read type transaction, e.g. increase the amount of time before the bus interface must respond to the subsequent bus transaction by delaying the subsequent transaction. Such an approach requires that the bus protocol provide a mechanism to slow down the rate of bus transfers and results in a waste of bus bandwidth.
Another known approach has been to use software to control the error handling during read transactions. In such a scheme, the control flow logic of the bus interface need not handle read transactions involving errors any differently than transactions which do not involve errors since the error handling is done by the software and not the control flow logic.
However, the use of software for error handling has several disadvantages, including the fact that it is relatively slow in detecting and responding to errors. Thus, when software is used for error handling, a substantial period of time may pass before an error is detected. During that time, in which the error goes undetected, performance of the computer system may become unpredictable as a result of the errors presence in the computer system. In a cached multiprocessor computer system, this may result in the loss of coherency throughout the computer system.