This invention relates generally to computer systems and more particularly to computer systems having central processing units (CPUs) employing error correction during cache fill operations.
As is known in the art, computer systems generally include at least one central processing unit and a memory interconnected by a system bus. In a typical computer system implementation, instructions and data are stored in the same memory. The processor fetches instructions from the memory and executes operations on data as specified by the fetched instructions. As the speed of processors has increased, a need has arisen to find ways to more suitably match the access time of the main computer memory to the computational speed of the processor.
One known way of accomplishing this is through the use of cache memory. A cache memory comprises a relatively small, yet relatively fast memory device arranged in close physical proximity to a processor. The utilization of cache memories is based upon the principle of locality. It has been found, for example, that when a processor accesses a location in memory, there is a high probability that the processor will continue to access memory locations surrounding the accessed location for at least a certain period of time. Thus, a preselected data block of a large, relatively slow access time memory, such as a main memory module coupled to the processor via a bus, is fetched from the main memory and stored in the relatively fast access cache memory. Accordingly, as long as the processor continues to access data from the cache memory, the overall speed of operation of the processor is maintained at a level significantly higher than would be possible if the processor had to arbitrate for control of the bus and then perform a memory read or write operation, with the main memory module, for each data access. Since cache memory typically has a much faster access time than main memory, a CPU with a cache memory system spends much less time waiting for instructions and operands to be fetched and/or stored. In multi-processor computer systems, each CPU is typically provided with its own cache or cache system.
A cache memory contains a subset of the information stored in main memory and typically resides on the data path between the processing unit and the system bus. The system bus is used by the CPU to communicate with the main memory as well as other processors in a computer system. When a processor attempts to access a main memory location whose contents (data) have been copied to the cache, no access to main memory is required in order to provide the requested data to the CPU. The required data will be supplied from the cache as long as the data contained in the cache is valid. Since access to the cache is faster than access to main memory, the processor can resume operations more quickly.
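The locality and valid-data behavior described above can be illustrated with a minimal sketch of a direct-mapped cache. The class name, the line count, and the block size are hypothetical choices made for illustration only; the text does not specify any particular cache organization.

```python
class DirectMappedCache:
    """Toy direct-mapped cache with one valid bit per line (illustrative only)."""

    def __init__(self, num_lines=8, block_size=4):
        # num_lines and block_size are assumed values, not taken from the text.
        self.num_lines = num_lines
        self.block_size = block_size
        self.valid = [False] * num_lines
        self.tags = [None] * num_lines
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a hit; on a miss, fill the line as if from main memory."""
        block = address // self.block_size
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.valid[index] and self.tags[index] == tag:
            # Valid copy present: no main memory access is needed.
            self.hits += 1
            return True
        # Miss: fetch the whole block and mark the line valid.
        self.valid[index] = True
        self.tags[index] = tag
        self.misses += 1
        return False


def run_sequential(n=64):
    """Access n consecutive addresses and return (hits, misses)."""
    cache = DirectMappedCache()
    for address in range(n):
        cache.access(address)
    return cache.hits, cache.misses
```

With these assumed parameters, a sequential walk misses only on the first access to each block and hits on the remaining accesses to that block, which is the effect the principle of locality relies on.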
The high performance capabilities achieved in a RISC computer are further enhanced when a plurality of such RISC computers are arranged in a multiprocessor system utilizing cache memories. A multiprocessor system can comprise, e.g., a plurality of RISC computers, an I/O device and a main memory module or modules, all coupled to one another by a high performance bus. The RISC computers can be utilized to perform co-operative or parallel processing as well as multi-tasking among them for execution of several applications running simultaneously, to thereby achieve dramatically improved processing power. The capabilities of the system can be further enhanced by providing a cache memory at each one of the RISC computers in the system.
While the above described cached, multi-processor RISC computer system represents a state-of-the-art model for a high performance computer system, the art has yet to achieve an optimal level of performance efficiency. For example, bus read transactions may involve errors of various types including, e.g., data parity errors and hard errors. Read transactions involving errors often take longer to complete than error-free read transactions. This frequently creates timing problems that require the bus interface control flow for the error-free transfer of data to differ from the control flow for transactions involving errors.
Further timing problems may arise during read type bus transactions if the bus interface, on the module initiating the read transaction, were to check the validity of the data involved in every data transfer. In such a case, there may not be sufficient time for the bus interface to complete the data transfer in time to be ready for a subsequent bus data transfer.
One known approach used to solve the above timing problems is to slow down the next bus transfer following every processor initiated read type transaction, i.e., to increase the amount of time before the bus interface must respond to the subsequent bus transaction by delaying the subsequent transaction. Such an approach requires that the bus protocol provide a mechanism to slow down the rate of bus transfers and results in a waste of bus bandwidth.
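As a simple illustration of how a data parity error of the kind mentioned above might be detected, the following sketch computes and checks an even parity bit over a data word. The function names and the choice of even parity are assumptions made for illustration; the text does not state which parity scheme the bus uses.

```python
def even_parity_bit(word):
    """Return the bit that makes the total number of 1 bits even."""
    return bin(word).count("1") % 2


def parity_ok(word, parity_bit):
    """Return True if the received word is consistent with its parity bit."""
    return even_parity_bit(word) == parity_bit


# Hypothetical transfer: the sender attaches a parity bit to a 4-bit word.
sent_word = 0b1011
sent_parity = even_parity_bit(sent_word)

# A single flipped bit on the bus is detected by the receiver.
corrupted_word = sent_word ^ 0b0001
```

A single parity bit detects any odd number of flipped bits but cannot locate or correct them, which is why a detected parity error generally forces a different, slower handling path.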
Another known approach has been to use software to control the error handling during read transactions. In such a scheme, the control flow logic of the bus interface need not handle read transactions involving errors any differently than transactions which do not involve errors since the error handling is done by the software and not the control flow logic. However, the use of software for error handling has several disadvantages, including the fact that it is relatively slow in detecting and responding to errors. Thus, when software is used for error handling, a substantial period of time may pass before an error is detected. During that time, in which the error goes undetected, performance of the computer system may become unpredictable as a result of the error's presence in the computer system. In a cached multiprocessor computer system, this may result in the loss of coherency throughout the computer system.
Yet another known approach is to have data which is returned from external memory or the backup cache flow through error checker/corrector logic en route to the execution unit as well as the primary and secondary caches. However, this approach adds CPU cycles to the latency of the external fill procedure.
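To make the notion of error checker/corrector logic concrete, the following is a minimal sketch using a Hamming(7,4) single-error-correcting code. The choice of Hamming(7,4) and the function names are assumptions made for illustration; the text does not identify the code used by the checker/corrector logic.

```python
def encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def check_and_correct(codeword):
    """Return (data_bits, error_position); error_position is 0 if no error was found."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4
    if position:
        # The syndrome names the 1-based position of a single flipped bit.
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position
```

In a flow-through arrangement of the kind described above, every fill would pass through a step like `check_and_correct` before the data is usable, so the cost of the check is paid even when the syndrome is zero and no correction is needed.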