A typical multiprocessing computer system comprises two or more central processing units (CPUs), a memory unit (main memory), and an input/output (I/O) unit. The main memory stores information, i.e., data and instructions for processing the data, in addressable storage locations. The information is transferred between the main memory and the CPUs along a bus which carries, inter alia, memory access request signals specifying the direction of transfer.
The information transferred between the CPUs and main memory must conform to certain timing relationships that exist between the memory access request signals and the information on the bus. If the CPUs operate at a fast clock signal frequency and the response time of the main memory is slow as compared to the frequency of the clock signal, the CPUs must enter wait states until the requests are completed, thereby affecting their processing rates. This is especially true for highly pipelined CPUs, such as those that are used in many reduced instruction set computers (RISCs).
A goal of RISC computer design is architectural simplicity and efficient pipelining to achieve a high-speed execution rate of instruction requests. Main memory is typically not fast enough to execute memory access requests as needed by the RISC CPUs. High-speed cache memories are used in these situations to compensate for the time differential between the memory access time and the CPU clocking frequencies. The access time of a cache is closer to the operational speed of the CPU and thus increases the speed of data processing by providing information to the CPU at a rapid rate.
A cache memory is typically organized into blocks, each of which is capable of storing a predetermined amount of information. Specifically, each block contains the data and instructions needed by the CPU, along with control flags indicating, inter alia, the status of those data and instructions. When a CPU requires information, the cache is initially examined. If the information is not found in the cache, the main memory is accessed. A block mode read request is then issued by the CPU to transfer a block of information including the required information from the main memory to the cache. The information is retrieved from main memory and loaded into a cache block, and the cache block is assigned a cache address. The cache address includes an "index" field and a "tag" field, which correspond to address fields of the location in main memory from which the information was retrieved.
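As a rough illustration of the lookup described above, the following Python sketch models a cache in which a memory address is split into tag, index, and offset fields. The direct-mapped organization, the 16-byte block size, the 8-block capacity, and all names (BLOCK_SIZE, NUM_BLOCKS, split_address, DirectMappedCache) are assumptions made for the example and are not taken from the text.

```python
BLOCK_SIZE = 16   # bytes per cache block (assumed for illustration)
NUM_BLOCKS = 8    # number of blocks in the cache (assumed for illustration)


def split_address(addr):
    """Split a byte address into (tag, index, offset) fields."""
    offset = addr % BLOCK_SIZE
    block_number = addr // BLOCK_SIZE
    index = block_number % NUM_BLOCKS
    tag = block_number // NUM_BLOCKS
    return tag, index, offset


class DirectMappedCache:
    """Hypothetical direct-mapped cache in front of a bytearray 'main memory'."""

    def __init__(self, main_memory):
        self.memory = main_memory
        self.tags = [None] * NUM_BLOCKS    # None marks a block that holds nothing
        self.blocks = [None] * NUM_BLOCKS
        self.misses = 0

    def read(self, addr):
        tag, index, offset = split_address(addr)
        if self.tags[index] != tag:
            # Miss: fetch the whole block containing addr from main memory.
            self.misses += 1
            base = addr - offset
            self.blocks[index] = bytes(self.memory[base:base + BLOCK_SIZE])
            self.tags[index] = tag
        return self.blocks[index][offset]


memory = bytearray(range(256))   # each location holds its own address
cache = DirectMappedCache(memory)
first = cache.read(0x25)         # miss: loads the block covering 0x20..0x2F
second = cache.read(0x2A)        # hit: same block, no memory access
```

In this sketch the first read causes one block transfer and the second read is satisfied from the cache, so only one miss is counted for the two accesses.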
Because a program may run on any CPU in a multiprocessor system, copies of the same information may reside in more than one place in the system. For example, copies of the same information may reside in each CPU's cache and in the main memory unit. If a CPU modifies a copy of that information without updating the remaining copies, those remaining copies of information become stale, thus creating a cache-coherency problem.
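The stale-copy situation can be sketched in a few lines of Python. The model below is deliberately simplified and hypothetical: the class CpuCache, its write_local method, and the address 0x100 are illustrative assumptions, and no coherency protocol is applied, which is exactly what allows the copies to diverge.

```python
main_memory = {0x100: 7}   # one memory location, address chosen arbitrarily


class CpuCache:
    """Hypothetical per-CPU cache with no coherency mechanism."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def read(self, addr):
        # Load a private copy from main memory on first use.
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write_local(self, addr, value):
        # Modify only this CPU's copy; memory and other caches are not updated.
        self.lines[addr] = value


cache_a = CpuCache(main_memory)
cache_b = CpuCache(main_memory)
cache_a.read(0x100)
cache_b.read(0x100)              # both caches now hold a copy of the value 7
cache_a.write_local(0x100, 99)   # CPU A modifies its copy only

stale_in_b = cache_b.read(0x100) != cache_a.read(0x100)
stale_in_memory = main_memory[0x100] != cache_a.read(0x100)
```

After the local write, both the copy held by the second cache and the copy in main memory still contain the old value, which is the cache-coherency problem in miniature.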
The prior art includes many solutions for the cache-coherency problem. These solutions are typically based upon the "states" of the control flags associated with the information contained in the cache. For example, the control flags may include a hit flag that indicates whether a "cache-hit" occurred during a particular cache access, a valid flag that indicates whether the information contained in a cache block is valid, a dirty flag that indicates whether the valid cache block, while in the cache, has been modified relative to the information stored in the corresponding location in main memory, and a shared flag that indicates whether the valid information contained in the block is also stored in the cache of another CPU.
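One possible way to picture these control flags is sketched below in Python. The names BlockFlags, hit_flag, and is_consistent are hypothetical, and the consistency rule (the dirty and shared flags are meaningful only for a valid block) is an assumption inferred from the wording above rather than a statement of any particular cache-coherency protocol.

```python
from dataclasses import dataclass


@dataclass
class BlockFlags:
    """Per-block control flags (hypothetical layout)."""
    valid: bool = False    # block holds valid information
    dirty: bool = False    # valid block differs from main memory
    shared: bool = False   # valid block is also held by another CPU's cache


def hit_flag(flags, stored_tag, requested_tag):
    """Hit flag for one access: the block must be valid and its tag must match."""
    return flags.valid and stored_tag == requested_tag


def is_consistent(flags):
    """Assumed rule: dirty or shared may be set only when valid is set."""
    return flags.valid or not (flags.dirty or flags.shared)


clean_shared = BlockFlags(valid=True, shared=True)
bogus = BlockFlags(valid=False, dirty=True)   # dirty without valid
```

Under the assumed rule, clean_shared passes the consistency check, while bogus is an example of a flag combination that control logic should never produce.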
Cache control logic is typically used to examine the states of these control flags and perform appropriate operations in accordance with a particular cache-coherency protocol. During the design verification stage of product development, this control logic must be tested to ensure that it is operating properly. A general-purpose CPU is typically programmed to test the control logic by directly examining the control flags and data in the cache under certain conditions and comparing their states with the results obtained by the control logic under the same conditions. However, the architectural simplicity of RISC CPUs generally precludes their direct interrogation of the cache control flags and data.
Therefore, it is among the objects of the invention to provide an arrangement by which CPUs of a multiprocessor RISC system can verify the operation of cache control logic by indirectly examining the states of their caches.
Another object of the invention is to provide a method by which CPUs of a multiprocessor RISC system can efficiently detect invalid cache states in the system.