1. Technical Field
This invention relates generally to transactions, such as memory requests and their responses, and more particularly to the eviction of memory lines from a cache so that the memory lines to which the transactions relate can be inserted into the cache.
2. Description of the Prior Art
There are many different types of multi-processor computer systems. A symmetric multi-processor (SMP) system includes a number of processors that share a common memory. SMP systems provide scalability. As needs dictate, additional processors can be added. SMP systems usually range from two to 32 or more processors. One processor generally boots the system and loads the SMP operating system, which brings the other processors online. Without partitioning, there is only one instance of the operating system and one instance of the application in memory. The operating system uses the processors as a pool of processing resources, all executing simultaneously, where each processor either processes data or is in an idle loop waiting to perform a task. SMP systems increase in speed whenever processes can be overlapped.
A massively parallel processor (MPP) system can use thousands or more processors. MPP systems use a different programming paradigm than the more common SMP systems. In an MPP system, each processor contains its own memory and copy of the operating system and application. Each subsystem communicates with the others through a high-speed interconnect. To use an MPP system effectively, an information-processing problem should be breakable into pieces that can be solved simultaneously. For example, in scientific environments, certain simulations and mathematical problems can be split apart and each part processed at the same time.
A non-uniform memory access (NUMA) system is a multi-processing system in which memory is separated into distinct banks. NUMA systems are similar to SMP systems. In SMP systems, however, all processors access a common memory at the same speed. By comparison, in a NUMA system, memory on the same processor board, or in the same building block, as the processor is accessed faster than memory on other processor boards, or in other building blocks. That is, local memory is accessed faster than distant shared memory. NUMA systems generally scale better to higher numbers of processors than SMP systems.
Multi-processor systems usually include one or more coherency controllers to manage memory transactions from the various processors and input/output (I/O). The coherency controllers negotiate multiple read and write requests emanating from the processors or I/O, and also negotiate the responses back to these processors or I/O. Usually, a coherency controller includes a pipeline, in which transactions, such as requests and responses, are input, and actions that can be performed relative to the memory for which the controller is responsible are output. Transaction conversion is commonly performed in a single stage of a pipeline, such that transaction conversion to performable actions is performed in one step.
For the actions to actually be performed, the memory line to which the transaction relates preferably is retrieved from either local or remote memory and placed in a cache. If the cache is already full, then another memory line must usually first be evicted. This can delay the ultimate processing of the transaction, as the eviction process can take some time. That is, the entire process is serialized, where first a given memory line is evicted, then the memory line to which a transaction relates is cached, and finally the transaction is processed. The eviction process thus usually must be completed in its entirety before the new memory line is retrieved from memory for storage in the cache, which can delay transaction processing even further.
For these and other reasons, therefore, there is a need for the present invention.