Embodiments of the present invention relate to a computer system having a multi-node computer architecture. In particular, the present invention relates to a method and apparatus for managing memory-related requests in a multi-node architecture such that requests from a subset of nodes are not starved and every request from a node eventually gets a chance to complete.
Computer systems may contain multiple processors that may work together to perform a task. For example, a computer system may contain four processors that may share system resources (e.g., input devices or memory devices) and may perform parallel processing. The processors may send messages to each other and may send messages to, and receive messages from, the system resources. For example, such messages may include requests for information that is stored at a location in a memory device or requests to store information at a location in a memory device.
In many computer systems, the set of data currently being used by a microprocessor may be copied from a system memory device such as a dynamic random access memory (DRAM) into a relatively smaller but faster cache memory device such as a static random access memory (SRAM). The cache memory device is usually private to each processor such that only one processor can read from or write to it. In such systems, a cache is said to be "coherent" if the information resident in the cache reflects a consistent view of the information in all the private cache memory devices and the DRAM. Cache "snooping" is a technique used to detect the state of a memory location in private cache memory devices on a memory access that might cause a cache coherency problem. In a multi-processor system, the messages sent between processors may include cache snooping messages.
A processor may be said to encounter a "livelock" or "starvation" situation when a coherency event generated by the processor is unable to complete for an indefinite amount of time, even after repeated attempts, because another event from the same processor or from another processor prevents it from making forward progress. If the sending of requests from processors in a multi-processor system and the servicing of those requests at the responding agent are not managed properly, then some of the requests may be starved by the responding agent and a livelock situation may occur. For example, a first processor may be accessing a memory location while a second processor is accessing the same memory location. If the memory agent can satisfy only one request to a memory location at a time and must ask that all other requests to the same location be reissued, then it is possible that the request from the first processor never completes because the same memory location is being accessed again and again by the second processor. In this case, requests from the second processor cause starvation of the request from the first processor, thereby causing a livelock at the first processor.
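The starvation scenario described above may be illustrated by the following minimal Python sketch. It is offered only as an illustration under stated assumptions and is not taken from any particular implementation: the function name `simulate`, the parameter `service_time`, the processor labels `P1` and `P2`, and the fixed arrival order in which the second processor's request always reaches the memory agent first are all hypothetical.

```python
def simulate(cycles, service_time=2):
    """Model one contended memory location served by a single memory agent.

    Assumptions (hypothetical, for illustration only):
      - the agent services one request at a time, taking `service_time` cycles;
      - any request arriving while the agent is busy is rejected and must be
        reissued on the next cycle;
      - P2's request always reaches the agent before P1's in a given cycle.
    Returns (completed, retries), each a dict keyed by processor label.
    """
    completed = {"P1": 0, "P2": 0}
    retries = {"P1": 0, "P2": 0}
    busy_owner = None  # processor whose request is currently being serviced
    remaining = 0      # cycles left before the in-flight request completes

    for _ in range(cycles):
        # Advance the in-flight request and retire it if it has finished.
        if busy_owner is not None:
            remaining -= 1
            if remaining == 0:
                completed[busy_owner] += 1
                busy_owner = None

        # Each processor without a request in flight issues (or reissues) one.
        for proc in ("P2", "P1"):  # assumed fixed arrival order
            if proc == busy_owner:
                continue
            if busy_owner is None:
                busy_owner = proc          # agent accepts this request
                remaining = service_time
            else:
                retries[proc] += 1         # agent asks for a reissue

    return completed, retries


if __name__ == "__main__":
    completed, retries = simulate(cycles=10)
    print("completed:", completed)
    print("retries:  ", retries)
```

Under these assumptions, every request from the first processor is rejected and reissued, so it completes no accesses however long the simulation runs, while the second processor continues to make progress.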