1. Field of the Invention
The present invention generally relates to computer systems having multiprocessor architectures and, more particularly, to a novel multiprocessor computer system for processing memory access requests and to a novel arbitration scheme that enables the fair sharing of limited resources in such systems.
2. Description of the Prior Art
To achieve high-performance computing, multiple individual processors have been interconnected to form multiprocessor computer systems capable of parallel processing. Multiple processors can be placed on a single chip, or several chips, each containing one or several processors, can be interconnected into a multiprocessor computer system.
Processors in a multiprocessor computer system use private cache memories because of their short access time (a cache is local to a processor and provides fast access to data) and to reduce the number of memory requests to the main memory. However, managing caches in a multiprocessor system is complex. Multiple private caches introduce the multi-cache coherency problem (or stale data problem) due to multiple copies of main memory data that can concurrently exist in the multiprocessor system.
Small-scale shared-memory multiprocessing systems have processors (or groups thereof) interconnected by a single bus. However, as processor speed increases, the number of processors that can effectively share the bus decreases.
The protocols that maintain coherence between the caches of multiple processors are called cache coherence protocols. Cache coherence protocols track any sharing of data blocks between the processors. Depending upon how data sharing is tracked, cache coherence protocols can be grouped into two classes: directory-based and snooping.
In a multiprocessor system with coherent cache memory, consistency is maintained by a coherence protocol that generally relies on coherence events sent between caches. A common hardware coherence protocol is based on invalidations. In this protocol, any number of caches can include a read-only line, but these copies must be destroyed when any processor stores to the line. To do this, the cache corresponding to the storing processor sends invalidations to all the other caches before storing the new data into the line. If the caches are write-through, then the store also goes to main memory where all caches can see the new data. Otherwise, a more complicated protocol is required when some other cache reads the line with the new data.
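By way of illustration only, the invalidation-based protocol described above may be sketched for the write-through case as follows. The class name `WriteThroughSystem` and its methods are hypothetical and do not correspond to any particular hardware; each cache is modeled as a simple dictionary, and a missing line reads as zero by assumption.

```python
class WriteThroughSystem:
    """Toy model (hypothetical): private write-through caches over one main memory."""

    def __init__(self, num_caches):
        self.memory = {}                                    # address -> value
        self.caches = [dict() for _ in range(num_caches)]   # one dict per processor

    def load(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:
            # Miss: fetch a read-only copy of the line from main memory.
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]

    def store(self, cpu, addr, value):
        # Send invalidations to all other caches before storing the new data.
        for other, cache in enumerate(self.caches):
            if other != cpu:
                cache.pop(addr, None)
        self.caches[cpu][addr] = value   # update the storing processor's line
        self.memory[addr] = value        # write-through: main memory sees the store


system = WriteThroughSystem(num_caches=3)
system.memory[0x40] = 7
system.load(0, 0x40)          # processors 0 and 1 both hold a read-only copy
system.load(1, 0x40)
system.store(0, 0x40, 99)     # processor 1's copy is destroyed by the invalidation
print(system.load(1, 0x40))   # processor 1 misses and re-fetches the new data
```

Because the store is written through, a later miss in any other cache is satisfied from main memory; a write-back design would need the more complicated protocol mentioned above.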
In a cache-coherent multiprocessor system, there may be bursts of activity that cause coherence actions, such as invalidations, to arrive at a cache faster than the cache can process them. In this case, they are generally stored in first-in, first-out (FIFO) queues, which absorb the burst of activity. As is known, FIFO queues are a very common structure in computer systems. They are used to store information that must wait, commonly because the destination of the information is busy. For example, requests to utilize a shared resource often wait in FIFO queues until the resource becomes available. Another example is packet-switched networks, where packets often wait in FIFO queues until a link they need becomes available.
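The burst-absorbing behavior of a FIFO queue may be illustrated with the following minimal sketch. The function `simulate_fifo`, its parameters, and the assumed service rate of one invalidation per cycle are hypothetical choices made for the example.

```python
from collections import deque


def simulate_fifo(arrivals_per_cycle, service_rate=1):
    """Return (peak queue depth, processing order) for a stream of invalidations.

    arrivals_per_cycle[c] is the number of invalidations arriving in cycle c.
    The cache is assumed to process at most service_rate invalidations per cycle.
    """
    queue = deque()
    processed = []
    peak = 0
    next_id = 0
    cycle = 0
    while cycle < len(arrivals_per_cycle) or queue:
        arrivals = arrivals_per_cycle[cycle] if cycle < len(arrivals_per_cycle) else 0
        for _ in range(arrivals):
            queue.append(next_id)          # new invalidations join the tail
            next_id += 1
        peak = max(peak, len(queue))
        for _ in range(min(service_rate, len(queue))):
            processed.append(queue.popleft())   # oldest invalidation leaves first
        cycle += 1
    return peak, processed


peak, order = simulate_fifo([4, 0, 0, 1, 0])
print(peak, order)
```

In this run, four invalidations arrive in a single cycle although only one can be processed per cycle; the queue holds the excess and releases the invalidations in arrival order.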
It is known in the art that FIFO queues enable fair sharing of limited resources in a computer system. An arbiter, a common feature of computer systems, acts as a gatekeeper, arbitrating among multiple requesters to determine which is granted access to a desired resource and in what order. In general, this is necessary when a resource cannot be shared but multiple requesters desire to use it simultaneously.
Arbitration can be as simple as granting each requester access in order, or it can be based on some priority criteria applied to the requesters. For example, most processors can only service a single hardware interrupt at a time, so a complex arbiter is used to determine the order in which multiple interrupt requests are presented to the processor. Interrupt requests are commonly assigned priority based on how urgently they need to be serviced. For example, dynamic memory refresh is far more important than the completion of a hardware DMA operation.
Arbiters that take priority into account must also deal with the issue of starvation. This occurs when a low-priority request is constantly passed over in favor of higher-priority requests, causing it to be denied service (or “starve”). In some cases this is acceptable, while in other cases the arbiter must ensure that low-priority requests eventually get serviced. A common technique for accomplishing this is to increase the priority of requests the longer they wait for service so that they eventually become high-priority requests.
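The two arbitration styles described above, granting in order and granting by priority, may be sketched as follows. The function names and the convention that a larger number means a higher priority are assumptions made for illustration only.

```python
def round_robin_grant(requests, last_granted):
    """Grant the first active requester after last_granted, wrapping around.

    requests is a list of booleans, one per requester. Returns an index or None.
    """
    n = len(requests)
    for offset in range(1, n + 1):
        idx = (last_granted + offset) % n
        if requests[idx]:
            return idx
    return None


def fixed_priority_grant(requests, priorities):
    """Grant the active requester with the highest priority value.

    Ties are broken in favor of the lower index (an arbitrary assumption).
    """
    active = [i for i, asserted in enumerate(requests) if asserted]
    if not active:
        return None
    return max(active, key=lambda i: (priorities[i], -i))


requests = [True, False, True, True]
print(round_robin_grant(requests, last_granted=0))
print(fixed_priority_grant(requests, priorities=[1, 9, 5, 5]))
```

Requester 1 holds the highest priority in the example but is not requesting, so the priority arbiter chooses among the remaining active requesters.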
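The aging technique may be illustrated by the following sketch, in which a request's effective priority is its base priority plus the number of cycles it has waited. The function `arbitrate_with_aging`, the `age_step` parameter, and the tie-breaking rule are hypothetical.

```python
def arbitrate_with_aging(base_priority, request_stream, age_step=1):
    """Return the requester granted in each cycle (or None if no one requests).

    base_priority: static priority per requester (larger means more urgent).
    request_stream: one list of booleans per cycle, one entry per requester.
    A requester that loses arbitration gains age_step of priority per cycle.
    """
    n = len(base_priority)
    age = [0] * n
    grants = []
    for requests in request_stream:
        active = [i for i in range(n) if requests[i]]
        winner = None
        if active:
            # Effective priority = base + age; lower index wins ties (assumption).
            winner = max(active, key=lambda i: (base_priority[i] + age[i], -i))
        grants.append(winner)
        for i in range(n):
            if requests[i] and i != winner:
                age[i] += age_step     # still waiting: raise its priority
            else:
                age[i] = 0             # serviced or idle: aging restarts
    return grants


both_request = [[True, True]] * 8
print(arbitrate_with_aging([5, 1], both_request))              # with aging
print(arbitrate_with_aging([5, 1], both_request, age_step=0))  # aging disabled
```

With aging disabled, the low-priority requester is never granted while the high-priority requester keeps asserting its request; with aging enabled, its effective priority eventually exceeds that of the high-priority requester and it is serviced.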
Most arbiters grant a single request each arbitration cycle. In some cases, it would be desirable to grant more than one. For example, the resource being shared may have the capacity to handle more than one request per arbitration cycle, either because of resource availability or because the resource operates faster than the arbiter. Thus, what is needed is an arbiter implementing an arbitration methodology that can grant multiple requests in a single arbitration cycle fairly and efficiently.
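To make the notion of granting multiple requests per arbitration cycle concrete, the following sketch extends simple in-order arbitration to grant up to a fixed number of requests per cycle. It is offered only as a generic illustration and is not a description of the arbitration scheme of the present invention; the function `multi_grant` and its parameters are hypothetical.

```python
def multi_grant(requests, start, max_grants):
    """Grant up to max_grants active requesters, scanning in order from start.

    Returns (granted indices, start pointer for the next arbitration cycle).
    The pointer advances past the last granted requester so that requesters
    passed over in this cycle are considered first in the next one.
    """
    n = len(requests)
    if max_grants < 1 or n == 0:
        return [], start
    granted = []
    for offset in range(n):
        idx = (start + offset) % n
        if requests[idx]:
            granted.append(idx)
            if len(granted) == max_grants:
                break
    next_start = (granted[-1] + 1) % n if granted else start
    return granted, next_start


pointer = 0
for _ in range(2):
    granted, pointer = multi_grant([True, True, True, True], pointer, max_grants=2)
    print(granted, pointer)
```

When all four requesters are active and two grants are available per cycle, every requester is serviced once within two arbitration cycles.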
What is further needed is an arbitration mechanism for sharing a snoop, or invalidation, port of a cache between multiple queues of invalidation requests broadcast from remote processors in a coherent multiprocessor system.