In a system having multiple interconnected nodes, credits are often used to proactively control the flow of transactions. In this approach, if two nodes A and B are connected, A maintains a certain number of credits for B in each flow-control class in which it can send a transaction to B. Each transaction sent from A to B enters a first-in, first-out (FIFO) structure in B belonging to the flow-control class for that transaction. Thus, A tracks the amount of available FIFO space in B for each flow-control class in which it can send a transaction to B. A flow-control class may consist of transactions of a particular type. For example, a class may consist of “read” requests in which the requesting node sends a memory address to the receiving node and the receiving node retrieves the data from memory and sends the data as a response to the requesting node.
When A sends a transaction to B in a particular flow-control class (e.g., class F), a credit system decrements the number of available credits of class F for B by an amount reflecting the FIFO storage that the transaction will consume in B on behalf of node A. When node B consumes the transaction, it releases those credits back to A. In the “read” example above, the requesting node may have 100 credits initially allotted for the “read” class, and a single read request might require 2 credits. Under existing credit systems, the available balance of “read” class credits decreases by two each time a request is made. The destination node releases the credits back to the sender as it removes the transaction from its queue. The released credits for each flow-control class travel back to the sender A from the receiver B either through a special transaction or as part of normal transactions that B sends to A. It should be noted that different transactions in the same flow-control class may need different numbers of credits. The sender sends a transaction only if it has enough credits for that transaction.
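The sender-side accounting described above can be illustrated with a minimal sketch. The class name `CreditSender` and the methods `try_send` and `release` are hypothetical names chosen for illustration only; the figures (100 initial “read” credits, 2 credits per request) are taken from the example in the text.

```python
class CreditSender:
    """Sender-side credit accounting for one destination node (hypothetical sketch)."""

    def __init__(self, initial_credits):
        # One independent credit balance per flow-control class.
        self.available = dict(initial_credits)

    def try_send(self, flow_class, cost):
        """Send only if the class has enough credits; return True if sent."""
        if self.available[flow_class] < cost:
            return False  # not enough credits: the transaction must wait
        self.available[flow_class] -= cost
        return True

    def release(self, flow_class, amount):
        """Credits returned by the receiver once it dequeues a transaction."""
        self.available[flow_class] += amount


# Node A starts with 100 "read" credits for node B; each read costs 2 credits.
sender_a = CreditSender({"read": 100})
sent = sum(sender_a.try_send("read", 2) for _ in range(60))
balance_after_burst = sender_a.available["read"]

# Node B dequeues one read and releases its 2 credits, so A can send again.
sender_a.release("read", 2)
sent_after_release = sender_a.try_send("read", 2)
```

In this sketch only 50 of the 60 attempted reads are sent before the balance reaches zero, and a further read becomes possible only after B releases credits. Note that the balance reflects request-queue space in B alone; nothing in it accounts for the space that the eventual responses will occupy.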
Node A may also have to keep track of credits for other nodes to which it is directly connected. For example, in a multi-processor system, the nodes may be different interconnect chipsets such as the memory controller, the I/O controller, the cross-bar, etc. The various flow-control classes may be used for transactions such as requests to obtain a cache line, data returns from memory, coherency responses from the various caches, etc. Some systems may not employ a proactive credit-based flow-control mechanism, but instead may adopt a reactive mechanism in which the destination node asserts a separate signal requesting the sender to stop transmitting new transactions. The proposed scheme should also work with this signal-based form of flow control.
While the credit-based scheme and the reactive flow-control scheme each guarantee that a transaction is not dropped for lack of FIFO space in the destination node, neither guarantees that access requests from one node will not slow down traffic in the entire system. Existing credit systems only consider the amount of space required by a transaction in the destination node and do not consider the amount of space required if the transaction needs a response. For example, a system may include a cross-bar chip connected to a cache-coherent host I/O bridge. The cross-bar chip may also be connected to a memory controller, some CPU chips, other host I/O bridges, and, in a bigger system, other cross-bars. The host I/O bridge may serve one or more I/O buses, each of which may contain one or more I/O devices. If many I/O devices are actively performing direct memory access (DMA) and the cache is pre-fetching deeply for each DMA read request, many cache line read requests may go to the system. A DMA read request enters the read request queue in the cross-bar chip. The read request is then forwarded to the system memory controller, to another system memory controller, or to other caches, whichever holds the data. The entity that holds the data subsequently provides it to the cross-bar chip. This data enters the data return queue in the cross-bar and is eventually returned to the host I/O bridge that requested it, as a data return transaction in a different flow-control class.
Typically, a data return requires many more cycles than a read request. If requests from the host bridge arrive at a fast pace, the data returning to the host bridge will be queued in the cross-bar chip. This may create a situation in which the data return queue in the cross-bar chip is full due to multiple pending data returns to the host bridge. When this happens, traffic to other parts of the system also stalls because the data return queue in the cross-bar is a shared resource. The problem arises because an entity such as the host I/O bridge can make requests faster than it can absorb the resulting data returns. Even though the host bridge may possess the credits to make the read requests, it does not have enough bandwidth to process the data returns at the rate at which it can make read requests. This causes a backlog of data return traffic in the system and slows down the entire system. Thus, a greedy agent can lower the entire system's performance. Existing strategies do not provide any safeguards against this kind of performance bottleneck. What is needed is a system that prevents a single requesting agent from slowing the system. In particular, what is needed is a system that considers the effects of pending transactions on the system when it decides whether or not to initiate a new transaction.
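The backlog described above can be made concrete with a toy cycle-by-cycle model. Everything in the sketch is an assumption made for illustration: the function name `simulate`, the queue capacity of 16 entries, the per-cycle rates, and the simplification that each read request produces one data return entry in the same cycle. It is not a model of any particular chipset.

```python
def simulate(cycles, capacity, bridge_requests_per_cycle, bridge_drain_per_cycle):
    """Toy model of a shared data return queue in a cross-bar.

    Returns (first cycle at which the queue is full, number of cycles in
    which a data return for another agent found the queue full).
    """
    queued_for_bridge = 0
    first_full_cycle = None
    stalled_other_returns = 0
    for cycle in range(cycles):
        # Data for the bridge's read requests arrives in the shared queue,
        # limited by the free space that remains.
        free = capacity - queued_for_bridge
        queued_for_bridge += min(bridge_requests_per_cycle, free)
        # Assumption: one data return for another agent (e.g. a CPU) needs
        # a slot every cycle and stalls if the queue is full.
        if queued_for_bridge >= capacity:
            stalled_other_returns += 1
            if first_full_cycle is None:
                first_full_cycle = cycle
        # The bridge drains only as many entries as its bandwidth allows.
        queued_for_bridge -= min(bridge_drain_per_cycle, queued_for_bridge)
    return first_full_cycle, stalled_other_returns


# Greedy bridge: 4 requests per cycle, but it can absorb only 1 return per cycle.
greedy = simulate(cycles=20, capacity=16,
                  bridge_requests_per_cycle=4, bridge_drain_per_cycle=1)

# Balanced bridge: the request rate matches the rate at which returns drain.
balanced = simulate(cycles=20, capacity=16,
                    bridge_requests_per_cycle=1, bridge_drain_per_cycle=1)
```

Under these assumed rates the greedy case fills the queue at cycle 4 and stalls the other agent's returns in each of the remaining 16 cycles, while the balanced case never fills the queue. The request-side credits play no part in the outcome, which is the shortcoming the text identifies.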