As is known, many modern computing systems employ a multi-agent architecture. A typical system is shown in FIG. 1. There, a plurality of agents 110–160 communicate over an external bus 170 according to a predetermined bus protocol. “Agents” may include general-purpose processors 110–140, memory controllers 150, interface chipsets 160, input/output devices and/or other integrated circuits that process data requests (not shown). The bus 170 may permit several external bus transactions to be in progress at once.
An agent (e.g., 110) typically includes a transaction management system that receives requests from other components of the agent and processes external bus transactions to implement the requests. A bus sequencing unit 200 (“BSU”), shown in FIG. 2, is an example of one such transaction management system. The BSU 200 may include an arbiter 210, an internal cache 220, an internal transaction queue 230, an external transaction queue 240, an external bus controller 250 and a prefetch queue 260. The BSU 200 manages transactions on the external bus 170 in response to data requests issued by, for example, an agent core (not shown in FIG. 2).
The arbiter 210 may receive data requests not only from the core but also from a variety of other sources such as the prefetch queue 260. Of the possibly several data requests received simultaneously by the arbiter 210, the arbiter 210 may select and output one of them to the remainder of the BSU 200.
The internal cache 220 may store data in several cache entries. It may possess logic responsive to a data request to determine whether the cache 220 stores a valid copy of requested data. “Data,” as used herein, may refer to instruction data and variable data that may be used by the agent. The internal cache 220 may furnish requested data in response to data requests.
The internal transaction queue 230 also may receive and store data requests issued by the arbiter 210. For read requests, it coordinates with the internal cache 220 to determine whether the requested data “hits” (may be furnished by) the internal cache 220. If instead a data request “misses” the internal cache 220, the internal transaction queue 230 forwards the data request to the external transaction queue 240.
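By way of illustration only, the hit/miss handling described above may be sketched in software as follows. The names `InternalCache`, `lookup` and `handle_read` are hypothetical and do not appear in the figures; the cache is modeled as a simple address-to-data mapping and the external transaction queue as a list, both of which are simplifying assumptions rather than a description of the actual circuitry.

```python
from dataclasses import dataclass, field


@dataclass
class InternalCache:
    """Hypothetical model of an internal cache: address -> valid data."""
    lines: dict = field(default_factory=dict)

    def lookup(self, address):
        """Return (hit, data); hit is False when no valid copy is stored."""
        if address in self.lines:
            return True, self.lines[address]
        return False, None


def handle_read(address, cache, external_queue):
    """Serve a read from the cache on a hit; forward it on a miss."""
    hit, data = cache.lookup(address)
    if hit:
        return data
    # Miss: hand the request to the (modeled) external transaction queue.
    external_queue.append(address)
    return None
```

In this sketch a hit returns the data immediately, while a miss leaves the request pending in the external queue until an external bus transaction would supply the data.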
The external transaction queue 240 may interpret data requests and generate external bus transactions to fulfill them. The external transaction queue 240 may be populated by several queue registers. It manages the agent's transactions as they progress on the external bus 170. For example, when data is available in response to a transaction, the external transaction queue 240 retrieves the data and forwards it to a requester within the agent (for example, the core).
The prefetch queue 260 may identify predetermined patterns in read requests issued by the core (not shown). For example, if the core issues read requests directed to sequentially advancing memory locations (addresses A, A+1, A+2, A+3, . . . ) the prefetch queue 260 may issue a prefetch request to read data from a next address in the sequence (A+4) before the core actually requests the data itself. By anticipating a need for data, the prefetch queue 260 may cause the data to be available in the internal cache 220 when the core issues a request for the data. The data would be furnished to the core from the internal cache 220 rather than from external memory—a much faster operation. Herein, this type of prefetch request is called a “patterned prefetch.”
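A patterned prefetch of the kind just described may be sketched, purely as an example, as follows. The class name `PatternedPrefetcher`, the `run_length` parameter and the choice of four consecutive addresses as the trigger are assumptions made for illustration; the text above does not specify how many sequential reads must be observed or how the pattern is detected.

```python
from collections import deque


class PatternedPrefetcher:
    """Hypothetical detector for sequentially advancing read addresses."""

    def __init__(self, run_length=4):
        # Assumed trigger: run_length consecutive addresses in a row.
        self.run_length = run_length
        self.recent = deque(maxlen=run_length)

    def observe(self, address):
        """Record a core read; return an address to prefetch, or None."""
        self.recent.append(address)
        if len(self.recent) < self.run_length:
            return None
        addrs = list(self.recent)
        sequential = all(b - a == 1 for a, b in zip(addrs, addrs[1:]))
        # Prefetch the next address in the sequence only when a run is seen.
        return address + 1 if sequential else None
```

With reads to addresses A, A+1, A+2 and A+3, the fourth call to `observe` returns A+4, which corresponds to the prefetch request in the example above.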
A BSU 200 may implement a second type of prefetch, herein called a “blind prefetch.” When a core issues a read request to data at an address (say, address B) that will be fulfilled by an external bus transaction, a blind prefetch mechanism may cause a second external bus transaction to retrieve data at a second memory address (B+1). A blind prefetch may cause every read request from a core that cannot be fulfilled internally to spawn a pair of external bus transactions. Blind prefetches may improve processor performance by retrieving twice as many cache lines (or cache sectors) as are necessary to satisfy the core read request. Again, if the core eventually requires the data prefetched from the other address (B+1), the data may be available in the internal cache 220 when the core issues a read request for it. A blind prefetch request also may be generated from a patterned prefetch request. Using the example above, a patterned prefetch request to address A+4 may be augmented by a blind prefetch to address A+5.
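The pairing of external bus transactions under a blind prefetch may be illustrated by the following minimal sketch. The function name `external_reads` and the `blind_prefetch` flag are hypothetical, and addresses are modeled as integers that advance by one per cache line (or sector), which is an assumption of the example.

```python
def external_reads(miss_address, blind_prefetch=True):
    """Return the addresses read over the external bus for one missed read."""
    transactions = [miss_address]
    if blind_prefetch:
        # Blind prefetch: also fetch the adjacent address, unconditionally.
        transactions.append(miss_address + 1)
    return transactions
```

Calling `external_reads(B)` yields the pair B and B+1. Applied to a patterned prefetch request to address A+4, the same function yields A+4 and A+5.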
Returning to FIG. 1, it is well known that, particularly in multiprocessor computer systems, the external bus 170 can limit system performance. The external bus 170 often operates at clock speeds that are much slower than the internal clock speeds of the agents. A core often may issue several requests for data in the time that the external bus 170 can complete a single external bus transaction. Thus, a single agent can consume much of the bandwidth of an external bus 170. When a plurality of agents must share the external bus 170, each agent is allocated only a fraction of the bandwidth available on the bus 170. In multiple-agent systems, agents very often must sit idle while the external bus retrieves data that they need to make forward progress.
An external transaction queue 240 (FIG. 2) may include control logic that prioritizes pending requests for posting to the external bus. Generally, core reads should be prioritized over prefetch reads and prefetch reads should be prioritized over writes. Core read requests identify data for which the core has an immediate need. Prefetch read requests identify data that the core is likely to need at some point in the future. Write requests identify data that the agent is returning to system storage. Accordingly, the external transaction queue 240 may include control logic that posts requests on the external bus according to this priority.
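The fixed ordering described above (core reads, then prefetch reads, then writes) may be sketched as follows. The enumeration `Kind`, the function `next_to_post` and the representation of a pending request as a `(kind, address)` pair held in arrival order are all assumptions made for the example. Breaking ties in favor of the oldest request is likewise an assumption, as the text does not address ordering within a request type.

```python
from enum import IntEnum


class Kind(IntEnum):
    """Hypothetical request types; a lower value means higher priority."""
    CORE_READ = 0
    PREFETCH_READ = 1
    WRITE = 2


def next_to_post(pending):
    """Pick the request to post next from a list kept in arrival order.

    min() returns the first minimal element, so among requests of the
    same kind the oldest one is chosen.
    """
    if not pending:
        return None
    return min(pending, key=lambda request: request[0])
```

For instance, given a pending write, a pending prefetch read and two pending core reads, the sketch selects the older of the two core reads.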
The predetermined priority scheme has its disadvantages. A request typically is stored in the transaction queue 240 until it is completed on the external bus. During periods of high congestion, when the transaction queue 240 is entirely or nearly full, prefetch and write requests may prevent new core requests from being stored in the queue 240. These lower-priority requests would remain stored in the queue until an external bus transaction for each request completes. Thus, the lower-priority requests may prevent higher-priority requests from being implemented. This would limit system performance.
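The congestion problem may be illustrated with a small model of a fixed-capacity queue. The class `ExternalTransactionQueue`, its methods and the capacity of two entries used below are hypothetical; the sketch only shows how entries that are held until completion can block a newly arriving core read, and is not a description of the actual queue registers.

```python
class ExternalTransactionQueue:
    """Hypothetical fixed-capacity queue; entries persist until completed."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []

    def try_enqueue(self, kind, address):
        """Store a request if a slot is free; report whether it was stored."""
        if len(self.entries) >= self.capacity:
            return False  # queue full: the new request cannot be stored
        self.entries.append((kind, address))
        return True

    def complete(self, kind, address):
        """Free the slot once the external bus transaction has completed."""
        self.entries.remove((kind, address))
```

If a two-entry queue already holds one prefetch and one write, `try_enqueue("core_read", ...)` returns `False` until one of the lower-priority transactions completes, even though the core read would be posted first were it admitted.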
Accordingly, there is a need in the art for a congestion management system for an external transaction queue in an agent. There is a further need in the art for such a system to provide a dynamic priority system, one that maintains a first priority scheme in the absence of system congestion but implements a second priority scheme when congestion events occur.