1. Field of the Invention
The present invention relates to a method and system for prefetching data within a bridge system.
2. Description of the Related Art
The Peripheral Component Interconnect (PCI) bus is a high-performance expansion bus architecture that was designed to replace the traditional ISA (Industry Standard Architecture) bus. A processor bus master communicates with the PCI local bus and devices connected thereto via a PCI Bridge. This bridge provides a low latency path through which the processor may directly access PCI devices mapped anywhere in the memory or I/O address space. The bridge may optionally include such functions as data buffering/posting and PCI central functions such as arbitration. The architecture and operation of the PCI local bus is described in "PCI Local Bus Specification," Revisions 2.0 (April, 1993) and Revision 2.1s, published by the PCI Special Interest Group, 5200 Elam Young Parkway, Hillsboro, Ore., which publication is incorporated herein by reference in its entirety.
A PCI to PCI bridge provides a connection path between two independent PCI local busses. The primary function of the bridge is to allow transactions between a master on one PCI bus and a target device on another PCI bus. The PCI Special Interest Group has published a specification on the architecture of a PCI to PCI bridge in "PCI to PCI Bridge Architecture Specification," Revision 1.0 (Apr. 10, 1994), which publication is incorporated herein by reference in its entirety. This specification defines the following terms and definitions:
initiating bus: the master of a transaction that crosses a PCI to PCI bridge is said to reside on the initiating bus.
target bus: the target of a transaction that crosses a PCI to PCI bridge is said to reside on the target bus.
primary interface: the PCI interface of the PCI to PCI bridge that is connected to the PCI bus closest to the CPU is referred to as the primary PCI interface.
secondary interface: the PCI interface of the PCI to PCI bridge that is connected to the PCI bus farthest from the CPU is referred to as the secondary PCI interface.
downstream: transactions that are forwarded from the primary interface to the secondary interface of a PCI to PCI bridge are said to be flowing downstream.
upstream: transactions forwarded from the secondary interface to the primary interface of a PCI to PCI bridge are said to be flowing upstream.
The basic transfer mechanism on a PCI bus is a burst. A burst is comprised of an address phase and one or more data phases. When a master or agent initiates a transaction, each potential bridge "snoops" or reads the address of the requested transaction to determine if the address is within the range of addresses handled by the bridge. If the bridge determines that the requested transaction is within the bridge's address range, then the bridge asserts a DEVSEL# on the bus to claim access to the transaction.
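The address decode described above may be sketched as follows. This is a minimal illustration only: the `BridgeWindow` class, the `claiming_bridge` helper, and the example address ranges are hypothetical names and values chosen for the sketch, and the return value of `claims` merely stands in for the assertion of DEVSEL#.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeWindow:
    """Hypothetical address window handled by one bridge (inclusive bounds)."""
    base: int
    limit: int

    def claims(self, address: int) -> bool:
        # True stands in for the bridge asserting DEVSEL# for this address.
        return self.base <= address <= self.limit


def claiming_bridge(windows, address):
    """Return the index of the first bridge whose window holds the address, else None."""
    for index, window in enumerate(windows):
        if window.claims(address):
            return index
    return None


# Example windows are assumptions for illustration, not values from any specification.
windows = [
    BridgeWindow(0x1000_0000, 0x1FFF_FFFF),
    BridgeWindow(0x2000_0000, 0x2FFF_FFFF),
]
```

In this sketch, an address that falls in no window is claimed by no bridge, and `claiming_bridge` returns `None`.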
There are two types of write transactions, posted and non-posted. Posting means that the write transaction is captured by an intermediate agent, such as a PCI bridge, so that the transaction completes at the originating agent before it completes at the intended destination, i.e., before the data is written to the target device. This allows the originating agent to proceed with the next transaction while the requested transaction is working its way to the ultimate destination. Thus, the master initiating a write operation may proceed to another transaction before the written data reaches the target recipient. Non-posted transactions reach their ultimate destination before completing at the originating device. With non-posted transactions, the master cannot proceed with other work until the transaction has completed at the ultimate destination.
All transactions that must complete on the destination bus, i.e., secondary bus, before completing on the primary bus may be completed as delayed transactions. With a delayed transaction, the master generates a transaction on the primary bus, which the bridge decodes. The bridge then ascertains the information needed to complete the request and terminates the request with a retry command back to the master. After receiving the retry, the master reissues the request until it completes. The bridge then completes the delayed read or write request at the target device, receives a delayed completion status from the target device, and returns the delayed completion status to the master to indicate that the request was completed. A PCI to PCI bridge may handle multiple delayed transactions.
With a delayed read request, the read request from the master is posted into a delayed transaction queue in the PCI to PCI bridge. The bridge uses the request to perform a read transaction on the target PCI bus and places the read data in its read data queue. When the master retries the operation, the PCI to PCI bridge satisfies the request for read data with data from its read data queue.
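The delayed read sequence may be illustrated with the following simplified model. The `DelayedReadBridge` class, its method names, and the `RETRY` sentinel are hypothetical, and a dictionary stands in for the target device; the sketch shows only the ordering of retry, target read, and completion, not bus signalling.

```python
from collections import deque

RETRY = "RETRY"  # hypothetical sentinel standing in for a retry termination


class DelayedReadBridge:
    """Simplified model of a bridge completing reads as delayed transactions."""

    def __init__(self, target_memory):
        self.target_memory = target_memory  # dict: address -> data (stand-in target)
        self.delayed_queue = deque()        # delayed read requests awaiting the target bus
        self.read_data = {}                 # read data queue: address -> data

    def master_read(self, address):
        # If the data has already been fetched, complete the retried request.
        if address in self.read_data:
            return self.read_data.pop(address)
        # Otherwise post the request once and terminate with a retry.
        if address not in self.delayed_queue:
            self.delayed_queue.append(address)
        return RETRY

    def service_target_bus(self):
        # Perform one queued read on the target bus and buffer the result.
        if self.delayed_queue:
            address = self.delayed_queue.popleft()
            self.read_data[address] = self.target_memory[address]
```

In this model the master calls `master_read` repeatedly; each call returns `RETRY` until `service_target_bus` has placed the data in the read data queue.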
With a delayed write request, the PCI to PCI bridge captures both the address and the first word of data from the bus and terminates the request with a retry. The bridge then uses this information to write the word to the target on the target bus. Once the write to the target has completed and the master retries the write, the bridge signals that it accepts the data with TRDY#, thereby notifying the master that the write has completed.
The PCI specification provides that a certain ordering of operations must be preserved on bridges that handle multiple operations to prevent deadlock. These rules are on a per agent basis. Thus, for a particular agent communicating on a bus and across a PCI bridge, the agent's reads should not pass its writes and a later posted write should not pass an earlier write. However, with current bridge architecture, only a single agent can communicate through the PCI bridge architecture at a time. If the PCI bridge is handling a delayed request operation and a request from another agent is attempted, then the PCI bridge will terminate the subsequent transaction from the other agent with a retry command. Thus, a write operation from one agent that is delayed may delay read and write operations from other agents that communicate on the same bus and PCI bridge. Such delays are referred to as latency problems, as one agent can delay the processing of transactions from other agents until the agent currently controlling the bus completes its operations. Further, a delayed read request from one agent must be completed before other agents can assert their delayed read requests.
Current systems attempt to achieve a balance between the desire for low latency between agents and high throughput for any given agent. High throughput is achieved by allowing longer burst transfers, i.e., the time an agent or master is on the bus. However, increasing burst transfers to improve throughput also increases latency because other agents must wait for the agent currently using the longer bursting to complete. Current systems employ a latency timer which is a clock that limits the amount of time any one agent can function as a master and control access to the bus. After the latency time expires, the master may be required to terminate its operation on the bus to allow another master agent to assert its transaction on the bus. In other words, the latency timer represents a minimum number of clocks guaranteed to the master. Although such a latency timer places an upper bound on latency, the timer may prematurely terminate a master's tenure on the bus before the transaction terminates, thereby providing an upper bound on throughput.
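The trade-off between latency and throughput may be illustrated with the following arithmetic sketch. The function names are hypothetical, and the model assumes, for simplicity, one data phase per clock and that a waiting master always forces termination once the timer expires.

```python
def tenure_length(burst_clocks, latency_timer, other_master_waiting):
    """Clocks a master holds the bus in one tenure under the simplified model."""
    if other_master_waiting:
        # The timer caps the tenure once another master wants the bus.
        return min(burst_clocks, latency_timer)
    # With no other master waiting, the burst may run to completion.
    return burst_clocks


def tenures_needed(burst_clocks, latency_timer):
    """Tenures needed to finish a burst when every tenure is cut at the timer."""
    return -(-burst_clocks // latency_timer)  # ceiling division
```

Under these assumptions, a 64-clock burst with a latency timer of 32 clocks is split into two tenures whenever another master is waiting, which bounds the wait of the other master but requires the first master to arbitrate for the bus again.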
One current method for reducing latency is the prefetch operation. Prefetch refers to the situation where a PCI bridge reads data from a target device in anticipation that the master agent will need the data. Prefetching reduces the latency of a burst read transaction because the bridge returns the data before the master actually requests the data, thereby reducing the time the master agent controls access to the bus to complete its requested operation. A prefetchable read transaction may be comprised of multiple prefetchable transactions. A prefetchable transaction will occur if the read request is a memory read within the prefetchable space, a memory read line, or a memory read multiple. The amount of data prefetched depends on the type of transaction and the amount of free buffer space available to buffer prefetched data.
Disconnect refers to a termination requested with or after data was transferred on the initial data phase when the target is unable to respond within the target subsequent latency requirement and, therefore, is temporarily unable to continue bursting. A disconnect may occur because the burst crosses a resource boundary or a resource conflict occurs. Disconnect differs from retry in that retry is always on the initial data phase, and no data transfers. Disconnect may also occur on the initial data phase because the target is not capable of doing a burst. In current PCI art, if a read is disconnected and another agent issues an intervening read request, then any prefetched data maintained in the PCI buffer for the disconnected agent is discarded. Thus, when the read disconnected agent retries the read request, the PCI bridge will have to again prefetch the data because any prefetched data that was not previously returned to the agent prior to the disconnect would have been discarded as a result of the intervening read request from another agent.
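The cost of discarding prefetched data may be illustrated with the following simplified model of the behavior described above. The `SingleBufferBridge` class, the fixed prefetch length of four words, and the `target_reads` counter are assumptions made for the sketch; a list stands in for the target device.

```python
class SingleBufferBridge:
    """Simplified bridge with one prefetch buffer shared by all agents."""

    def __init__(self, target, prefetch_len=4):
        self.target = target          # list of data words (stand-in target device)
        self.prefetch_len = prefetch_len
        self.owner = None             # agent whose prefetched data is buffered
        self.buffer = {}              # address -> prefetched data
        self.target_reads = 0         # words read from the target so far

    def read(self, agent, address):
        # An intervening read from another agent discards the buffered data.
        if agent != self.owner:
            self.buffer.clear()
            self.owner = agent
        # On a miss, prefetch a run of words starting at the requested address.
        if address not in self.buffer:
            end = min(address + self.prefetch_len, len(self.target))
            for current in range(address, end):
                self.buffer[current] = self.target[current]
                self.target_reads += 1
        return self.buffer.pop(address)
```

In this model, if agent A reads address 0, agent B then reads address 8, and agent A retries at address 1, the bridge reads twelve words from the target; without the intervening read from agent B, the two reads from agent A require only four.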
There is thus a need in the art for an improved bridge architecture to handle read/write transactions across a bridge from multiple agents.
Provided is an improved bridge system and method for prefetching data to return to a read request from an agent. The bridge system includes at least one memory device including a counter indicating a number of prefetch operations to perform to prefetch all the requested data, a first buffer capable of storing prefetch requests, and a second buffer capable of storing read data. Control logic implemented in the bridge system includes means for queuing at least one prefetch operation in the first buffer while the counter is greater than zero. The control logic then executes a queued prefetch operation, subsequently receives the prefetched data, and stores the prefetched data in the second buffer. The stored prefetched data is returned to the requesting agent.
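The counter-driven prefetching summarized above may be sketched as follows. This is an illustrative model under stated assumptions, not the claimed implementation: the `PrefetchEngine` class, the fixed number of words per prefetch operation, the queue depth of two, and the choice to decrement the counter when an operation is queued are all hypothetical.

```python
from collections import deque


class PrefetchEngine:
    """Illustrative model: a counter, a prefetch request queue, and a read data buffer."""

    def __init__(self, target, start, total_words, words_per_prefetch, queue_depth=2):
        self.target = target                    # list of data words (stand-in target)
        self.next_address = start
        self.end = start + total_words
        self.words_per_prefetch = words_per_prefetch
        # Counter: number of prefetch operations needed for all requested data.
        self.counter = -(-total_words // words_per_prefetch)
        self.request_queue = deque()            # first buffer: queued prefetch requests
        self.queue_depth = queue_depth
        self.read_data = deque()                # second buffer: prefetched read data

    def queue_prefetches(self):
        # Queue operations while the counter is greater than zero and space remains.
        while self.counter > 0 and len(self.request_queue) < self.queue_depth:
            length = min(self.words_per_prefetch, self.end - self.next_address)
            self.request_queue.append((self.next_address, length))
            self.next_address += length
            self.counter -= 1

    def execute_one(self):
        # Execute one queued prefetch and store the returned data.
        if not self.request_queue:
            return False
        address, length = self.request_queue.popleft()
        self.read_data.extend(self.target[address:address + length])
        return True

    def run(self):
        self.queue_prefetches()
        while self.execute_one():
            self.queue_prefetches()

    def return_to_agent(self, max_words):
        # Return up to max_words of buffered data to the requesting agent.
        words = []
        while self.read_data and len(words) < max_words:
            words.append(self.read_data.popleft())
        return words
```

For a request of ten words with four words per prefetch operation, the counter in this sketch starts at three and reaches zero once the last operation has been queued.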
In further embodiments, the requested data may be maintained in multiple tracks. In such case, the memory device includes a counter for each of the plurality of tracks including the requested data. Each counter indicates a number of prefetch operations to perform to prefetch all the requested data in the track corresponding to the counter. The control logic prefetches the data for each counter and stores the prefetched data in the second buffer.
In yet further embodiments, the bridge system prefetches data for multiple read requests from multiple agents. A counter is provided for each requesting agent and prefetch operations for the read requests from the multiple requesting agents are queued in separate queues. In this way, prefetch operations may be simultaneously queued for different requesting agents.
In still further embodiments, the bridge system prefetches data for multiple read requests from a single agent. A counter is provided for each of the multiple read requests from the single agent.
Preferred embodiments allow a bridge to prefetch all the data to service a read request in advance of returning the data by providing a counter indicating the number of prefetch operations that need to be performed. Latency times may be reduced because the bridge will return prefetched data from a buffer. In this way, requesting agents do not have to wait for the bridge to retrieve the data from the target device as requested data is returned from the buffer.
Moreover, with preferred embodiments, counters and read request queues may be provided for each requesting agent. With these structures, the bridge system may concurrently queue prefetch operations from different requesting agents to prefetch all the data for the read requests from different agents. The bridge may execute the queued prefetch operations for different agents using time division multiplexing to prefetch data across all requesting agents. This ensures that the bridge system will have prefetched data to service agent read requests being time division multiplexed on the bus, thereby reducing latency across multiple requesting agents.
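The use of per-agent counters and queues with time division multiplexing may be sketched as follows. The `multiplex_prefetches` function and the simple round-robin slot assignment are assumptions made for illustration; the text above does not specify a particular multiplexing scheme.

```python
from collections import deque


def multiplex_prefetches(pending):
    """Interleave prefetch operations across agents, one slot per agent per round.

    pending maps each agent to its counter, i.e. the number of prefetch
    operations still needed. Returns the execution order as (agent, op_index).
    """
    counters = dict(pending)
    queues = {agent: deque() for agent in pending}   # one request queue per agent
    issued = {agent: 0 for agent in pending}
    order = []
    while any(counters.values()) or any(queues.values()):
        for agent in pending:
            # Queue the next operation for this agent while its counter is above zero.
            if counters[agent] > 0:
                queues[agent].append(issued[agent])
                issued[agent] += 1
                counters[agent] -= 1
            # Execute one queued operation for this agent in its time slot.
            if queues[agent]:
                order.append((agent, queues[agent].popleft()))
    return order
```

In this sketch, no agent's prefetching waits for another agent's request to finish; an agent with fewer operations simply drops out of the rotation once its counter reaches zero and its queue is empty.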