Typical computer systems, such as desktop and notebook computers, contain a processor and numerous other integrated circuits. The processor, which is typically considered to be the "brains" of the computer system, needs to be in communication with the other integrated circuits (ICs) so that the processor can communicate information to and compute information received from the other ICs. Similarly, the ICs need to be in communication with one another so that they are able to work together more efficiently. The lines of communication that allow one IC to communicate with other ICs within a computer system are called buses.
A computer system may contain many different types of buses, each having its own protocol. When an IC coupled to a first bus is to communicate information with another IC coupled to a second bus, an intermediate IC, or group of ICs, is used to couple the first bus to the second bus, thereby allowing communication between ICs coupled to the first and second buses. This intermediate IC, or group of ICs, is called a bridge. One type of bus is called a peripheral component interconnect (PCI) bus and is described in the PCI Local Bus Specification, Revision 2.1 (1995). Another type of bus is called a memory bus. An IC that requests data from another IC is called a bus master, an initiating agent, or simply a master.
A master, such as a graphics signal processor, coupled to a PCI bus, initiates a read operation by requesting data from another IC or group of ICs, such as main memory of the computer system coupled to a memory bus. A PCI to memory bus bridge is used to couple the PCI bus to the memory bus, thereby allowing communication between the master and memory. The requested data, along with additional data, is transferred from the memory to the memory bus, from the memory bus to the bridge, from the bridge to the PCI bus, and from the PCI bus to the master. Data can be sent as a burst of data phases during a single transaction via the PCI bus in response to a single request by the master.
The ability to burst data across the PCI bus in this manner is advantageous because it increases the bandwidth of the bus. Bandwidth is data flux, measured as the amount of data that is transferred across a bus in a fixed period of time (the period of time is typically measured in clock cycles). A data element is the amount of data that is sent during a single data phase, usually in a single clock cycle, and is typically equal to the width of the data bus portion of the bus (the number of bus signal lines that transfer data along the bus). For a PCI bus having a data bus that is 32 bits wide, a data element is 32 bits wide, or one Dword (4 bytes), and is transferred during a single data phase in a single clock cycle of the burst. A group of related data elements is called a packet. The relationship between data elements within a packet is typically the close proximity of address locations of the data within memory. For example, a cache line, or some portion thereof, represents a packet. A cache line that is 32 bytes long contains 8 data elements each 32 bits in length.
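The arithmetic relating bus width, data elements, and packets can be illustrated with a minimal sketch. The function names `data_elements_per_packet` and `burst_bandwidth` are hypothetical and chosen only for illustration; the sketch assumes one data element is transferred per data phase and that bandwidth is expressed in bytes per clock.

```python
def data_elements_per_packet(packet_bytes: int, bus_width_bits: int) -> int:
    """Number of data elements (one per data phase) needed to carry a packet."""
    bytes_per_element = bus_width_bits // 8
    if packet_bytes % bytes_per_element != 0:
        raise ValueError("packet size is not a whole number of data elements")
    return packet_bytes // bytes_per_element


def burst_bandwidth(num_elements: int, bus_width_bits: int, total_clocks: int) -> float:
    """Bandwidth in bytes per clock for a burst that occupies total_clocks clocks."""
    return num_elements * (bus_width_bits // 8) / total_clocks


# A 32-byte cache line on a 32-bit data bus is 8 data elements (Dwords).
elements = data_elements_per_packet(32, 32)

# If the 8 data phases are contiguous, the burst moves 4 bytes per clock.
# Idle clocks between data phases would lower this figure.
ideal = burst_bandwidth(elements, 32, total_clocks=8)
```

Any clocks spent without a data phase add to `total_clocks` without adding data, which is why latency between packets reduces the bandwidth actually achieved.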
Once the memory transfers a first packet of data to the bridge via the memory bus, transferring the packet from the bridge to the master via the PCI bus is achieved by consecutively transferring data elements of the first packet from the bridge to the master during contiguous data phases (clocks) of a burst cycle in a single transaction. Transferring a second packet of data from the memory to the master, however, may involve some latency (time delay, typically measured in clock cycles) between the last data element of the previous packet (the first packet) and the first data element of the subsequent packet (the second packet), even if the address location of the subsequent packet is consecutive to the previous packet within memory. The latency may be due to, for example, a delay in transferring the subsequent packet from the memory to the bridge via the memory bus because another IC, such as the processor, is busy using the memory bus to communicate with memory. Another source of latency is the time it takes to snoop the cache in the processor to ensure that the data contained in the subsequent packet is valid (to ensure cache coherency).
According to PCI latency protocol rules, only eight clocks are allowed for "target subsequent latency" before the transaction must be terminated. This means that a data element must be sent, during a data phase, within eight clocks from the completion of sending the previous data element during the previous data phase, so the maximum latency limit for data transfers is seven clocks. Once seven clocks have elapsed since a first data element is sent from the bridge to the master, the bridge must either send another data element during the eighth clock, or else disconnect the transaction. The reason for setting a maximum latency limit for data transfers is to prevent a master from tying up a bus while waiting for the target (the bridge in this case) to provide requested data during which time other masters may be waiting to use the same bus to communicate information with a target.
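The rule described above can be sketched as a per-clock decision made by the target. This is a minimal illustration under stated assumptions, not bridge logic taken from the specification: the function name `bridge_action`, its parameters, and the returned strings are hypothetical, and `idle_clocks` is assumed to count the clocks that have fully elapsed since the previous data phase completed.

```python
MAX_SUBSEQUENT_CLOCKS = 8  # clocks allowed for target subsequent latency


def bridge_action(idle_clocks: int, data_ready: bool) -> str:
    """Decide what the bridge does on the current clock.

    idle_clocks: clocks already elapsed since the previous data phase
                 completed (0 means this is the very next clock).
    data_ready:  whether the next data element is available in the bridge.
    """
    if idle_clocks < 0:
        raise ValueError("idle_clocks must be non-negative")
    current_clock = idle_clocks + 1  # 1-based position within the window
    if current_clock > MAX_SUBSEQUENT_CLOCKS:
        # The window has already been exceeded; only a disconnect remains.
        return "disconnect"
    if data_ready:
        return "transfer"
    if current_clock == MAX_SUBSEQUENT_CLOCKS:
        # Seven clocks have elapsed and no data is ready on the eighth.
        return "disconnect"
    return "wait"
```

For instance, with seven clocks elapsed the sketch returns "transfer" if the data element is ready on the eighth clock and "disconnect" otherwise.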
If the latency involved in transferring the second packet (or at least the first data element thereof) from the memory to the bridge causes too many clocks to elapse on the PCI bus while the bridge waits for the second packet after transferring the last data element of the first packet to the master, then the transaction is terminated by disconnecting. Once disconnected, the bridge must purge its appropriate buffers and reset its state machine, and the master must re-arbitrate for bus ownership and again request the data. Each of these operations associated with the disconnect and re-initiating of a separate transaction takes time and reduces the PCI bus bandwidth by delaying the transfer of data to the master. If, instead, the second packet is provided from the memory to the bridge before eight PCI clocks elapse, then the bridge can send the data elements of the second packet to the master during the same transaction burst in which the data elements of the first packet were sent. Thus, the time delay overhead associated with disconnecting and re-initiating a separate transaction is eliminated, and the data transfer from the target to the master is completed in less time, thereby increasing bandwidth by reducing data element transfer latency.
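The cost of a disconnect relative to a continued burst can be illustrated with a rough clock count. This is a simplified model, not a measurement: the function name `clocks_for_two_packets` is hypothetical, and `reinit_overhead_clocks` is an assumed lump figure standing in for buffer purging, state machine reset, re-arbitration, and the repeated request, none of which are quantified in the text above.

```python
MAX_IDLE_CLOCKS = 7  # maximum latency between data phases within one burst


def clocks_for_two_packets(elements_per_packet: int,
                           gap_clocks: int,
                           reinit_overhead_clocks: int) -> int:
    """Approximate PCI clocks needed to deliver two packets to the master.

    gap_clocks: idle clocks before the second packet's first data element
                is available in the bridge.
    reinit_overhead_clocks: assumed cost of disconnecting and starting a
                new transaction (hypothetical value supplied by the caller).
    """
    data_clocks = 2 * elements_per_packet
    if gap_clocks <= MAX_IDLE_CLOCKS:
        # Second packet arrives in time: both packets share one burst.
        return data_clocks + gap_clocks
    # Otherwise the bridge waits seven clocks, disconnects on the eighth,
    # and the master pays the re-initiation overhead before resuming.
    return data_clocks + MAX_IDLE_CLOCKS + 1 + reinit_overhead_clocks


# Two 8-element packets with an assumed 20-clock re-initiation overhead.
single_burst = clocks_for_two_packets(8, gap_clocks=3, reinit_overhead_clocks=20)
with_disconnect = clocks_for_two_packets(8, gap_clocks=10, reinit_overhead_clocks=20)
```

Under these assumed figures the same 16 data elements take 19 clocks in a single burst and 44 clocks when a disconnect occurs, which shows how keeping the gap within the latency limit preserves bandwidth.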
One method of ensuring that the data elements of the first and second packets are transferred from the main memory to the master during a single burst transaction, without violating PCI latency rules, is to reduce the latency associated with transferring the second packet from the memory to the bridge via the memory bus. This is accomplished by dedicating the memory bus to this transfer, preventing other ICs, such as the processor, from accessing memory via the memory bus until the second packet is transferred. Unfortunately, this results in reduced performance of the computer system because the processor is forced to stall until it can access required data from memory. This stall can be lengthy, particularly if the processor must wait not only for the second packet to be transferred from the memory to the bridge but also for a third and fourth packet to be transferred during a long burst transaction.