1. Field of the Invention
This invention relates to computer systems and, more particularly, to integrated bus bridge designs for use in high performance computer systems. The invention also relates to memory coherency in computer systems and to bus bridge designs that support write posting operations.
2. Description of the Related Art
Computer architectures generally include a plurality of devices interconnected by one or more buses. For example, conventional computer systems typically include a CPU coupled through bridge logic to an external main memory. A main memory controller is thus typically incorporated within the bridge logic to generate various control signals for accessing the main memory. An interface to a high bandwidth local expansion bus, such as the Peripheral Component Interconnect (PCI) bus, may also be included as a portion of the bridge logic. Examples of devices that can be coupled to the local expansion bus include network interface cards, video accelerators, audio cards, SCSI adapters, telephony cards, etc. An older-style expansion bus may be supported through yet an additional bus interface to provide compatibility with earlier-version expansion bus adapters. Examples of such expansion buses include the Industry Standard Architecture (ISA) bus, also referred to as the AT bus, the Extended Industry Standard Architecture (EISA) bus, and the Micro Channel Architecture (MCA) bus. Various devices may be coupled to this second expansion bus, including a fax/modem card, sound card, etc.
The bridge logic can link or interface more than simply the CPU bus, a peripheral bus such as a PCI bus, and the memory bus. In applications that are graphics intensive, a separate peripheral bus optimized for graphics related transfers may be supported by the bridge logic. A popular example of such a bus is the AGP (Accelerated Graphics Port) bus. AGP is generally considered a high performance, component level interconnect optimized for three-dimensional graphical display applications, and is based on a set of performance extensions or enhancements to PCI. AGP came about, in part, from the increasing demands placed on memory bandwidths for three-dimensional renderings. AGP provided an order of magnitude bandwidth improvement for data transfers between a graphics accelerator and system memory. This allowed some of the three-dimensional rendering data structures to be effectively shifted into main memory, relieving the costs of incorporating large amounts of memory local to the graphics accelerator or frame buffer.
AGP uses the PCI specification as an operational baseline, yet provides three significant performance extensions or enhancements to that specification. These extensions include deeply pipelined read and write operations, demultiplexing of address and data on the AGP bus, and AC timing specifications for faster data transfer rates.
Since computer systems were originally developed for business applications including word processing and spreadsheets, among others, the bridge logic within such systems was generally optimized to provide the CPU with relatively good performance with respect to its access to main memory. The bridge logic generally provided relatively poor performance, however, with respect to main memory accesses by other devices residing on peripheral buses, and similarly provided relatively poor performance with respect to data transfers between the CPU and peripheral buses as well as between peripheral devices interconnected through the bridge logic.
Recently, however, computer systems have been increasingly utilized in the processing of various real-time applications, including multimedia applications such as video and audio, telephony, and speech recognition. These systems require not only that the CPU have adequate access to the main memory, but also that devices residing on various peripheral buses such as an AGP bus and a PCI bus have fair access to the main memory. Furthermore, it is often important that transactions between the CPU, the AGP bus and the PCI bus be efficiently handled. The bus bridge logic for a modern computer system should accordingly include mechanisms to efficiently prioritize and arbitrate among the varying requests of devices seeking access to main memory and to other system components coupled through the bridge logic.
To support high performance, many bus bridge designs support write posting operations for write cycles initiated on one or more of the interfaced buses. Specifically, many bus bridge designs allow the bus bridge to receive and "post" a write cycle initiated upon the microprocessor bus or a peripheral bus, such as the PCI bus. Once the write data is received by the bus bridge, the cycle on the processor or peripheral bus can be completed, even though the write data has not yet actually been written into main memory or to a destination bus by the bus bridge. Once a write has been posted in the bus bridge, the bridge may complete the write to the destination at a later time in an efficient manner without stalling the initial write cycle presented on the processor or peripheral bus.
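By way of illustration only, write posting as described above may be modeled in software roughly as follows. This is a minimal sketch, not a description of any particular bridge: the class name PostingBridge, the methods post_write and drain_one, and the use of a simple FIFO queue and a dictionary standing in for main memory are all assumptions made for the example.

```python
from collections import deque


class PostingBridge:
    """Toy model of a bridge that posts writes (all names are hypothetical)."""

    def __init__(self):
        self.memory = {}        # stands in for main memory
        self.posted = deque()   # FIFO of (address, data) not yet written

    def post_write(self, address, data):
        """Accept a write; the initiating bus cycle completes immediately."""
        self.posted.append((address, data))
        return True  # cycle acknowledged before memory is updated

    def drain_one(self):
        """Complete the oldest posted write to its destination, if any."""
        if self.posted:
            address, data = self.posted.popleft()
            self.memory[address] = data


bridge = PostingBridge()
acked = bridge.post_write(0x1000, 0xAB)      # initiator is released here
visible_before = 0x1000 in bridge.memory     # write not yet in memory
bridge.drain_one()                           # bridge completes it later
visible_after = bridge.memory.get(0x1000)    # write is now in memory
```

In this sketch the initiator is acknowledged as soon as the write is queued, and the data reaches the modeled memory only when the bridge later drains the queue, which is the interval during which the write is said to be posted.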
While write posting in bus bridges can greatly improve performance, problems relating to memory coherency can arise. To avoid coherency problems, various ordering rules may be established. For example, if a PCI device issues a request to read data from main memory, such as a flag set by the microprocessor indicating that a data transfer from the microprocessor to the PCI bus has been completed, any posted data from the microprocessor to the PCI bus needs to be flushed to ensure that the data transfer has actually completed. Similarly, a PCI device may write a block of data to memory, which is posted within the bus bridge. If the microprocessor issues a read request to read a flag from the PCI device to determine whether the data has been transferred to main memory, the posted PCI to memory transactions in the bridge should be flushed prior to initiating the read on the PCI bus. The flushing operations in the above scenarios ensure that the device reading the flag does not proceed as though data has been transferred when that data is in fact still posted within the bridge.
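The second scenario above may be illustrated with a short software model. This is a hedged sketch under stated assumptions: the names Bridge, pci_write_memory, cpu_read_pci and run, the flush_before_read switch, and the representation of the PCI device as a dictionary holding a "done" flag are hypothetical and chosen only to show the effect of the ordering rule.

```python
from collections import deque


class Bridge:
    """Toy bridge with a PCI-to-memory posting queue (names are hypothetical)."""

    def __init__(self, flush_before_read):
        self.memory = {}                  # stands in for main memory
        self.pci_to_mem = deque()         # posted PCI-to-memory writes
        self.flush_before_read = flush_before_read

    def pci_write_memory(self, address, data):
        """Post a write from a PCI device toward main memory."""
        self.pci_to_mem.append((address, data))

    def flush(self):
        """Complete every posted PCI-to-memory write, oldest first."""
        while self.pci_to_mem:
            address, data = self.pci_to_mem.popleft()
            self.memory[address] = data

    def cpu_read_pci(self, device, register):
        """CPU read of a PCI device register, optionally preceded by a flush."""
        if self.flush_before_read:
            self.flush()
        return device[register]


def run(flush_before_read):
    """Return what the CPU finds in memory after seeing the device's flag."""
    bridge = Bridge(flush_before_read)
    device = {"done": 0}                        # hypothetical device register
    bridge.pci_write_memory(0x2000, "payload")  # block write is posted
    device["done"] = 1                          # device signals completion
    if bridge.cpu_read_pci(device, "done") == 1:
        return bridge.memory.get(0x2000)        # CPU now consumes the data
    return None


without_flush = run(False)  # flag is set, but the data is still posted
with_flush = run(True)      # flush places the data in memory first
```

Without the flush, the modeled CPU sees the flag set yet finds no data at the expected address, because the write is still held in the posting queue. With the flush performed before the read is initiated, the data is in memory by the time the flag is observed.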
It is desirable to provide mechanisms within a bus bridge of a computer system to allow write posting operations while maintaining coherency.