1. Field of the Invention
This invention relates to a computer and, more particularly, to a bus interface unit which concurrently dispatches memory and input/output ("I/O") request cycles to respective target devices and maintains proper ordering of data sent to and returned from the memory and I/O target devices.
2. Description of the Related Art
Modern computers are called upon to execute instructions and transfer data at increasingly higher rates. Many computers employ CPUs which operate at clocking rates exceeding several hundred MHz, and further have multiple buses connected between the CPUs and numerous input/output devices. The buses may have dissimilar protocols depending on which devices they link. For example, a CPU local bus connected directly to the CPU preferably transfers data at a faster rate than a peripheral bus connected to slower input/output devices. A mezzanine bus may be used to connect devices arranged between the CPU local bus and the peripheral bus. The peripheral bus can be classified as, for example, an industry standard architecture ("ISA") bus, an enhanced ISA ("EISA") bus or a microchannel bus. The mezzanine bus can be classified as, for example, a peripheral component interconnect ("PCI") bus to which higher speed input/output devices can be connected.
Coupled between the various buses are bus interface units. In commonly used terminology, the bus interface unit coupled between the CPU bus and the PCI bus is often termed the "north bridge". Similarly, the bus interface unit between the PCI bus and the peripheral bus is often termed the "south bridge".
The north bridge, henceforth termed a bus interface unit, serves to link specific buses within the hierarchical bus architecture. Preferably, the bus interface unit couples data, address and control signals forwarded between the CPU local bus, the PCI bus and the memory bus. Accordingly, the bus interface unit may include various buffers and/or controllers situated at the interface of each bus linked by the interface unit. In addition, the bus interface unit may receive data from a dedicated graphics bus, and therefore may include an accelerated graphics port ("AGP"). As a host device, the bus interface unit may be called upon to support both the PCI portion of the AGP (or graphics-dedicated transfers associated with PCI, henceforth referred to as a graphics controller interface, or "GCI"), as well as AGP extensions to the PCI protocol.
There are numerous tasks performed by the bus interface unit. For example, the bus interface unit must orchestrate timing differences between a faster CPU (processor) local bus and a slower mezzanine bus, such as a PCI bus or a graphics-dedicated bus (e.g., an AGP bus). In addition, the bus interface unit may be called upon to maintain time-sensitive relationships established within the pipelined architecture of a processor bus. If data attributable to a request forwarded across the processor bus is dependent on data of a previous request, then the timing relationship between those requests must be maintained. In other words, timing of requests which occur during a request phase of the pipeline must be maintained when data is transferred during a later, data transfer phase of the pipeline in order to ensure coherency of the pipelined information.
A stalling mechanism is sometimes employed to account for timing differences between a slower peripheral bus and a faster processor or memory bus. Stall cycles can therefore occur within a particular phase of the processor bus pipeline, and particularly in the snoop phase. Modern processor buses, such as the Pentium® Pro bus, employ numerous phases: arbitration, request, error, snoop, response, and data transfer.
Stalling, however, does not by itself draw one transaction ahead of another in the pipeline of the processor bus. A deferral mechanism is therefore used for the purpose of allowing a more critical transaction to proceed to completion through the various phases ahead of an earlier-placed transaction (i.e., a transaction placed into the pipeline ahead of the more critical transaction). The transaction being deferred is therefore said to be set aside in favor of a transaction which needs to be serviced quickly.
For example, in an attempt to immediately service requests to faster local memory (i.e., system memory of substantially contiguous semiconductor memory space), modern processor bus architectures allow memory request cycles to be completed upon the processor bus ahead of cycles to the peripheral bus. This means that peripheral-destined cycles which may be snoop stalled are deferred to allow faster, memory-destined cycles to be drawn from the in-order queue of the pipeline ahead of the slower, deferred peripheral-destined cycles. The deferred cycle must, however, be re-initiated at a later time beginning at the first phase (i.e., arbitration phase) of the processor pipeline. Many clock cycles are then needed to again place the deferred transaction back into the snoop phase. A processor bus clocking penalty must therefore be paid for each deferral operation.
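The deferral penalty described above can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not a description of any actual bus implementation: the names `Phase`, `phases_traversed` and `completion_order` are hypothetical, each phase is counted as one unit even though real phases occupy a variable number of clock cycles, and every peripheral-destined cycle is assumed to be deferred.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Processor bus phases, in the order listed in the text."""
    ARBITRATION = 0
    REQUEST = 1
    ERROR = 2
    SNOOP = 3
    RESPONSE = 4
    DATA_TRANSFER = 5


def phases_traversed(deferred_at=None):
    """Count the phases one transaction passes through before completing.

    Assumption: one unit per phase. A deferred transaction walks the
    pipeline up to and including `deferred_at`, then is re-initiated
    from the arbitration phase and walks every phase again.
    """
    full_pass = len(Phase)
    if deferred_at is None:
        return full_pass
    return (int(deferred_at) + 1) + full_pass


def completion_order(in_order_queue):
    """Return transaction names in the order they complete.

    Assumption: every peripheral-destined cycle is deferred, so all
    memory-destined cycles complete first, in their original order.
    """
    completed, deferred = [], []
    for name, target in in_order_queue:
        if target == "peripheral":
            deferred.append(name)
        else:
            completed.append(name)
    return completed + deferred


if __name__ == "__main__":
    queue = [("io_read", "peripheral"), ("mem_read", "memory"),
             ("mem_write", "memory")]
    print(completion_order(queue))
    print(phases_traversed(), phases_traversed(Phase.SNOOP))
```

In this model a transaction deferred in the snoop phase traverses ten phases rather than six, which is the kind of re-initiation cost the preceding paragraph refers to.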
An advantage arises if the number of snoop stall cycles and deferred cycles can be minimized. A bus interface unit which can forward memory request cycles without having to snoop stall immediately preceding peripheral request cycles would be a significant improvement over the conventional snoop stall routine. Dispatching memory requests as soon as possible, and dispatching peripheral requests whenever the peripheral bus or peripheral data is available, is advantageous for optimizing processor bus bandwidth and memory accesses. A bus interface unit which can minimize snoop stalls without necessarily having to pay the burdensome penalty of cycle deferral would represent an important advancement over conventional bus interface unit architecture.