This invention relates to computer systems, and more particularly to a memory access protocol for a computer system bus which uses a bridge between a processor bus and a standardized system bus.
Computer systems of the PC type usually employ a so-called expansion bus to handle various data transfers and transactions related to I/O and disk access. The expansion bus is separate from the system bus or from the bus to which the processor is connected, but is coupled to the system bus by a bridge circuit.
For some time, all PCs employed the ISA (Industry Standard Architecture) expansion bus, which was an 8-MHz, 16-bit device (actually clocked at 8.33 MHz). Using two cycles of the bus clock to complete a transfer, the theoretical maximum transfer rate was 8.33 MByte/sec. Next, the EISA (Extended ISA) bus was widely used, this being a 32-bit bus clocked at 8 MHz, allowing burst transfers at one per clock cycle, so the theoretical maximum was increased to 33 MByte/sec. As performance requirements increased, with faster processors and memory and increased video bandwidth needs, a high-performance bus standard became a necessity. Several standards were proposed, including the Micro Channel architecture, which was a 10-MHz, 32-bit bus allowing 40 MByte/sec, as well as an enhanced Micro Channel using a 64-bit data width and 64-bit data streaming, theoretically permitting 80 to 160 MByte/sec transfer. The requirements imposed by video and graphics transfer on networks, however, necessitated even faster transfer rates. One approach was the VESA (Video Electronics Standards Association) bus, which was a 33-MHz, 32-bit local bus standard specifically for a 486 processor, providing a theoretical maximum transfer rate of 132 MByte/sec for burst, or 66 MByte/sec for non-burst; the 486 had limited burst transfer capability. The VESA bus was a short-term solution as higher-performance processors, e.g., the Intel P5 and P6 (Pentium and Pentium Pro) processors, became the standard.
The PCI (Peripheral Component Interconnect) bus was proposed by Intel as a longer-term solution to the expansion bus standard, particularly to address the burst transfer issue. The original PCI bus standard has been upgraded several times, with the current standard being Revision 2.1, available from a trade association group referred to as PCI Special Interest Group, P.O. Box 14070, Portland, Oreg. 97214. The PCI Specification, Rev. 2.1, is incorporated herein by reference. Construction of computer systems using the PCI bus, and the PCI bus itself, are described in many publications, including “PCI System Architecture,” 3rd Ed., by Shanley et al., published by Addison-Wesley Pub. Co., also incorporated herein by reference. The PCI bus provides for 32-bit or 64-bit transfers at 33 or 66 MHz; it can be populated with adapters requiring fast access to each other and/or to system memory, and that can be accessed by the host processor at speeds approaching that of the processor's native bus speed. A 64-bit, 66-MHz PCI bus has a theoretical maximum transfer rate of 528 MByte/sec. All read and write transfers over the bus can be burst transfers. The length of the burst can be negotiated between initiator and target devices, and can be any length.
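The theoretical maximum rates quoted in the preceding paragraphs all follow from the same arithmetic: bytes per transfer, times bus clock, divided by the number of clocks each transfer takes. The following Python sketch reproduces those figures. The function name and the table layout are illustrative only, and the EISA entry assumes the same 8.33-MHz clock as ISA, which is what yields the roughly 33 MByte/sec figure given above.

```python
def peak_mbytes_per_sec(width_bits, clock_mhz, clocks_per_transfer=1):
    """Theoretical peak rate in MByte/sec for a bus of the given width and clock."""
    return (width_bits / 8) * clock_mhz / clocks_per_transfer


# (bus, width in bits, clock in MHz, bus clocks per transfer)
BUSES = [
    ("ISA", 16, 8.33, 2),
    ("EISA burst", 32, 8.33, 1),        # assumption: 8.33-MHz clock, as for ISA
    ("Micro Channel", 32, 10, 1),
    ("VESA burst", 32, 33, 1),
    ("VESA non-burst", 32, 33, 2),
    ("PCI 64-bit/66-MHz", 64, 66, 1),
]

for name, width, clock, clocks in BUSES:
    rate = peak_mbytes_per_sec(width, clock, clocks)
    print(f"{name}: {rate:.2f} MByte/sec")
```

These are upper bounds only; they ignore arbitration, wait states, and address phases, so sustained rates on a real bus are lower.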
System and component manufacturers have implemented PCI bus interfaces in various ways. For example, Intel Corporation manufactures and sells a PCI Bridge device under the part number 82450GX, which is a single-chip host-to-PCI bridge, allowing CPU-to-PCI and PCI-to-CPU transactions, and permitting up to four P6 processors and two PCI bridges to be operated on a system bus. Another example, offered by VLSI Technology, Inc., is a PCI chipset under the part number VL82C59x SuperCore, providing logic for designing a Pentium-based system that uses both PCI and ISA buses. The chipset includes a bridge between the host bus and the PCI bus, a bridge between the PCI bus and the ISA bus, and a PCI bus arbiter. Posted memory write buffers are provided in both bridges, and provision is made for the Pentium's pipelined bus cycles and burst transactions.
The PENTIUM PRO processor, commercially available from Intel Corporation, uses a processor bus structure as defined in the specification for this device, particularly as set forth in the publication “Pentium Pro Family Developer's Manual,” Vols. 1-3, Intel Corp., 1996, available from McGraw-Hill, and incorporated herein by reference; this manual is also available from Intel by accessing <http://www.intel.com>.
The P6 bus is “super pipelined” in that the groups of signals on the bus which define a given transaction are interleaved with similar signals which define a subsequent transaction. One transaction does not need to complete before another is initiated. There are multiple phases of a transaction on the P6 bus, and each phase is a subset of signals on the bus, but these phases or stages overlap one another. An address for request #1 is put out on the bus, and addresses for requests #2 and #3 go out before the result for #1 comes back. A target of a bus transaction usually sends back an encoded “response” that says what the target is going to do, rather than sending the data itself. The response can be a “retry,” or that the target is sending the data immediately, or that it is latching a unique ID and will come back on the bus later and send the data when it is available (a split transaction). Thus, the data completion phases can be out-of-order for these retry or deferred responses. The preferred mode of operation, often, is to send bursts of data, rather than reads or writes of one 64-bit quadword. For example, if the bridge receives a series of writes, these are all posted, and there are a limited number of buffers in the queues of the bridge. In the example, when the address for cache line #1 is put on the bus, preferably the address for cache line #2 immediately follows, but if the request for cache line #1 is retried, then the ordering rules are violated; the rules dictate that #1 has to complete before #2, and if the address for #2 is put out on the bus and it completes in order, it is too late, since a retry already is out for #1. To guarantee ordering, it would be necessary to put out address #1, wait until it is known that #1 is not retried or deferred, then put out address #2, etc. This would destroy the benefits of superpipelining on the P6 bus.
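The ordering hazard just described can be illustrated with a deliberately simplified Python model. It is not a model of the actual P6 signal protocol: the `Request` class, the response labels, and the rule that retried or deferred requests complete after all immediately answered ones are assumptions made only to show why a retry on request #1 breaks ordering once request #2 is already on the bus.

```python
from dataclasses import dataclass


@dataclass
class Request:
    tag: int        # issue order on the bus: 1, 2, 3, ...
    response: str   # "normal", "retry", or "deferred" (hypothetical labels)


def completion_order(requests):
    """Return request tags in the order their data phases complete.

    Simplifying assumption: requests answered "normal" complete in issue
    order, and retried or deferred requests complete only after those.
    """
    in_order = [r.tag for r in requests if r.response == "normal"]
    late = [r.tag for r in requests if r.response != "normal"]
    return in_order + late


def ordering_preserved(requests):
    """True when completion order matches issue order."""
    order = completion_order(requests)
    return order == sorted(order)


# Addresses #1, #2, #3 are pipelined; #1 is retried after #2 and #3 are out.
pipelined = [Request(1, "retry"), Request(2, "normal"), Request(3, "normal")]
print(completion_order(pipelined))    # [2, 3, 1]
print(ordering_preserved(pipelined))  # False
```

When every response is "normal", `ordering_preserved` returns `True`, which is the case the remainder of this description relies on for accesses to main memory.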
Now, main memory can usually be accessed in the clock periods allowed on the P6 bus without deferring or retrying; no out of order responses are needed. To the extent that most transactions on the P6 bus are to system memory, it is a penalty to put out the address and the ADS#, wait around for the snoop phase (e.g., six clocks), then put out the next address for a burst; it is known, by the nature of the requests to system memory, that these transactions will complete in order. It is for this reason that the fast burst memory range is employed, as will be explained.
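The size of this penalty can be estimated with simple clock counting. The Python sketch below compares issuing a run of addresses with a six-clock snoop wait between each (the figure given above as an example) against issuing them back to back. The three-clock back-to-back spacing and the count of eight addresses are illustrative assumptions, not values taken from the bus specification.

```python
def clocks_to_issue(n_addresses, clocks_between_addresses):
    """Clocks from the first address to the last when addresses are evenly spaced."""
    if n_addresses < 1:
        raise ValueError("need at least one address")
    return (n_addresses - 1) * clocks_between_addresses


SNOOP_WAIT_CLOCKS = 6   # from the text: "e.g., six clocks"
PIPELINED_SPACING = 3   # ASSUMPTION: illustrative back-to-back address spacing
N_ADDRESSES = 8         # ASSUMPTION: illustrative number of addresses in the run

serialized = clocks_to_issue(N_ADDRESSES, SNOOP_WAIT_CLOCKS)
pipelined = clocks_to_issue(N_ADDRESSES, PIPELINED_SPACING)
print(serialized, pipelined)  # 42 21
```

Under these assumptions the serialized case takes twice as many clocks just to get the addresses onto the bus; the ratio scales directly with whatever spacing the bus actually permits.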
It is therefore one object of the present invention to provide an improved way of handling fast burst transactions on a bus in a computer system.
It is another object of the present invention to provide an improved computer system having enhanced performance when making accesses to devices on an expansion bus, using a bridge between a processor bus and an expansion bus.
It is a further object of the present invention to provide an improved bridge circuit for connecting a processor bus to an expansion bus, particularly one allowing fast burst transactions.
The above as well as additional objects, features, and advantages of the present invention will become apparent in the following detailed written description.
According to one embodiment of the invention, a computer system has a processor bus under control of the microprocessor itself, and this bus communicates with main memory, providing high-performance access for most cache fill operations. In addition, the system includes one or more expansion buses, preferably of the PCI type in the example embodiment. A host-to-PCI bridge is used for coupling the processor bus to the expansion bus. Other buses may be coupled to the PCI bus via PCI-to-(E)ISA bridges, for example. The host-to-PCI bridge contains queues for posted writes and delayed read requests. All transactions are queued going through the bridge, upstream or downstream. The system bus is superpipelined, in that transactions overlap. According to a feature of the invention, provision is made for fast burst transactions, i.e., read requests which can be satisfied without deferring or retrying are applied to the system bus without waiting for the snoop phase. A range of addresses (e.g., system memory addresses) is defined to be a fast burst range, and any address in this range is treated differently from addresses outside the range. The bridge is programmed, by configuration cycles, to establish this fast burst range, within which it is known that an out-of-order response will not be received. Because it is known there will be no out-of-order responses, the initiator (processor) can send out a burst of eight write transactions in quick succession, knowing that all will complete in order. The range values are stored in configuration registers in the bridge, written at the time the machine is turned on; the boot-up sequence includes interrogating the main memory to determine its address range, and that range is then programmed into the bridge.
Thereafter, when a transaction reaches the bridge interface from the expansion bus, and it is recognized that the address is within the range, then the fast burst mode is allowed, and write addresses are allowed to follow one another without the usual delay.
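The range check described above can be sketched in a few lines of Python. This is a behavioral illustration only, not the bridge's register layout or logic: the class name `HostBridge`, the register names `fast_burst_base` and `fast_burst_limit`, the inclusive upper bound, and the 64-MByte memory size in the usage lines are all hypothetical.

```python
class HostBridge:
    """Hypothetical behavioral model of the bridge's fast burst range registers."""

    def __init__(self):
        # Unprogrammed until boot-up code writes the configuration registers.
        self.fast_burst_base = None
        self.fast_burst_limit = None   # inclusive upper bound (assumption)

    def configure_fast_burst_range(self, base, limit):
        """Model the configuration write performed once main memory is sized."""
        if base > limit:
            raise ValueError("base must not exceed limit")
        self.fast_burst_base = base
        self.fast_burst_limit = limit

    def in_fast_burst_range(self, address):
        """True when the address falls inside the programmed range."""
        if self.fast_burst_base is None:
            return False
        return self.fast_burst_base <= address <= self.fast_burst_limit

    def issue_mode(self, address):
        """Choose how the next address may be issued on the processor bus."""
        if self.in_fast_burst_range(address):
            return "fast_burst"        # next address may follow without delay
        return "wait_for_snoop"        # hold the next address until the snoop phase


bridge = HostBridge()
bridge.configure_fast_burst_range(0x0000_0000, 0x03FF_FFFF)  # assumed 64 MByte of main memory
print(bridge.issue_mode(0x0010_0000))  # fast_burst
print(bridge.issue_mode(0xF000_0000))  # wait_for_snoop
```

Before the range is programmed, the model treats every address as outside the range, so the conservative wait-for-snoop behavior is the default.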