This invention relates to computer systems, and more particularly to a memory access protocol for a computer system bus which uses a bridge between a processor bus and a standardized system bus.
Computer systems of the PC type usually employ a so-called expansion bus to handle various data transfers and transactions related to I/O and disk access. The expansion bus is separate from the system bus or from the bus to which the processor is connected, but is coupled to the system bus by a bridge circuit.
For some time, all PCs employed the ISA (Industry Standard Architecture) expansion bus, which was an 8-MHz, 16-bit device (actually clocked at 8.33 MHz). Using two cycles of the bus clock to complete a transfer, the theoretical maximum transfer rate was 8.33 MBytes/sec. Next, the EISA (Extended ISA) bus was widely used, this being a 32-bit bus clocked at 8-MHz, allowing burst transfers at one per clock cycle, so the theoretical maximum was increased to 33-MBytes/sec. As performance requirements increased, with faster processors and memory, and increased video bandwidth needs, a high performance bus standard became a necessity. Several standards were proposed, including a Micro Channel architecture which was a 10-MHz, 32-bit bus, allowing 40-MByte/sec, as well as an enhanced Micro Channel using a 64-bit data width and 64-bit data streaming, theoretically permitting 80-to-160 MByte/sec transfer. The requirements imposed by use of video and graphics transfer on networks, however, necessitated even faster transfer rates. One approach was the VESA (Video Electronics Standards Association) bus, which was a 33 MHz, 32-bit local bus standard specifically for a 486 processor, providing a theoretical maximum transfer rate of 132-MByte/sec for burst, or 66-MByte/sec for non-burst; the 486 had limited burst transfer capability. The VESA bus was a short-term solution, as higher-performance processors, e.g., the Intel P5 and P6 or Pentium and Pentium Pro processors, became the standard.
The PCI (Peripheral Component Interconnect) bus was proposed by Intel as a longer-term solution to the expansion bus standard, particularly to address the burst transfer issue. The original PCI bus standard has been upgraded several times, with the current standard being Revision 2.1, available from a trade association group referred to as PCI Special Interest Group, P.O. Box 14070, Portland, Oreg. 97214. The PCI Specification, Rev. 2.1, is incorporated herein by reference. Construction of computer systems using the PCI bus, and the PCI bus itself, are described in many publications, including "PCI System Architecture," 3rd Ed., by Shanley et al., published by Addison-Wesley Pub. Co., also incorporated herein by reference. The PCI bus provides for 32-bit or 64-bit transfers at 33- or 66-MHz; it can be populated with adapters requiring fast access to each other and/or to system memory, and these adapters can be accessed by the host processor at speeds approaching the processor's native bus speed. A 64-bit, 66-MHz PCI bus has a theoretical maximum transfer rate of 528-MByte/sec. All read and write transfers over the bus can be burst transfers. The length of the burst can be negotiated between initiator and target devices, and can be any length.
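The theoretical maximum transfer rates quoted above for the ISA, EISA, Micro Channel, VESA, and PCI buses all follow from the same arithmetic: bus clock times bus width in bytes, divided by the number of clocks needed per transfer. The following is a minimal sketch of that calculation; the function name `peak_mbytes_per_sec` and its parameters are illustrative assumptions, and the nominal clock values are those stated in the text.

```python
def peak_mbytes_per_sec(clock_mhz, width_bits, clocks_per_transfer=1):
    """Theoretical peak rate in MBytes/sec (hypothetical helper).

    clock_mhz           -- nominal bus clock in MHz
    width_bits          -- data path width in bits
    clocks_per_transfer -- bus clocks consumed per data transfer
    """
    bytes_per_transfer = width_bits / 8
    return clock_mhz * bytes_per_transfer / clocks_per_transfer


# Figures taken from the text; nominal clocks, so results are approximate.
buses = [
    ("ISA (16-bit, 2 clocks/transfer)", 8.33, 16, 2),
    ("EISA (32-bit burst)",             8.33, 32, 1),
    ("Micro Channel (32-bit)",          10.0, 32, 1),
    ("VESA (32-bit burst)",             33.0, 32, 1),
    ("VESA (32-bit non-burst)",         33.0, 32, 2),
    ("PCI (64-bit, 66 MHz)",            66.0, 64, 1),
]

for name, mhz, bits, clocks in buses:
    rate = peak_mbytes_per_sec(mhz, bits, clocks)
    print(f"{name:34s} {rate:7.2f} MBytes/sec")
```

These are upper bounds only; sustained rates depend on arbitration, wait states, and burst length.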
System and component manufacturers have implemented PCI bus interfaces in various ways. For example, Intel Corporation manufactures and sells a PCI Bridge device under the part number 82450GX, which is a single-chip host-to-PCI bridge, allowing CPU-to-PCI and PCI-to-CPU transactions, and permitting up to four P6 processors and two PCI bridges to be operated on a system bus. Another example, offered by VLSI Technology, Inc., is a PCI chipset under the part number VL82C59x SuperCore, providing logic for designing a Pentium-based system that uses both PCI and ISA buses. The chipset includes a bridge between the host bus and the PCI bus, a bridge between the PCI bus and the ISA bus, and a PCI bus arbiter. Posted memory write buffers are provided in both bridges, and provision is made for Pentium's pipelined bus cycles and burst transactions.
The "Pentium Pro" processor, commercially available from Intel Corporation, uses a processor bus structure as defined in the specification for this device, particularly as set forth in the publication "Pentium Pro Family Developer's Manual" Vols. 1-3, Intel Corp., 1996, available from McGraw-Hill, and incorporated herein by reference; this manual is also available from Intel by accessing <http://www.intel.com>.
The P6 bus is "super pipelined" in that the groups of signals on the bus which define a given transaction are interleaved with similar signals which define a subsequent transaction. One transaction does not need to complete before another is initiated. There are multiple phases of a transaction on the P6 bus, and each phase is a subset of signals on the bus, but these phases or stages overlap one another. An address for request #1 is put out on the bus, and addresses for requests #2 and #3 go out before the result for #1 comes back. A target of a bus transaction usually sends back an encoded "response" that says what the target is going to do, rather than sending the data itself. The response can indicate a "retry," or that the target is sending the data immediately, or that it is latching a unique ID and will come back on the bus later to send the data when it is available (a split transaction). Thus, the data completion phases can be out of order for these retry or deferred responses. The preferred mode of operation, often, is to send bursts of data, rather than reads or writes of one 64-bit quadword. For example, if the bridge receives a series of posted writes, these are all posted, and there are a limited number of buffers in the queues of the bridge. In the example, when the address for cache line #1 is put on the bus, preferably the address for cache line #2 immediately follows, but if the request for cache line #1 is retried, then the ordering rules are violated; the rules dictate that #1 has to complete before #2, and if the address for #2 is put out on the bus and it completes in order, it is too late, since a retry already is out for #1. To guarantee ordering, it would be necessary to put out address #1, wait until it is known that #1 is not retried or deferred, then put out address #2, etc. This would destroy the benefits of superpipelining on the P6 bus.
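The ordering hazard described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not a model of the actual P6 bus protocol: each request receives one first-attempt response, a retried or deferred request is assumed to succeed on its second attempt, and the names `Response` and `completion_order` are hypothetical.

```python
from enum import Enum


class Response(Enum):
    NORMAL = "normal"      # target handles the data immediately
    RETRY = "retry"        # initiator must reissue the request later
    DEFERRED = "deferred"  # target completes later (split transaction)


def completion_order(first_responses, pipelined):
    """Return request numbers (1-based) in the order they complete.

    pipelined=True  -- all addresses are issued back-to-back, so a
                       retried or deferred request finishes after the
                       later requests that were already on the bus.
    pipelined=False -- the initiator waits for each response, and
                       resolves any retry, before issuing the next
                       address, so issue order is preserved.
    """
    done, late = [], []
    for req_id, resp in enumerate(first_responses, start=1):
        if pipelined and resp is not Response.NORMAL:
            late.append(req_id)   # completes only after being reissued
        else:
            done.append(req_id)
    return done + late


# Cache line #1 is retried while #2 and #3 are already on the bus.
responses = [Response.RETRY, Response.NORMAL, Response.NORMAL]
print(completion_order(responses, pipelined=True))
print(completion_order(responses, pipelined=False))
```

In the pipelined case request #1 completes last, which is the ordering violation the text describes; the serialized case preserves order at the cost of giving up the overlap between transactions.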
Now, main memory can usually be accessed in the clock periods allowed on the P6 bus without deferring or retrying; no out-of-order responses are needed. To the extent that most transactions on the P6 bus are to system memory, it is a penalty to put out the address and ADS#, wait for the snoop phase (e.g., six clocks), and only then put out the next address for a burst, because it is known, by the nature of the requests to system memory, that these transactions will complete in order. It is for this reason that the fast burst memory range is employed, as will be explained.
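The cost argument can be made concrete with a rough clock count. The sketch below is illustrative only and does not represent the bridge logic itself: it assumes an address range known to map to system memory, charges the six-clock snoop wait from the example above to every address outside that range, and charges a shorter gap to addresses inside it. The function `issue_clocks`, the three-clock `pipelined_gap`, the 32-byte line stride, and the range bounds are all assumptions chosen for the example.

```python
def issue_clocks(addresses, fast_range, snoop_wait=6, pipelined_gap=3):
    """Estimate bus clocks spent issuing a series of addresses.

    fast_range    -- range of addresses assumed to complete in order
                     (e.g. system memory), so the next address may
                     follow without waiting for the snoop phase
    snoop_wait    -- clocks waited per address when ordering must be
                     confirmed first (six, per the example in the text)
    pipelined_gap -- clocks between back-to-back addresses
                     (an assumed value for illustration)
    """
    total = 0
    for addr in addresses:
        if addr in fast_range:
            total += pipelined_gap   # issue next address right away
        else:
            total += snoop_wait      # wait until not retried/deferred
    return total


# Hypothetical layout: the first 64 MBytes are treated as system memory.
system_memory = range(0x0000_0000, 0x0400_0000)

# Four consecutive cache lines, assuming a 32-byte line size.
burst_in_memory = [0x0010_0000 + 0x20 * i for i in range(4)]
burst_elsewhere = [0xF000_0000 + 0x20 * i for i in range(4)]

print(issue_clocks(burst_in_memory, system_memory))
print(issue_clocks(burst_elsewhere, system_memory))
```

Under these assumed numbers the four-line burst to system memory is issued in half the clocks of the same burst to an address whose completion order cannot be presumed.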