1. Field of the Invention
The present invention generally relates to computer bus architectures and methods for transference of data, and, in particular, relates to bus bridge architectures for connecting two or more buses and for efficient data transference to and from the buses.
2. Description of the Prior Art
The disclosure herein utilizes the Peripheral Component Interconnect (PCI) architecture for illustration purposes; the present invention and the embodiments thereof are not limited to this particular bus architecture. The PCI bus is a high performance 32-bit or 64-bit bus with multiplexed address and data lines. It is intended for use as an interconnect mechanism between highly integrated peripheral controller components, peripheral add-in boards, and processor/memory systems, providing the high bandwidth throughput demanded by modern graphics-oriented operating systems such as Windows and OS/2. It is typically found in (but not limited to) IBM compatible personal computer systems. The specifications for the PCI bus standard are provided in the following documents, which are incorporated herein by reference: PCI Local Bus Specification, revision 2.1; PCI-to-PCI Bridge Specification, revision 1.0; PCI System Design Guide, revision 1.0; and PCI BIOS Specification, revision 2.1. These documents are available from a consortium of industry partners known as the PCI Special Interest Group (SIG) and are collectively referred to as the PCI Specifications in this disclosure.
FIG. 1 shows one implementation of a PCI bus architecture. Here, the central processing unit (CPU) 10 is connected to a Host/PCI cache bridge 12 via a CPU local bus 14. The Host bridge 12 serves as a bridge to other buses, including a memory bus 16 connected to main memory 18 and a PCI bus 20. Via the Host bridge 12 and the PCI bus 20, the CPU is able to communicate with a number of peripheral devices, including an audio device 22, a motion video device 24 and its video memory 26, a SCSI host bus adapter 28 connecting several other SCSI devices, a LAN adapter 30, and a graphics adapter 32 and its video frame buffer 34. The PCI bus 20 can also communicate with other bus types through the use of a bus-specific bridge 36 and the corresponding bus 38.
Typical PCI bus implementations will support up to four add-in board connectors on the motherboard where the connectors are Micro Channel (MC)-style connectors. PCI expansion cards are designed with an edge connector insertable into the add-in board connectors on a motherboard.
However, a system incorporating a single bus has some limitations. For example, a bus can only support a limited number of expansion connectors because a bus will not function properly when too many electrical loads (i.e. devices) are placed on it. Moreover, the devices that populate a particular bus may not be able to co-exist efficiently in a set-up where all the devices demand high levels of bus time, causing an overall degradation in the performance of the system.
These problems can be solved by adding one or more additional PCI buses into the system and re-distributing the device population. The PCI Specifications provide the definition of a PCI-to-PCI bridge device. This device can either be embedded as an integrated circuit on a PCI bus or be in the form of an add-in card that is pluggable into a PCI expansion connector. The PCI-to-PCI bridge provides a bridge from one PCI bus to another PCI bus, and it presents only one electrical load to its host PCI bus. The new PCI bus can then support a number of additional PCI compatible devices and/or PCI expansion connectors. The electrical loading constraint is therefore relieved because the constraint applies on a per bus basis, not on a system basis. Of course, the power supply in the host system must be capable of supplying sufficient power for the load imposed by the additional devices residing on the new bus(es).
The PCI bridge provides a low latency path through which the processor may access PCI devices mapped anywhere in the memory or I/O address spaces. It also provides a high bandwidth path allowing PCI masters direct access to the main memory. The bridge may optionally include such functions as data buffering/posting and PCI central functions (e.g. arbitration). In terms of terminology, the PCI bus closest to the host processor is referred to as the primary bus, and a PCI bus that resides behind a PCI-to-PCI bridge is referred to as a subordinate bus, where the subordinate bus farthest from the host processor is called the secondary bus.
FIG. 2 illustrates an implementation of a PCI bus system with two PCI-to-PCI bridges connecting to two levels of PCI buses. Here, the CPU 50 is directly connected to the host bus 52. The system memory 54 is connected to the host bus 52 via system memory controller 56. A host-to-PCI bridge 58 establishes a connection between the host bus 52 and a downstream subordinate PCI bus 60, to which two PCI devices 62 are connected. The subordinate PCI bus 60 further connects to another downstream PCI bus 64 via another PCI-to-PCI bridge 66. PCI bus 64, being the furthest from the host bus, is referred to as the secondary bus and is connected to two PCI devices 68. Using PCI-to-PCI bridges to connect to other PCI buses thus yields architectures that overcome the problem of bus overloading and permit the expansion of buses.
The PCI-to-PCI bridge functions as a traffic coordinator between two PCI buses. The bridge never initiates a transaction on either PCI bus on its own. Its job is to monitor each transaction that is initiated on the two PCI buses and to decide whether or not to pass the transaction through to the opposite PCI bus. When the bridge determines that a transaction on one bus needs to be passed to the other bus, the bridge must act as the target of the transaction on the originating bus and as the initiator of the new transaction on the destination bus. The fact that the bridge resides between the initiator and the target is invisible to the initiator as well as to the target. In addition to determining if a transaction initiated on one bus must be passed through to the other, the bridge also supports additional functions as specified by the PCI Specifications. A bridge may also incorporate a set of device-specific, memory-mapped or IO-mapped registers that control its own functionality. In this case, it must recognize and permit accesses to these registers.
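By way of illustration only, the forwarding role described above can be sketched in Python. The names `Transaction`, `Window` and `forward` are hypothetical and do not appear in the PCI Specifications. The sketch assumes a single address window assigned to the subordinate side, and it further assumes that a transaction seen on the subordinate bus is passed toward the primary bus when its address falls outside that window; neither assumption is stated in this disclosure.

```python
from dataclasses import dataclass
from typing import Optional

PRIMARY = "primary"
SUBORDINATE = "subordinate"


@dataclass(frozen=True)
class Transaction:
    bus: str       # bus on which the transaction is currently visible
    address: int   # address driven during the address phase


@dataclass(frozen=True)
class Window:
    base: int      # lowest address assigned behind the bridge (assumed inclusive)
    limit: int     # highest address assigned behind the bridge (assumed inclusive)

    def contains(self, address: int) -> bool:
        return self.base <= address <= self.limit


def forward(txn: Transaction, window: Window) -> Optional[Transaction]:
    """Return the transaction the bridge would initiate on the opposite bus,
    or None if the bridge leaves the transaction alone.

    When a transaction is returned, the bridge is acting as the target on
    txn.bus and as the initiator on the returned transaction's bus.
    """
    in_window = window.contains(txn.address)
    if txn.bus == PRIMARY and in_window:
        return Transaction(SUBORDINATE, txn.address)
    # Assumption: subordinate-side traffic outside the window is passed back.
    if txn.bus == SUBORDINATE and not in_window:
        return Transaction(PRIMARY, txn.address)
    return None
```

In this sketch the bridge never creates a transaction of its own: `forward` only ever returns a copy of a transaction that some other device started.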
To start a transaction from an initiating device to a targeted device, the initiator sends out a set of signals onto the bus. Each device on the bus, having been programmed to claim addresses within a specific address range, decodes these signals. The device that decodes an address within its range then sends out a signal claiming the transaction. The signals involved in such a transaction are illustrated in FIG. 3 and explained below.
The clock (CLK) signal is an input to all devices residing on the bus. It provides timing for all transactions, including bus arbitration. All PCI timing parameters are specified with respect to the rising edge of the CLK signal, and input signals are sampled on that edge; the state of all input signals is `don't-care` at all other times. As a result, all actions on the PCI bus are synchronized to the CLK signal.
The cycle frame (FRAME#) signal is driven by the initiator and indicates the start (when it is first asserted) and duration (the duration of its assertion) of a transaction. To acquire bus ownership, the initiator, having received GNT#, samples the FRAME# and IRDY# signals to determine whether both are de-asserted on the same rising edge of the CLK signal. Once the bus is acquired, the initiator asserts the FRAME# signal for the duration of the transaction. A transaction may consist of one or more data transfers between the initiator and the addressed target. The FRAME# signal is de-asserted when the initiator is ready to complete the final data phase.
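The bus acquisition condition described above may be expressed, purely as an illustrative sketch, in Python. The function name `can_start_transaction` is hypothetical. Each argument is True when the corresponding signal is asserted as sampled on a rising edge of CLK; the `#` suffix marks the signals as active low, so "asserted" corresponds to a low electrical level.

```python
def can_start_transaction(gnt_asserted: bool,
                          frame_asserted: bool,
                          irdy_asserted: bool) -> bool:
    """Return True if an initiator may begin a transaction on this clock edge.

    The bus is treated as idle only when FRAME# and IRDY# are both
    de-asserted on the same rising edge of CLK, and the initiator must
    also hold its grant (GNT#).
    """
    bus_idle = (not frame_asserted) and (not irdy_asserted)
    return gnt_asserted and bus_idle
```

For example, an initiator holding GNT# while another master is still completing its final data phase (FRAME# de-asserted, IRDY# asserted) would obtain False and must wait.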
The initiator Ready (IRDY#) signal is driven by the current bus master (the initiator of the transaction). During a write operation, IRDY#-asserted indicates that the initiator is driving valid data onto the data bus. During a read operation, IRDY#-asserted indicates that the initiator is ready to accept data from the currently-addressed target.
The Target Ready (TRDY#) signal is driven by the currently-addressed target device. It is asserted when the target is ready to complete the current data phase (data transfer). A data phase is completed when the target is asserting TRDY# and the initiator is asserting IRDY# at the rising edge of the CLK signal. During a read operation, TRDY#-asserted indicates that the target is driving valid data onto the data bus. During a write operation, TRDY#-asserted indicates that the target is ready to accept data from the master. Wait states are inserted in the current data phase until both TRDY# and IRDY# are sampled asserted.
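The completion rule for a data phase can be illustrated with a short Python sketch. The function name `count_wait_states` and the representation of the bus as a sequence of per-clock samples are assumptions made for illustration only.

```python
from typing import Iterable, Optional, Tuple


def count_wait_states(samples: Iterable[Tuple[bool, bool]]) -> Optional[int]:
    """Count the wait states in one data phase.

    `samples` holds one (irdy_asserted, trdy_asserted) pair per rising
    edge of CLK. The data phase completes on the first edge where both
    IRDY# and TRDY# are sampled asserted; every earlier edge is a wait
    state. Returns None if the phase never completes in the samples given.
    """
    for wait_states, (irdy, trdy) in enumerate(samples):
        if irdy and trdy:
            return wait_states
    return None
```

Under these assumptions, a target that needs two extra clocks before asserting TRDY# gives `count_wait_states([(True, False), (True, False), (True, True)])` a value of 2.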
The Initialization Device Select (IDSEL) signal (not shown) is an input to the PCI device and is used as a chip select signal during an access to one of the device's configuration registers.
The Device Select (DEVSEL#) signal is asserted by a target when the target has decoded the address and determined that it is the target of the current transaction. It acts as an input to the initiator. If a master initiates a transfer and does not detect DEVSEL# active within four CLK periods, it must assume that the target cannot respond or that the address is unpopulated. The DEVSEL# signal may be driven one, two, three or four clock cycles following the address phase as shown; these timings are defined as fast (one clock cycle), medium (two clock cycles), slow (three clock cycles), and subtractive (four clock cycles). By definition, all PCI device address decoders are fast, medium, or slow. The timing is selected by the target in accordance with the target's ability to respond to the transaction. A target able to provide a response in one clock cycle following the address phase will assert fast DEVSEL# timing. Slower devices will assert medium or slow DEVSEL# timing. Ideally, faster throughput is achieved when all devices on a bus assert fast DEVSEL# timing.
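The four DEVSEL# timing classes lend themselves to a small lookup, sketched here in Python for illustration. The names `DevselTiming` and `classify_devsel` are hypothetical; the clock counts are those given in the preceding paragraph.

```python
from enum import Enum
from typing import Optional


class DevselTiming(Enum):
    FAST = 1         # DEVSEL# driven one clock after the address phase
    MEDIUM = 2       # two clocks after the address phase
    SLOW = 3         # three clocks after the address phase
    SUBTRACTIVE = 4  # four clocks after the address phase


def classify_devsel(clocks_after_address_phase: int) -> Optional[DevselTiming]:
    """Map the clock on which DEVSEL# was asserted to its timing class.

    Returns None when the count lies outside one to four clocks, which in
    this sketch stands for a transaction that no target has claimed.
    """
    try:
        return DevselTiming(clocks_after_address_phase)
    except ValueError:
        return None
```

A result of None corresponds to the case in which the master must assume that the target cannot respond or that the address is unpopulated.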
The bus bridge may claim transactions in one of two situations. In the first situation, known as subtractive decoding, the bus bridge claims a transaction that no other PCI device has claimed by the end of the third clock period, and passes it through to the subordinate bus. The bus bridge determines that no other PCI device has claimed a transaction by monitoring the state of the DEVSEL# signal generated by the other PCI-compliant devices. If the DEVSEL# signal is not sampled asserted within three clock periods after the start of a transaction, no other PCI device has claimed the transaction. The bus bridge may then claim the transaction by asserting the DEVSEL# signal at the fourth clock cycle of the transaction.
In the second situation, the bus bridge is configured to employ positive address decoding. During system configuration, the bridge is configured to recognize certain memory and/or I/O address ranges. Upon recognizing an address within one of these pre-assigned ranges, the bridge may assert DEVSEL# immediately (without waiting for the DEVSEL# signal to time out) to claim the transaction. The bridge then passes the transaction through onto the subordinate bus. In this fashion, a transaction destined for the subordinate bus is not delayed by a wait of at least three clock periods before the bus bridge claims it and passes it onto the subordinate bus. The ISA bus environment is one that depends heavily on subtractive decoding to claim transactions.
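The two claiming situations described above can be contrasted in a brief Python sketch. The function name `bridge_claim_clock` and all of its parameters are hypothetical. The default of two clocks (medium timing) for a positive decode is an assumption chosen for illustration, not a requirement of the PCI Specifications.

```python
from typing import List, Optional, Tuple


def bridge_claim_clock(address: int,
                       positive_ranges: List[Tuple[int, int]],
                       subtractive_enabled: bool,
                       other_devsel_clock: Optional[int] = None,
                       positive_clock: int = 2) -> Optional[int]:
    """Return the clock (counted from the address phase) on which the
    bridge asserts DEVSEL#, or None if the bridge does not claim.

    positive_ranges    -- inclusive (base, limit) pairs programmed at
                          configuration time
    other_devsel_clock -- clock on which some other device asserted
                          DEVSEL#, or None if no other device did
    positive_clock     -- assumed DEVSEL# timing for a positive decode
    """
    # Positive decoding: the address falls in a pre-assigned range,
    # so the bridge claims without waiting for a timeout.
    if any(base <= address <= limit for base, limit in positive_ranges):
        return positive_clock

    # Subtractive decoding: claim on the fourth clock only if no other
    # device asserted DEVSEL# within the first three clocks.
    claimed_by_other = other_devsel_clock is not None and other_devsel_clock <= 3
    if subtractive_enabled and not claimed_by_other:
        return 4
    return None
```

With a window of 0x10000000 to 0x1FFFFFFF, an access to 0x12345678 is claimed on clock two, whereas an access outside the window is claimed only on clock four, and only when subtractive decoding is enabled and no other device has responded.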
The devices that reside behind a bus bridge may consist of only memory, only I/O, or a combination of memory and I/O devices. Furthermore, some of the I/O devices may be mapped into memory space while others are mapped into I/O space. The configuration program automatically detects the presence, type and address space requirements of these devices and allocates space to them by programming their address decoders to recognize the address ranges it assigns to them. The corresponding address ranges, such as memory, I/O, prefetchable memory, ISA/VGA, and legacy I/O addresses, nevertheless pose a complicated decoding problem for the bus bridge. In order to simplify the decoding process, bus bridges tend to offer medium speed DEVSEL# timing regardless of the devices and their respective optimal DEVSEL# timing. As a consequence, optimal throughput from the subordinate bus is not achieved.
Furthermore, programming the bridge at one specific DEVSEL# timing limits the DEVSEL# timing speeds available to the other devices on the same bus as the bus bridge device. Ideally, it would be desirable to be able to program the DEVSEL# timing speed for each device independently of the other devices. In this manner, maximum timing flexibility and the most efficient timing speed for each of the respective devices can be achieved.
It is therefore desirable to have a method and apparatus for achieving efficient device select speed for the devices on the subordinate bus(es).