In its simplest terms, a computer system performs arithmetic operations, manipulates data, and makes decisions. Computers are useful because of their ability to execute a series of instructions, called a program, in order to accomplish certain tasks. Virtually all computer systems include a central processing unit (CPU), a memory, an input/output (I/O) interface, and a bus. Basically, the CPU executes the instructions of the computer program that is stored in memory. The I/O provides an interface between the user and the computer system. The bus allows the different components of the computer system to communicate with each other.
The computer's bus conveys all the information and signals involved in the computer's operation. One or more buses are used to connect the CPU to the memory and to the input/output elements so that data and control signals can readily be transmitted between these different components. Hence, the bus structure is critical in normal computer operations. When a computer executes its programming, it is imperative that data and information flow as fast as possible. A slow bus architecture acts as a bottleneck which slows down the overall performance of the computer system, regardless of the microprocessor's speed or power.
Clearly, it is imperative that large blocks of data be transferred as expeditiously as possible, especially in hardware applications such as graphics adapters, full motion video adapters, SCSI host bus adapters, FDDI devices, etc. In addition to the speed requirements of the hardware devices, popular software applications prevalent today demand extremely fast updates of graphic images in order to move, resize, and update multiple windows without imposing unacceptable delays on the end user. Since the screen images are stored in video RAM, this means that the processor must be able to update and move large blocks of data within video memory very fast. This is especially the case when rendering images in real-time (e.g., video tele-conferencing, simulations, etc.). These devices are just a few examples of subsystems which benefit substantially from a very fast bus transfer rate. Hence, selecting an appropriate bus architecture is an important element in determining the computer system's overall performance.
In the past, a variety of bus standards have been implemented (e.g., ISA, EISA, Micro Channel bus, VLB, PCI, etc.). The ISA bus was the original standard of the PC industry. The EISA bus, the Micro Channel bus, and the VESA VL bus (VLB) are all improvements over the ISA bus. However, the PC industry has presently adopted the PCI bus because of its high rate of data transfers. The PCI bus can be accessed at clock speeds approaching the host processor's full native speed. The PCI bus achieves its remarkable speed, in part, by performing read and write transfers over the PCI bus in "burst transfers," whereby multiple blocks of data are conveyed for each transmission. An arbiter determines which bus "master" coupled onto the PCI bus gets access to the PCI bus. FIG. 1 shows a PCI arbiter 108 arbitrating between seven PCI devices 101-107. This arrangement results in an extremely fast and efficient transfer of data from the initiating PCI device to the target PCI device.
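The arbitration shown in FIG. 1 can be illustrated with a minimal sketch. The text does not state which arbitration policy arbiter 108 uses, so the round-robin policy below is purely an assumption chosen for simplicity, and the class and method names are hypothetical.

```python
class RoundRobinArbiter:
    """Illustrative arbiter; round-robin is an assumed policy, not one stated in the text."""

    def __init__(self, device_ids):
        self.device_ids = list(device_ids)
        self.next_index = 0  # position at which the next search for a requester starts

    def grant(self, requesting):
        """Return the device granted the bus, or None if no device is requesting."""
        count = len(self.device_ids)
        for offset in range(count):
            index = (self.next_index + offset) % count
            device = self.device_ids[index]
            if device in requesting:
                # Start the next search just past the winner so that
                # other requesters get a turn before it is granted again.
                self.next_index = (index + 1) % count
                return device
        return None


# Seven PCI devices 101-107, as in FIG. 1.
arbiter = RoundRobinArbiter(range(101, 108))
first = arbiter.grant({103, 105})
second = arbiter.grant({103, 105})
```

With devices 103 and 105 both requesting, successive grants alternate between the two, so neither device can monopolize the bus under this assumed policy.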
However, the PCI bus standard suffers from a major drawback in that there is a limited number of bus masters that may be supported. One cannot simply keep tacking on additional devices to an already saturated PCI bus. In a system that already has the maximum number of bus masters, one prior art solution for circumventing this constraint involves implementing a hierarchical bridge. This hierarchical bridge is used to "bridge" the primary PCI bus to a subordinate PCI bus. FIG. 2 shows a prior art bus architecture having two PCI buses 201 and 202 that are interconnected by a bridge 203. Bus 201 is subordinate to, or beneath, bus 202. Additional PCI devices, such as 204 and 205, can now be coupled to PCI bus 201. The PCI-to-PCI bridge 203 functions as a traffic coordinator between these two PCI buses. Bridge 203 monitors each transaction that is initiated on the two PCI buses and decides whether or not to pass a transaction between them. When the bridge 203 determines that a transaction on one bus needs to be passed to the other bus, the bridge must act as the target of the transaction on the originating bus and as the initiator of the new transaction on the destination bus. In this way, the bridge 203 allows an additional set of bus masters to be supported via the bridge and the subordinate PCI bus.
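The forwarding decision made by bridge 203 can be sketched as follows. The text does not say how the bridge decides whether a transaction must cross to the other bus; the address window used here (a base and limit covering the devices on the subordinate bus) is an assumption for illustration, and all class, field, and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Transaction:
    bus: int        # bus on which the transaction appears (201 or 202 in FIG. 2)
    address: int
    is_write: bool


@dataclass
class Bridge:
    primary_bus: int
    subordinate_bus: int
    window_base: int    # assumed: addresses in [base, limit] belong to the subordinate bus
    window_limit: int

    def destination_bus(self, txn: Transaction) -> int:
        in_window = self.window_base <= txn.address <= self.window_limit
        return self.subordinate_bus if in_window else self.primary_bus

    def handle(self, txn: Transaction) -> Optional[Transaction]:
        """Return the transaction re-initiated on the other bus, or None if it stays local."""
        destination = self.destination_bus(txn)
        if destination == txn.bus:
            return None  # target is on the originating bus; the bridge does not respond
        # The bridge acts as the target on the originating bus and as the
        # initiator of a new transaction on the destination bus.
        return Transaction(bus=destination, address=txn.address, is_write=txn.is_write)


# Bus 202 is the primary bus and bus 201 the subordinate bus, as in FIG. 2.
bridge = Bridge(primary_bus=202, subordinate_bus=201,
                window_base=0x1000, window_limit=0x1FFF)
downstream = bridge.handle(Transaction(bus=202, address=0x1800, is_write=True))
local = bridge.handle(Transaction(bus=202, address=0x3000, is_write=False))
```

Every forwarded transaction is effectively carried out twice, once on each bus, which is one way to picture the added latency seen by bus masters on the subordinate bus.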
Although this prior art solution enables more bus masters to be added to the system, there are several disadvantages associated with this approach. The first disadvantage is the additional expense of purchasing the hardware needed to implement the PCI-to-PCI bridge function. Another disadvantage pertains to the system memory transfer latencies incurred by the bus masters on the subordinate bus. These inherent latencies effectively slow down the speed at which data is transferred.
Yet, if a PCI-to-PCI bridge were not used, then too many electrical loads (e.g., PCI devices) placed onto the PCI bus may cause the PCI bus to malfunction. Furthermore, a particular bus master that requires large amounts of bus time in order to achieve good performance must now share the bus with other bus masters. Demands for bus time by these other masters may degrade the performance of the bus master subsystem.
Thus, there is a need for a more efficient method of adding more PCI bus masters to an already fully loaded PCI bus. It would be highly preferable if such a scheme were inexpensive and did not increase memory transfer latencies. The present invention provides such an elegant solution to these problems.