Use of a bus is a well-known technique for allowing shared communication between multiple devices. In a bus, all communicating devices share a common path. Bus architectures work much like a highway system, where a wide major highway may provide simple and affordable access to many cities and towns located along the highway. In the same way, a single bus provides shared access to many devices along the shared path (the bus).
As more and more cars enter a highway from the many towns along its path, the highway becomes more crowded and traffic flows more slowly. Various types of bus architectures can suffer from similar difficulties. There are two elements to the performance of a bus: throughput and latency. The throughput of data is simply the speed with which data will travel along the bus, and could be considered analogous to the speed limit on a highway. For example, a 100 Mbit Ethernet adapter can transport data with a throughput of 100 megabits per second. Latency is defined as the time required to pass a frame (e.g., for Ethernet, a preamble, start-frame delimiter, destination address, source address, length and type field, data, and frame-check sequence) from the source to the destination on the bus. When there is more traffic on the bus, it becomes increasingly difficult to find an opening on the bus in which to place data, causing an increase in the latency. Essentially, more congestion increases latency, while the throughput of the bus itself remains the same.
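The distinction between throughput and latency can be illustrated with a short calculation. The following is a minimal sketch only. It assumes a maximum-size Ethernet frame of 1526 bytes including preamble and start-frame delimiter, equal-size frames queued ahead of the frame of interest, and no propagation delay. The function names are illustrative and do not come from any particular system.

```python
# Illustrative model: throughput is fixed by the link, latency grows with congestion.
LINK_RATE_BPS = 100_000_000  # 100 Mbit/s, as in the Ethernet example above


def serialization_time_s(frame_bytes: int, link_rate_bps: int = LINK_RATE_BPS) -> float:
    """Time to place one frame on the bus at the given link rate."""
    return frame_bytes * 8 / link_rate_bps


def latency_s(frame_bytes: int, frames_ahead: int,
              link_rate_bps: int = LINK_RATE_BPS) -> float:
    """Latency of a frame that must wait for `frames_ahead` equal-size frames.

    Assumption: frames are sent strictly one after another and propagation
    delay is ignored.
    """
    return (frames_ahead + 1) * serialization_time_s(frame_bytes, link_rate_bps)


if __name__ == "__main__":
    frame = 1526  # assumed: 1518-byte frame plus 8 bytes of preamble and SFD
    for waiting in (0, 1, 4, 9):
        print(f"{waiting} frames ahead: {latency_s(frame, waiting) * 1e6:.2f} us")
```

In this model the link rate never changes, yet the latency seen by a frame grows linearly with the number of frames waiting ahead of it.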
The more devices there are connected to the bus, the more traffic (e.g., data) will flow along the bus and, therefore, the more congested the bus becomes and the more slowly the data or other communications will travel. However, like the highway system, and despite these drawbacks, current bus architectures have provided an adequate means for interconnecting multiple devices for communication.
There are several different bus architectures used today. Busses such as a PCI bus, illustrated in FIG. 1, use a single multiplexed wide data and address bus, implementing a tree topology, to move data from one device to another. Referring to FIG. 1, a processor 102 and memory 104 are directly connected to a host bridge 106. These elements and their interconnections comprise the “local bus” of the system (e.g., computer, game console, etc.) in which they operate. The host bridge 106 is responsible for bridging between the local bus and the PCI bus, described in more detail below. In addition, the host bridge 106 is responsible for controlling and buffering all the data going to and from the memory 104. In the architecture illustrated in FIG. 1, all devices are memory-mapped, meaning that they can be referenced as if they were part of memory. It is the job of the host bridge 106 to determine whether or not the memory location requested is located in the local memory or on the PCI bus and return or write the data to the correct location.
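The address decoding performed by the host bridge 106 can be sketched as follows. This is a simplified illustration only. The address windows, their sizes, and the names `Window` and `route` are hypothetical and are not taken from FIG. 1 or from any PCI specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """A contiguous range of the memory-mapped address space."""
    name: str
    base: int
    size: int

    def contains(self, address: int) -> bool:
        return self.base <= address < self.base + self.size


# Hypothetical address map: the values are chosen only for illustration.
LOCAL_MEMORY = Window("local memory 104", 0x0000_0000, 0x4000_0000)
PCI_WINDOW = Window("PCI bus 108", 0xC000_0000, 0x1000_0000)


def route(address: int) -> str:
    """Decide, as a host bridge would, where a memory-mapped access belongs."""
    for window in (LOCAL_MEMORY, PCI_WINDOW):
        if window.contains(address):
            return window.name
    raise ValueError(f"address {address:#010x} is not mapped")


if __name__ == "__main__":
    for addr in (0x0000_1000, 0xC000_0010):
        print(f"{addr:#010x} -> {route(addr)}")
```

Because every device is memory-mapped, the processor issues the same kind of access in both cases, and only the bridge needs to know which side of the bridge holds the target.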
Host bridge 106 is connected to bus 108. Bus 108 connects the host bridge to PCI devices, e.g., networking adapter 114, video card or hard-drive controller 116, and PCI to ISA bridge 118. These elements and their interconnections comprise the PCI bus. Three representative devices 114, 116, and 118 are shown, but these devices could be any devices such as a networking device, video, hard-drive controller, bridge to PCI, encryption coprocessor, video compression encoder/decoder, etc. These devices may be memory mapped or may implement direct memory access (DMA), which means that a device can automatically read and write data to memory, without requiring the data be handled by the processor 102.
The PCI to ISA bridge 118 couples the PCI bus to the ISA bus 110. Coupled to the ISA bus 110 are input devices 120, slow interface 122, and other slow hardware 124. These elements and their interconnections comprise the ISA bus. All devices on the ISA bus “look” as if they are part of the PCI to ISA bridge 118. All traffic from these devices needs to pass through the PCI to ISA bridge 118 up to the processor 102, since these devices do not implement DMA. Using a PCI bus architecture such as that shown in FIG. 1, one device must take control of, or master, the bus. During a transaction to a target, the master is also known as the initiator.
The configuration of the devices on the PCI bus is performed through a CONFIG command given over the PCI bus. All PCI transactions begin with a command and, just as READ and WRITE operations have their own commands, CONFIG READ and CONFIG WRITE also each have their own commands. The CONFIG commands initiate a configuration transaction that either writes to or reads from a standardized set of registers. Configuration is not a line-speed operation, i.e., it does not need to be done at the same rate as the data is processed. The actual processing of the data utilizes the bus, making use of its high throughput potential. Another bus can handle slower-speed configuration.
Another type of bus implements a ring topology, such as token ring, illustrated in FIG. 2. As is well known, devices 202, 204, 206 and 208 must arbitrate for time on the bus 210 and then send their data around the “ring” (the bus 210). The ring is connected to each device, and any device can have access to the data at any time on the ring. A “token” 212, which is a special bit pattern, travels around the ring. To send a message, a device “catches” the token, attaches a message to it, and then lets it continue to travel around the network.
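The token-passing behavior described above can be sketched as a small simulation. This is an illustrative model only. It assumes that a device may send at most one message each time it catches the token and that priorities are ignored. The function name `circulate` and the data layout are hypothetical.

```python
from collections import deque


def circulate(devices, outbox, laps=1):
    """Pass the token around the ring and collect the messages sent.

    devices: device names in ring order.
    outbox:  dict mapping a device name to a deque of (destination, payload).
    Returns a list of (sender, destination, payload) in transmission order.
    Assumption: one message per token hold, no priorities.
    """
    sent = []
    for _ in range(laps):
        for holder in devices:           # the token visits devices in ring order
            queue = outbox.get(holder)
            if queue:                    # the device "catches" the token
                destination, payload = queue.popleft()
                sent.append((holder, destination, payload))
            # otherwise the token simply continues to the next device
    return sent


if __name__ == "__main__":
    ring = ["202", "204", "206", "208"]
    pending = {
        "206": deque([("202", "a"), ("208", "b")]),
        "202": deque([("204", "c")]),
    }
    for sender, destination, payload in circulate(ring, pending, laps=2):
        print(f"{sender} -> {destination}: {payload}")
```

In this model device 206 has two messages pending but must wait a full lap of the token between them, which illustrates how the token bounds the share of the ring that any one device can use.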
Tree topology protocols such as PCI suffer from high complexity and high overhead. Wide busses running at high speeds are usually necessary to reach desired speeds, as the theoretical throughput is much higher than what is generally realized. Long data bursts can cause contention for the bus as other devices seek access. Large buffers are necessary at each device to survive long latencies that can occur when contending for the bus.
Ring topologies such as a token ring require all data to pass by all devices. Each device must check the header of every packet to see if it is the destination. When one device has control over the bus (controls the token in a token ring implementation), other devices must wait until the token is free or until they have a high enough priority to seize the token. While latencies can be deterministic, the throughput can never reach the theoretical maximum when multiple devices attempt to use the ring simultaneously, because throughput cannot exceed the allocated time slices.
Other bus architectures implement star topologies, a common example being Ethernet. In a star topology system, each device is connected to a common device, typically some type of bridge, through a dedicated link. To route from one device to another, the bridging device must disassemble the data frame from the sending device, repackage it, and then send it on the link to the receiving device. Star topologies such as Ethernet require dedicated physical links for transmission. While they can be high speed, a transaction typically requires several jumps between source and destination, giving rise to high latency. Star topologies sacrifice latency for high throughput. Star topology is somewhat analogous to methods used for package delivery. For example, Federal Express has an East hub to which all packages east of a particular part of the United States are initially sent before being forwarded to their final destination. By aggregating resources, they are able to handle large quantities of packages, i.e., they have high throughput. However, shorter latency (e.g., faster than next-day service) would be possible if a truck were always available to drive a package directly from the source to the destination.
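The latency cost of routing through a central bridging device can be estimated with a simple store-and-forward model. The sketch below is illustrative only. It assumes that the bridging device receives a whole frame before retransmitting it, that propagation delay is negligible, and that any processing time per bridging device is supplied by the caller. The function name and parameters are hypothetical.

```python
def store_and_forward_latency_s(frame_bytes, link_rates_bps,
                                per_hop_processing_s=0.0):
    """Latency of one frame crossing a chain of dedicated links.

    link_rates_bps: rate of each link on the path, in order. Two links
    correspond to sender -> bridge -> receiver in a simple star.
    Assumption: each intermediate device fully receives the frame before
    sending it on, and propagation delay is ignored.
    """
    if not link_rates_bps:
        raise ValueError("at least one link is required")
    serialization = sum(frame_bytes * 8 / rate for rate in link_rates_bps)
    intermediate_devices = len(link_rates_bps) - 1
    return serialization + intermediate_devices * per_hop_processing_s


if __name__ == "__main__":
    direct = store_and_forward_latency_s(1250, [100_000_000])
    via_hub = store_and_forward_latency_s(1250, [100_000_000, 100_000_000])
    print(f"direct link: {direct * 1e6:.1f} us, via hub: {via_hub * 1e6:.1f} us")
```

Under these assumptions, each additional jump adds at least one full frame time to the latency, even though the rate of every individual link is unchanged.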
Performance of a transaction, regardless of the bus used for the transaction, follows some general guidelines. Busses using higher-bandwidth media (fiber optics instead of copper; cable TV lines instead of POTS; differential vs. single-ended signaling) will generally allow for increased performance. For a given type of medium, performance will generally be better over shorter distances. In addition, performance is only as good as the slowest component in the link. The slowest component can be slow either by its nature or because it is too busy handling other requests.
Generally, a tree topology requires every device to wait when another device is using the tree. A ring topology does not allow one device to inhibit the use of the capacity by the other devices, but it will not allow a device to use more than its share. A star topology requires the devices at the center of the star (in Ethernet, the switch or the router) to be fast enough to handle as much traffic as required. The switch/router must scale proportionally with the number of branches hanging off it, leading to scalability problems.
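The scaling behavior summarized above can be put into rough numbers. The following is an idealized sketch that ignores protocol overhead and arbitration cost. It assumes that devices on a shared bus or ring divide the capacity equally and that every branch of a star may transmit at full rate at once. The function names are hypothetical.

```python
def shared_share_bps(bus_rate_bps: float, active_devices: int) -> float:
    """Average capacity per device when a tree bus or ring is shared equally."""
    if active_devices < 1:
        raise ValueError("at least one active device is required")
    return bus_rate_bps / active_devices


def star_core_capacity_bps(link_rate_bps: float, branches: int) -> float:
    """Capacity the central switch or router needs in the worst case.

    Assumption: every branch transmits at its full link rate simultaneously.
    """
    if branches < 1:
        raise ValueError("at least one branch is required")
    return link_rate_bps * branches


if __name__ == "__main__":
    rate = 100_000_000  # 100 Mbit/s, chosen only for illustration
    for n in (2, 8, 32):
        share = shared_share_bps(rate, n) / 1e6
        core = star_core_capacity_bps(rate, n) / 1e6
        print(f"{n:2d} devices: shared share {share:6.2f} Mbit/s, "
              f"star core needs {core:7.1f} Mbit/s")
```

In the shared topologies the cost of adding devices falls on each device as a smaller share, whereas in the star it falls on the central device, whose required capacity grows in proportion to the number of branches.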
Thus, it would be desirable to have an efficient way of selectively creating temporary point-to-point bus connections between devices on an as-needed basis, and to have the capacity for each device to be connected along point-to-point bus connections simultaneously.