1. Technical Field
The present invention relates to bus and computer architectures and, more specifically, to maximizing throughput rates on synchronized buses.
2. Related Art
Microprocessor-based systems typically include the processor itself, as well as memory and input/output (I/O) devices. Typically, the processor, memory and I/O devices communicate with each other over a bus. A bus is a shared communication link that uses a plurality of wires to connect various devices or systems. By defining a single connection scheme, new devices can be added and peripherals may be moved between computer systems that share the same type of bus. Thus, buses are cost effective because a single set of wires or traces, if formed in an integrated package, may be shared by multiple devices for exchanging communication signals.
One disadvantage of utilizing bus architectures is that they create communication bottlenecks, thereby limiting the maximum throughput of communication signals. When each device must communicate over a single bus, that bus must be shared. The bandwidth of the bus limits the I/O throughput, as well as the number of devices that can be connected and the amount of data that can be transmitted thereon. In high-end commercial systems and in supercomputers, where I/O rates must be very high because processor performance is high, designing a bus system capable of meeting the demands of the processor, as well as connecting large numbers of I/O devices, represents a significant challenge. Accordingly, it is frequently a design goal to maximize the bus throughput.
One of the reasons why bus design is difficult is that the bus speed and capacity are limited by many factors, including the length of the bus and the number of devices on the bus. Each device that is coupled to the bus adds to the capacitance of the bus, thereby increasing the bus impedance and limiting throughput. These physical limits, therefore, prevent the bus from running arbitrarily fast. Moreover, the individual devices on the bus have their own clock rates. Thus, between two devices, the slower device limits the speed of the communications. Often, because of this, bus designs are implemented in a way that slows the speed of the bus to match that of the slowest device on the bus to ensure that all devices can communicate over the bus. While limiting the bus speed to the speed of the slowest device solves the interoperability issues that arise from the differences in speed, the goals of maximizing bus throughput and bandwidth are compromised to achieve this result.
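By way of illustration only, the conventional practice of matching the bus speed to the slowest device can be sketched as follows. The device names, clock rates, the 32-bit bus width, and the assumption of one transfer per clock cycle are hypothetical values chosen for the example and are not taken from any particular bus.

```python
def shared_bus_clock_mhz(device_clocks_mhz):
    """Return the clock rate of a bus that is slowed to its slowest device."""
    if not device_clocks_mhz:
        raise ValueError("at least one device is required")
    return min(device_clocks_mhz.values())


def peak_bandwidth_mb_per_s(clock_mhz, width_bits):
    """Peak bandwidth, assuming one transfer per clock cycle (an assumption)."""
    return clock_mhz * width_bits / 8


# Hypothetical devices and clock rates, for illustration only.
devices = {"processor": 100.0, "memory": 66.0, "io_controller": 33.0}

bus_clock = shared_bus_clock_mhz(devices)
print(bus_clock)                               # 33.0
print(peak_bandwidth_mb_per_s(bus_clock, 32))  # 132.0
```

In this sketch, the fastest device could run at 100 MHz, yet every transfer on the bus is held to the 33 MHz rate of the slowest device.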
Contrary to what is commonly thought, a bus is more than just a twisted pair. Typically, a bus contains a set of control lines, as well as a set of data lines. The control lines are used to signal requests and acknowledgments, and to indicate what type of information is on the data lines. The data lines carry information between the source and destination for a specific operation. This information can be formed of data, commands, and addresses.
The control lines are used to indicate what type of information is contained on the data lines at each point in the transfer. Some buses even include two sets of signal lines to separately communicate both data and address in a single bus transmission. With either design, however, the control lines are used to indicate what type of data the bus contains and the bus protocol implemented. Because the bus is shared, the protocol also must determine the priority with which the various devices are allowed to communicate on the bus.
There are different types of buses that are commonly used. Processor-memory buses, I/O buses and backplane buses are each designed to achieve certain design goals. Processor-memory buses are short, high-speed buses that are matched to the memory system so as to maximize processor-memory bandwidth. I/O buses, on the other hand, can be long and can have many types of devices connected to them. I/O buses often have a wide range in the data bandwidth of the devices connected to them and a wide range of the speed at which the bus may operate.
As was discussed before, traditionally, the bus speed is matched to that of the slowest device. I/O buses do not typically interface directly to the memory. I/O devices typically use a processor-memory or a backplane bus to connect to memory. Backplane buses are designed to allow processors, memory and I/O devices to co-exist on a single bus. They balance the demands of processor/memory communication with the demands of I/O device-memory communication. Backplane buses are often built into the “backplane” or interconnection structure within the chassis of the box. Processor-memory and I/O boards are then plugged into the backplane using the bus for communication.
Processor-memory buses are often designed for a specific function, namely, to enable a processor to have fast access to memory registers. I/O buses and backplane buses, in contrast, are frequently reused in different machines with different devices connected thereto. Backplane and I/O buses are often considered to be standard buses in that they are used by many computers made by different manufacturers. In contrast, processor-memory buses are often proprietary and not readily accessible by other devices.
During the design phase of a bus architecture, the designer of a processor-memory bus knows all of the types of devices that must connect to the bus. As was discussed before, however, I/O and backplane buses must be designed to handle unknown devices that vary in latency and bandwidth. Normally, the I/O bus presents a simple and low-level interface to a device requiring minimal additional electronics to interface with the bus.
The substantial differences between the design and implementation of processor-memory buses and I/O or backplane buses lead to two different communication schemes on the bus. Synchronous buses include a clock in the control lines and a fixed protocol for communicating that is relative to the clock. Asynchronous buses, on the other hand, are not clocked.
Because synchronous buses communicate relative to a clock that is specified in the control lines, the overhead logic for achieving a successful communication is simpler than with asynchronous buses. For example, in a processor-memory bus performing a read operation from memory, the protocol might include transmitting the address and read command on a first clock cycle and using control lines at the same time to indicate the type of request. The memory might then respond with the data word on the fifth clock cycle. This type of protocol may be implemented easily with a state machine. Because the protocol is specified and involves little logic, the bus can run fast with little overhead.
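A minimal sketch of such a state machine is given below, assuming the example read protocol just described: the address and read command are driven on clock cycle 1 and the data word is returned on clock cycle 5. The state names, the `synchronous_read` function, and the dictionary used to model memory are hypothetical and serve only to illustrate the idea.

```python
from enum import Enum


class State(Enum):
    SEND_ADDRESS = "send_address"
    WAIT = "wait"
    RECEIVE_DATA = "receive_data"


DATA_CYCLE = 5  # cycle on which memory returns the data word (from the example)


def synchronous_read(memory, address):
    """Step a read transaction one clock cycle at a time; return (data, trace)."""
    trace = []
    data = None
    latched_address = None
    state = State.SEND_ADDRESS
    for cycle in range(1, DATA_CYCLE + 1):
        trace.append((cycle, state))
        if state is State.SEND_ADDRESS:
            # Cycle 1: the master drives the address and the read command.
            latched_address = address
            state = State.WAIT
        elif state is State.WAIT:
            # Intermediate cycles: nothing to negotiate, just count clock edges.
            if cycle + 1 == DATA_CYCLE:
                state = State.RECEIVE_DATA
        else:
            # Cycle 5: the memory places the data word on the data lines.
            data = memory[latched_address]
    return data, trace


data, trace = synchronous_read({0x10: 0xABCD}, 0x10)
print(hex(data))  # 0xabcd
for cycle, state in trace:
    print(cycle, state.value)
```

Because every step is tied to a fixed clock cycle, neither side needs to acknowledge the other, which is why the controlling logic stays small.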
Synchronous buses have two disadvantages, however. First, the conventional wisdom is that every device on a bus must run at the same clock rate. Second, because of clock skew that can result from many factors, including line impedance, synchronous buses cannot be very long if they are fast. As was discussed before, adding devices to a bus increases the capacitance of the bus. Therefore, as a bus gets longer with more devices, the increased capacitance increases the bus impedance, thereby reducing the bus speed.
Asynchronous buses, on the other hand, are not clocked. Thus, the problems of clock skew and synchronization do not exist on asynchronous buses. However, to coordinate the transmission of data between a sender and receiver, a more complex hand-shaking protocol must be used. The hand-shaking protocol comprises a series of steps in which the sender and receiver proceed to the next step only when both parties agree as to a transaction that just took place.
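One common form of such a protocol is a four-phase request/acknowledge handshake, which is sketched below purely as an illustration. The choice of a four-phase scheme, the wire names `req` and `ack`, and the state names are assumptions made for the example. Each party advances only when it observes the expected level on the other party's line.

```python
def sender_step(wires, state, word):
    """Advance the sender by one step if the receiver's line permits it."""
    if state == "idle":
        wires["data"] = word
        wires["req"] = True           # data is valid, raise the request
        return "wait_ack"
    if state == "wait_ack" and wires["ack"]:
        wires["req"] = False          # receiver has the data, drop the request
        return "wait_ack_low"
    if state == "wait_ack_low" and not wires["ack"]:
        return "done"                 # both lines idle again
    return state


def receiver_step(wires, state, inbox):
    """Advance the receiver by one step if the sender's line permits it."""
    if state == "wait_req" and wires["req"]:
        inbox.append(wires["data"])   # latch the data word
        wires["ack"] = True
        return "wait_req_low"
    if state == "wait_req_low" and not wires["req"]:
        wires["ack"] = False          # sender saw the acknowledgment
        return "done"
    return state


def transfer(word):
    """Run one handshake to completion and return (inbox, wires)."""
    wires = {"req": False, "ack": False, "data": None}
    inbox = []
    sender, receiver = "idle", "wait_req"
    while not (sender == "done" and receiver == "done"):
        sender = sender_step(wires, sender, word)
        receiver = receiver_step(wires, receiver, inbox)
    return inbox, wires


inbox, wires = transfer(0x5A)
print(inbox)                       # [90]
print(wires["req"], wires["ack"])  # False False
```

No clock appears anywhere in the sketch. The cost is that a single word requires four signal transitions, each of which must be observed by the other party before the transfer can proceed.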
As was discussed before, it is desirable to increase the throughput and bandwidth of a bus. The types of factors that affect the throughput capacity of the bus include the speed of the bus, the data bus width, separation of address and data lines, and the ability of a bus to implement block transfers. To implement these enhancements that improve bus throughput, however, additional bus lines, increased complexity, or increased response time for requests may be incurred, especially for long block transfers.
Another issue that is significant in designing a bus is the determination as to how a device reserves bus resources for a given transaction. Without a bus allocation mechanism, multiple devices may attempt to communicate simultaneously, which could result in collisions between communication signals. While there are bus designs that allow for collisions, such designs are not common. Accordingly, bus designs typically have at least one bus master to control the bus operations. A bus master controls access to the bus, meaning that it initiates and controls all bus requests. The processor must be able to initiate a bus request for memory and thus is always a bus master. The memory, typically, is a slave, since it only responds to read and write requests and never initiates a request of its own.
The simplest designs include buses that have only one bus master. Having a single bus master is analogous to a meeting in which each person is only allowed to speak when he or she is asked a question. An advantage of this approach is simplicity. A disadvantage is that a single bus master must be involved in every transaction upon the bus, thereby increasing overhead and reducing efficiency. Furthermore, this design cannot address the requirements of systems which naturally possess multiple masters such as multiprocessor systems or systems containing Direct Memory Access (DMA) devices.
An alternative approach is a design which includes multiple bus masters in which each bus master is able to initiate a transfer. One issue that arises from having multiple bus masters, however, is that a protocol must be developed for deciding which bus master will next get to utilize bus resources for a communication or transaction. Thus, a mechanism for arbitrating access to the bus must be implemented.
There are a variety of schemes for bus arbitration. These schemes include implementing specialized hardware devices that function as bus arbiters. A bus arbiter may implement very sophisticated bus protocols that each master may follow to determine who has priority. In those systems in which a hardware device is used as a bus arbiter, any master wanting to use the bus must generate a bus request to the arbiter and must wait at least until the request is granted. After a grant is received, the device (bus master) can use the bus. The master then signals the arbiter when the bus is no longer required or, alternatively, when the present transaction is complete. Once the bus is no longer required, the arbiter may grant the bus to another master or device. Most multiple master buses have a set of bus lines between each master and the arbiter for performing the requests and grants. A bus release line is often utilized for each device that does not have its own request line. In some systems, the signals used for bus arbitration have physically separate lines, while in other systems the data and control lines on the bus are used for this function.
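The request, grant, and release sequence described above can be sketched as follows. The `BusArbiter` class and its method names are hypothetical, and the order in which waiting masters are served (order of arrival) is an assumption made only to keep the example short.

```python
class BusArbiter:
    """Illustrative arbiter: one owner at a time, waiters served in arrival order."""

    def __init__(self):
        self.owner = None
        self.waiting = []

    def request(self, master):
        """Return True if the bus is granted now, False if the master must wait."""
        if self.owner is None or self.owner == master:
            self.owner = master
            return True
        if master not in self.waiting:
            self.waiting.append(master)
        return False

    def release(self, master):
        """Signal that the bus is no longer required; return the next owner, if any."""
        if master != self.owner:
            raise ValueError("only the current owner may release the bus")
        self.owner = self.waiting.pop(0) if self.waiting else None
        return self.owner


arbiter = BusArbiter()
print(arbiter.request("cpu"))  # True: the bus was idle
print(arbiter.request("dma"))  # False: the bus is busy, so dma waits
print(arbiter.release("cpu"))  # dma: the grant passes to the waiting master
```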
Many arbitration schemes may be implemented to determine which bus master has priority. A simple approach for determining priority is to use a round-robin mechanism in which all masters are allowed to have the bus one after another in a fixed order. Another scheme may implement a First-In-First-Out (FIFO) protocol wherein the bus requests are merely queued and honored in the order received. In a more complex scheme, the priority of the communication may determine which master receives access to the bus. For example, certain types of transactions may be deemed to be higher priority than other transactions. But even in designs where priority is considered, the staleness of a request is often taken into account so that a low-priority request is not ignored, potentially forever.
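The three policies mentioned above can be sketched as follows. The function names, the master names, and in particular the linear aging formula (base priority plus waiting time multiplied by an `aging_rate`) are assumptions made for illustration; actual arbiters may weigh staleness quite differently.

```python
from collections import deque


def round_robin(masters, requesting, last_granted):
    """Grant the first requesting master after last_granted in the fixed order."""
    start = masters.index(last_granted) + 1 if last_granted in masters else 0
    for offset in range(len(masters)):
        candidate = masters[(start + offset) % len(masters)]
        if candidate in requesting:
            return candidate
    return None


def fifo(queue):
    """Grant the oldest queued request."""
    return queue.popleft() if queue else None


def priority_with_aging(requests, now, aging_rate=1):
    """Grant the request with the highest effective priority.

    Each request is (master, base_priority, request_time). Waiting raises the
    effective priority, so that a low-priority request is eventually served.
    """
    if not requests:
        return None
    return max(requests, key=lambda r: r[1] + aging_rate * (now - r[2]))[0]


masters = ["cpu0", "cpu1", "dma"]
print(round_robin(masters, {"cpu0", "dma"}, "cpu0"))  # dma
print(fifo(deque(["dma", "cpu1"])))                   # dma

requests = [("cpu0", 10, 5), ("dma", 2, 0)]
print(priority_with_aging(requests, now=5))           # cpu0
stale = [("cpu0", 10, 95), ("dma", 2, 0)]
print(priority_with_aging(stale, now=100))            # dma
```

In the last call, the low-priority request from `dma` has waited long enough that its effective priority exceeds that of the more recent high-priority request.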
Even with the various bus topologies, the fundamental limitation of the bus remains. As was stated before, the conventional wisdom is that the synchronous bus speed is set to match the speed of the slowest transaction that may occur on the bus. What is needed, therefore, is a bus protocol and design that facilitates increasing the speed of the bus while still respecting the restrictions imposed by the various devices coupled to communicate on the bus.