1. Field of the Invention
The present invention relates generally to the allocation of communication bandwidth to a plurality of devices coupled to a communication medium and, more particularly, to a fair bandwidth allocation scheme applicable to devices interconnected in daisy-chain fashion via a plurality of point-to-point communication links.
2. Background of the Related Art
This section is intended to introduce the reader to various aspects of art which may be related to various aspects of the present invention which are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present invention. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
Many computer systems have been designed around a shared bus architecture, in which one or more host processors and a host memory are coupled to a shared host bus. Transactions between processors and accesses to memory all occur on the shared bus. Such computer systems typically include an input/output (I/O) subsystem which is coupled to the shared host bus via an I/O bridge which manages information transfer between the I/O subsystem and the devices coupled to the shared host bus. Many I/O subsystems also follow a shared bus architecture, in which a plurality of I/O or peripheral devices are coupled to a shared I/O bus. The I/O subsystem may include several branches of shared I/O buses interconnected via additional I/O bridges.
Such shared bus architectures have several advantages. For example, because the bus is shared, each of the devices coupled to the shared bus is aware of all transactions occurring on the bus. Thus, transaction ordering and memory coherency are easily managed. Further, arbitration among devices requesting access to the shared bus can be simply managed by a central arbiter coupled to the bus. For example, the central arbiter may implement an allocation algorithm to ensure that each device is fairly allocated bus bandwidth according to a predetermined priority scheme. Such a priority algorithm may be a "round-robin" algorithm that provides equal bandwidth to each of the devices requesting access to the shared bus.
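By way of illustration only, the round-robin arbitration mentioned above can be sketched as follows. The function name `round_robin_grants`, the integer device numbering, and the one-grant-per-cycle model are assumptions made for this sketch and are not taken from any particular bus specification.

```python
from collections import Counter

def round_robin_grants(requesting, num_devices, cycles):
    """Grant the bus once per cycle, rotating priority past the last winner.

    `requesting` is the set of device numbers that want the bus on every
    cycle (a simplifying assumption for this sketch).
    """
    grants = []
    last = num_devices - 1  # so device 0 has top priority on the first cycle
    for _ in range(cycles):
        # Search the devices in order, starting just after the last winner.
        for offset in range(1, num_devices + 1):
            candidate = (last + offset) % num_devices
            if candidate in requesting:
                grants.append(candidate)
                last = candidate
                break
    return grants

# Devices 0, 2 and 3 request continuously; device 1 is idle.
grants = round_robin_grants(requesting={0, 2, 3}, num_devices=4, cycles=9)
share = Counter(grants)
print(grants)  # [0, 2, 3, 0, 2, 3, 0, 2, 3]
print(share)   # each requesting device is granted 3 of the 9 cycles
```

Because the central arbiter sees every request, equal sharing among the requesting devices follows directly from the rotation.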
Shared buses, however, also have several disadvantages. For example, the multiple attach points of the devices coupled to the shared bus produce signal reflections at high signal frequencies which reduce signal integrity. As a result, signal frequencies on the bus are generally kept relatively low to maintain signal integrity at an acceptable level. The relatively low signal frequencies reduce signal bandwidth, limiting the performance of devices attached to the bus. Further, the multiple devices attached to the shared bus present a relatively large electrical capacitance to devices driving signals on the bus, thus limiting the speed of the bus. The speed of the bus also is limited by the length of the bus, the amount of branching on the bus, and the need to allow turnaround cycles on the bus. Accordingly, attaining very high bus speeds (e.g., 500 MHz and greater) is difficult in more complex shared bus systems.
The problems associated with the speed performance of a shared bus system may be addressed by implementing the bus as a packet-based bidirectional communication link comprising a plurality of sets of unidirectional point-to-point links. Each set of unidirectional links interconnects two devices such that multiple devices can be connected in daisy-chain fashion. In such an I/O subsystem, a daisy chain of I/O devices can be connected to the host subsystem through a host bridge. The host bridge is connected to the first device in the daisy chain via a set of unidirectional links. The first device functions as a forwarding device (i.e., a repeater) to relay packets received from the host bridge to the next device, and so on down the chain of devices. Similarly, each device can forward packets received from other devices up the chain to the host bridge. In addition to forwarding packets, each device also can issue its own packets into the stream of forwarded packets.
Although the daisy-chain architecture addresses the speed issues associated with a shared bus, special care should be taken in implementing the bus as a series of point-to-point links to ensure that features available in shared bus architectures also are available in the daisy-chain architecture. For example, in a shared bus system, only one device at a time can drive communication packets onto the bus. Thus, transaction ordering is controlled by the order in which the device issuing the packet gains access to the bus. In the shared bus system, all devices can view all transactions on the bus, and thus the devices can be configured to agree upon ordering. In the daisy-chain configuration, however, transactions directed from a first device to a second device cannot be viewed by any other device that is not positioned between the first and second devices in the chain. Accordingly, a transaction management and control scheme should be provided to ensure the appropriate ordering of transactions in the daisy-chained I/O subsystem. For example, to ensure ordering in a daisy-chain system, direct peer-to-peer communications may be prohibited. Instead, all packets may be forced to travel through a common entity (e.g., a host bridge at one end of the chain), which assumes control of ordering issues.
In addition to ordering issues, the daisy-chain architecture offers challenges in ensuring fair allocation of bus bandwidth to the devices connected to the daisy chain. As discussed above, in the shared bus system, bandwidth typically is allocated by a central arbiter coupled to the shared bus. The central arbiter implements an allocation algorithm that balances available bandwidth among devices which are currently requesting access to the bus. In the daisy-chain environment, however, it is not possible to provide a central arbiter and, thus, bus arbitration is distributed among all the devices connected in the chain. Further, if the ordering scheme dictates that all packets should be routed through a bridge device, then devices both forward packets received from other devices and insert locally generated packets onto one of the point-to-point links in a direction toward the bridge device. In a system implementing such an ordering scheme, the allocation of bandwidth must take into account the number of local packets a particular device may insert relative to the number of received packets the device forwards. The ratio of inserted packets to forwarded packets is referred to as the "insertion rate" of a particular device. Because the devices are connected in a daisy chain, the ratio of local packets to forwarded packets at any one device may vary considerably depending on the device's position in the chain.
In the daisy-chain environment, each device sees a transmit bandwidth determined by flow control from the next device in the chain. However, each device is left to independently determine its own insertion rate within its transmit bandwidth. That is, each device independently allocates its transmit bandwidth between received packets the device is forwarding and locally generated packets the device is inserting in a stream of packets on a particular point-to-point link. If each device is allowed to insert packets at will, such an allocation scheme ultimately leads to marked and unpredictable imbalances in bandwidth allocation among the devices, as well as potential stalls of requests issued by devices. The imbalances can be particularly pronounced in systems having a large number of daisy-chained devices.
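The imbalance described above can be illustrated with a small worked example. The sketch below assumes, purely for illustration, that every device always has local packets pending and naively alternates one-for-one between forwarding a received packet and inserting a local packet. The function name `naive_shares` and the numbering of devices (index 0 nearest the host bridge) are hypothetical.

```python
from fractions import Fraction

def naive_shares(num_devices):
    """Share of the link into the host bridge obtained by each device.

    Assumes every device splits its transmit bandwidth 1:1 between
    forwarded and local packets. Index 0 is the device nearest the bridge.
    """
    # The device farthest from the bridge has nothing to forward, so all
    # of its transmit bandwidth carries its own packets.
    shares = {num_devices - 1: Fraction(1)}
    for dev in range(num_devices - 2, -1, -1):
        # Forwarded traffic is squeezed into half of the outgoing link...
        shares = {src: s / 2 for src, s in shares.items()}
        # ...and the other half goes to this device's local packets.
        shares[dev] = Fraction(1, 2)
    return [shares[d] for d in range(num_devices)]

shares = naive_shares(4)
print(shares)  # [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
```

Under these assumptions, the device nearest the bridge obtains half of the bandwidth, while each additional hop halves the share of the devices behind it, which is why the imbalance grows with the length of the chain.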
The problems of the distributed allocation scheme may be addressed by implementing a static insertion rate allocation scheme. That is, each device may be assigned a fixed insertion rate based on preconceived assumptions about device communication patterns. However, such an a priori allocation scheme also may result in non-optimal usage of bandwidth, because the static rate allocation does not allow the devices to adapt to changes in communication patterns.
Accordingly, it would be desirable to provide a bandwidth allocation scheme that results in a fair, or balanced, allocation of bandwidth to the I/O devices connected in a daisy chain. It would further be desirable if such an allocation scheme could dynamically adapt to changes in communication traffic patterns to ensure a more optimal usage of available bandwidth.
The present invention may be directed to one or more of the problems set forth above.
Certain aspects commensurate in scope with the originally claimed invention are set forth below. It should be understood that these aspects are presented merely to provide the reader with a brief summary of certain forms the invention might take and that these aspects are not intended to limit the scope of the invention. Indeed, the invention may encompass a variety of aspects that may not be set forth below.
In accordance with one aspect of the present invention, there is provided a method of allocating bandwidth to a plurality of devices transmitting packets on a communication link implemented as a plurality of point-to-point links. The method comprises the act of determining, at a first device of the plurality of devices, a packet issue rate of a second device connected to the communication link. The packet issue rate corresponds to a number of second local packets generated by the second device and received by the first device on a first point-to-point link relative to a total number of received packets received by the first device on the first point-to-point link, wherein the received packets include the second local packets. The method also comprises the act of matching an insertion rate of the first device to the packet issue rate, wherein the insertion rate corresponds to insertion of first local packets generated by the first device relative to forwarding of the received packets onto a second point-to-point link.
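The two acts of this aspect, determining a packet issue rate and matching an insertion rate to it, may be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function names, the use of string labels to identify the device that generated each packet, and the credit counter used to pace insertions are all assumptions of the sketch.

```python
from fractions import Fraction

def packet_issue_rate(received, source_id):
    """Fraction of the packets in `received` that were generated by `source_id`."""
    if not received:
        return Fraction(0)
    return Fraction(sum(1 for src in received if src == source_id), len(received))

def transmit_with_matched_rate(received, local_id, insertion_rate):
    """Forward `received`, inserting `insertion_rate` local packets per forwarded packet."""
    out, credit = [], Fraction(0)
    for src in received:
        out.append(src)           # forward the received packet
        credit += insertion_rate  # earn insertion credit for each forward
        while credit >= 1:
            out.append(local_id)  # insert one local packet
            credit -= 1
    return out

# Hypothetical first device "A" receives packets generated by "B" and "C".
received = ["B", "C", "C"] * 4
rate = packet_issue_rate(received, "B")  # 4 of 12 received packets
out = transmit_with_matched_rate(received, "A", rate)
print(rate)                            # 1/3
print(out.count("A"), out.count("B"))  # 4 4
```

In this example the first device inserts one local packet for every three packets it forwards, so that its own packets and those of the second device appear equally often on the second point-to-point link.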
In one embodiment of the invention, each of the plurality of devices is configured to determine the highest packet issue rate of any of the devices which are downstream of that particular device. Each of the plurality of devices matches its insertion rate to the highest packet issue rate determined by that device.
In accordance with another aspect of the present invention, there is provided a method of allocating bandwidth to a plurality of devices transmitting packets on a communication link implemented as a plurality of point-to-point links. The method comprises monitoring a flow of packets at a first device of the plurality of devices, and associating each of the packets in the monitored flow with a respective device of the plurality of devices. Each of the plurality of devices is configured to transmit, onto one of the point-to-point links, local packets associated with that device and received packets associated with other devices. Each device has a respective packet issue rate which corresponds to the number of local packets associated with that device relative to the total number of received packets in the monitored flow of packets received at the first device. The method also comprises determining a highest packet issue rate of the respective packet issue rates.
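One possible way to picture the monitoring and association acts is a set of per-device counters, as in the following sketch. It assumes that each packet carries an identifier of the device that generated it. The `FlowMonitor` class, its method names, and the integer identifiers are hypothetical and are introduced only for illustration.

```python
from collections import Counter
from fractions import Fraction

class FlowMonitor:
    """Counts the packets in a monitored flow by originating device."""

    def __init__(self):
        self.per_device = Counter()
        self.total = 0

    def observe(self, source_id):
        """Associate one received packet with the device that generated it."""
        self.per_device[source_id] += 1
        self.total += 1

    def issue_rates(self):
        """Packet issue rate of each device relative to all received packets."""
        return {dev: Fraction(n, self.total) for dev, n in self.per_device.items()}

    def highest_issue_rate(self):
        """Highest of the respective packet issue rates (0 if nothing was seen)."""
        return max(self.issue_rates().values(), default=Fraction(0))

monitor = FlowMonitor()
for source_id in [3, 2, 3, 1, 3, 2, 3, 1]:
    monitor.observe(source_id)

print(monitor.issue_rates())         # device 3: 1/2, devices 1 and 2: 1/4 each
print(monitor.highest_issue_rate())  # 1/2
```

A practical implementation would likely restart or age the counters over a bounded window so that the rates track changes in traffic. The sketch omits this for brevity.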
In accordance with still another aspect of the present invention, there is provided a device configured to transmit packets onto a communication link implemented as a plurality of point-to-point links. The device comprises a first interface to receive packets from a first point-to-point link which are transmitted by at least one other device connected to the communication link. The device also comprises a second interface to transmit packets onto a second point-to-point link, the transmitted packets comprising local packets generated by the device and received packets forwarded by the device. The device also includes allocation logic operably coupled to the first interface and the second interface. The allocation logic is configured to monitor a flow of packets received from the first point-to-point link, and to determine, based on the monitored flow, a device insertion rate for transmitting local packets relative to forwarding received packets onto the second point-to-point link.
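The arrangement of the two interfaces and the allocation logic might be modeled as in the sketch below, which also chains several such devices to show the resulting packet mix. The sketch ignores flow control and timing and considers only the proportion of packets from each source. The `Device` class, the `run_chain` helper, and the choice to match the highest observed issue rate are assumptions of this illustration.

```python
from collections import Counter
from fractions import Fraction

class Device:
    """Hypothetical model: receive interface, allocation logic, transmit interface."""

    def __init__(self, device_id):
        self.device_id = device_id
        self.seen = Counter()  # allocation logic state: packets seen per source

    def receive(self, packets):
        """First interface: accept packets and count them by originating device."""
        self.seen.update(packets)
        return list(packets)

    def insertion_rate(self):
        """Allocation logic: match the highest issue rate in the monitored flow."""
        total = sum(self.seen.values())
        return Fraction(max(self.seen.values()), total) if total else Fraction(0)

    def transmit(self, received):
        """Second interface: forward packets, inserting local ones at the device rate."""
        rate = self.insertion_rate()
        out, credit = [], Fraction(0)
        for source_id in received:
            out.append(source_id)
            credit += rate
            while credit >= 1:
                out.append(self.device_id)
                credit -= 1
        return out

def run_chain(num_devices, seed_packets):
    """Packets reaching the host bridge, counted by source (device 0 is nearest)."""
    # The farthest device has nothing to forward, so it sends only local packets.
    stream = [num_devices - 1] * seed_packets
    for device_id in range(num_devices - 2, -1, -1):
        device = Device(device_id)
        stream = device.transmit(device.receive(stream))
    return Counter(stream)

result = run_chain(num_devices=4, seed_packets=12)
print(result)  # each of the four devices contributes 12 of the 48 packets
```

Under these simplifying assumptions, each device in the chain obtains an equal share of the packets arriving at the host bridge, in contrast to the unequal shares that result when devices insert packets at will.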