1. Field of the Invention
This invention relates to computer and network systems and, more particularly, to routing packets in a computer or network system.
2. Description of the Related Art
Generally, personal computers (PCs) and other types of computer systems have been designed around a shared bus system for accessing memory. One or more processors and one or more input/output (I/O) devices are coupled to memory through the shared bus. The I/O devices may be coupled to the shared bus through an I/O bridge that manages the transfer of information between the shared bus and the I/O devices, and processors are typically coupled directly to the shared bus or are coupled through a cache hierarchy to the shared bus.
Unfortunately, shared bus systems may experience several drawbacks. For example, since there are multiple devices attached to the shared bus, the bus is typically operated at a relatively low frequency. The multiple attachments present a high capacitive load to a device driving a signal on the bus, and the multiple attach points present a relatively complicated transmission line model for high frequencies. Accordingly, the frequency remains low, and thus the bandwidth available on the shared bus is relatively low. The low bandwidth presents a barrier to attaching additional devices to the shared bus, since additional devices may negatively impact performance.
Another disadvantage of the shared bus system is a lack of scalability to larger numbers of devices. As mentioned above, the amount of bandwidth is fixed (and may decrease if adding additional devices reduces the operable frequency of the bus). Once the bandwidth requirements of the devices attached to the bus (either directly or indirectly) exceed the available bandwidth of the bus, devices will frequently be stalled when attempting access to the bus. As a result, overall performance may be decreased.
One or more of the above problems may be addressed by using a distributed memory system. A computer system employing a distributed memory system includes multiple nodes. Two or more of the nodes are connected to memory, and the nodes are interconnected using any suitable interconnect. For example, each node may be connected to each other node using dedicated lines. Alternatively, each node may connect to a fixed number of other nodes, and transactions may be routed from a first node to a second node to which the first node is not directly connected via one or more intermediate nodes. The memory address space is assigned across the memories in each node.
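By way of illustration only, the following Python sketch shows one way a memory address space might be assigned across nodes and how a transaction might be routed through intermediate nodes. The four-node ring, the 1 GB of memory per node, and the names `home_node`, `next_hop` and `path` are assumptions made for this example and are not taken from any particular system.

```python
# Hypothetical example: 4 nodes connected in a ring, each owning a contiguous
# 1 GB slice of the address space. All sizes and names are assumptions.
NODE_COUNT = 4
BYTES_PER_NODE = 1 << 30


def home_node(address):
    """Return the node whose memory holds the given address."""
    if not 0 <= address < NODE_COUNT * BYTES_PER_NODE:
        raise ValueError("address outside the mapped space")
    return address // BYTES_PER_NODE


def next_hop(current, destination):
    """Pick the ring neighbor that is closer to the destination."""
    forward = (destination - current) % NODE_COUNT
    backward = (current - destination) % NODE_COUNT
    step = 1 if forward <= backward else -1
    return (current + step) % NODE_COUNT


def path(source, destination):
    """List the nodes a transaction visits, including source and destination."""
    hops = [source]
    while hops[-1] != destination:
        hops.append(next_hop(hops[-1], destination))
    return hops
```

In this sketch, a transaction from node 0 to an address owned by node 2 passes through node 1 as an intermediate node, while a transaction to node 3 is delivered directly.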
Generally, a “node” is a device which is capable of participating in transactions upon the interconnect. For example, in a packet-based interconnect the node may be configured to receive and transmit packets to other nodes. One or more packets may be employed to perform a particular transaction. A particular node may be a destination for a packet, in which case the information is accepted by the node and processed internally in the node. Alternatively, the particular node may be used to relay a packet from a source node to a destination node if the particular node is not the destination node of the packet.
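The two roles of a node described above (destination or relay) may be sketched as follows. This is a minimal model, and the `Packet` and `Node` classes and their fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    source: int
    destination: int
    payload: str


@dataclass
class Node:
    node_id: int
    accepted: list = field(default_factory=list)   # packets consumed by this node
    forwarded: list = field(default_factory=list)  # packets relayed toward another node

    def receive(self, packet):
        """Accept the packet if it is addressed to this node, otherwise relay it."""
        if packet.destination == self.node_id:
            self.accepted.append(packet)
            return "accepted"
        self.forwarded.append(packet)
        return "forwarded"
```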
Distributed memory systems present design challenges that differ from the challenges in shared bus systems. For example, shared bus systems regulate the initiation of transactions through bus arbitration. Accordingly, a fair arbitration algorithm allows each bus participant the opportunity to initiate transactions. The order of transactions on the bus may represent the order that transactions are performed (e.g., for coherency purposes). On the other hand, in distributed memory systems, nodes may initiate transactions concurrently and use the interconnect to transmit the transactions to other nodes. These transactions may have logical conflicts between them (e.g., coherency conflicts for transactions to the same address) and may experience resource conflicts (e.g., buffer space may not be available in various nodes) since no central mechanism for regulating the initiation of transactions is provided. Accordingly, it is more difficult to ensure that information continues to propagate among the nodes smoothly and that deadlock situations (in which no transactions are completed due to conflicts between the transactions) are avoided.
By employing virtual channels and allocating different resources to the virtual channels, conflicts may be reduced. Generally speaking, a “virtual channel” is a communication path for initiating transactions (e.g., by transmitting packets containing commands) between various processing nodes. Each virtual channel may be resource-independent of the other virtual channels (i.e., packets flowing in one virtual channel are generally not affected, in terms of physical transmission, by the presence or absence of packets in another virtual channel). Packets that do not have logical/protocol-related conflicts may be grouped into a virtual channel. For example, packets may be assigned to a virtual channel based upon packet type. Packets in the same virtual channel may physically conflict with each other's transmission (i.e., packets in the same virtual channel may experience resource conflicts), but may not physically conflict with the transmission of packets in a different virtual channel (by virtue of the virtual channels being resource-independent of each other). Accordingly, logical conflicts occur between packets in separate virtual channels. Since packets that may experience resource conflicts do not experience logical conflicts and packets which may experience logical conflicts do not experience resource conflicts, deadlock-free operation may be achieved.
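As a rough sketch of resource independence, the following Python fragment assigns packets to virtual channels by packet type and gives each virtual channel its own buffer. The packet types, the channel names, and the `VirtualChannelBuffers` class are assumptions introduced for this example only.

```python
from collections import deque

# Hypothetical mapping of packet type to virtual channel (an assumption).
CHANNEL_FOR_TYPE = {
    "read_request": "request",
    "write_request": "request",
    "read_response": "response",
}


class VirtualChannelBuffers:
    """One independent buffer of fixed depth per virtual channel."""

    def __init__(self, depth):
        self.depth = depth
        self.queues = {vc: deque() for vc in set(CHANNEL_FOR_TYPE.values())}

    def try_enqueue(self, packet_type, payload):
        """Buffer the packet in its own channel; return False if that channel is full."""
        queue = self.queues[CHANNEL_FOR_TYPE[packet_type]]
        if len(queue) >= self.depth:
            return False
        queue.append(payload)
        return True
```

In this model, filling the request channel's buffer causes further requests to be refused (a resource conflict within one virtual channel), while a response is still accepted because it uses a separate buffer.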
In order to avoid deadlock, virtual channels may need to be able to make progress independently. If not, both logical and resource conflicts may arise between packets, providing an opportunity for deadlock. For example, assume a first virtual channel includes packets that contain requests for data from a memory controller. In order to process each request, the memory controller may send responses to the requests in a second virtual channel. In order to avoid deadlock, neither the first nor the second virtual channel should be able to block the other. However, the first virtual channel may be blocked if packets are unable to progress because the memory controller's queue (for receiving packets in the first virtual channel) is full. In order to process the first packet in the first virtual channel's queue in the memory controller, thus freeing up room in the queue to accept more packets from the first virtual channel, a response to the request in the first packet may need to be sent in the second virtual channel. If responses cannot be sent in the second virtual channel due to the blocked first virtual channel, deadlock may arise.
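The scenario above may be illustrated with a deliberately simplified simulation. It assumes a single relay buffer in front of a memory controller; when `shared_buffer` is true, requests and responses compete for the same buffer entries, and otherwise each virtual channel has its own buffer. The function name `simulate`, the buffer depths, and the step ordering are all assumptions of this sketch and do not describe any actual design.

```python
from collections import deque


def simulate(shared_buffer, pending=6, queue_depth=2, buffer_depth=2):
    """Return how many requests complete before no further progress is possible."""
    controller_queue = deque()  # requests waiting inside the memory controller
    request_buf = deque()       # relay buffer used by the request channel
    # With shared_buffer=True, responses use the very same buffer entries.
    response_buf = request_buf if shared_buffer else deque()
    completed = 0
    while True:
        progressed = False
        # Deliver a response that has reached the head of its buffer.
        if response_buf and response_buf[0] == "rsp":
            response_buf.popleft()
            completed += 1
            progressed = True
        # Hand the head request to the controller if its queue has room.
        if (request_buf and request_buf[0] == "req"
                and len(controller_queue) < queue_depth):
            controller_queue.append(request_buf.popleft())
            progressed = True
        # Requesters inject new requests whenever the request buffer has room.
        while pending and len(request_buf) < buffer_depth:
            request_buf.append("req")
            pending -= 1
            progressed = True
        # The controller can retire a request only if its response can be buffered.
        if controller_queue and len(response_buf) < buffer_depth:
            controller_queue.popleft()
            response_buf.append("rsp")
            progressed = True
        if not progressed:
            return completed
```

Under these assumptions, the shared-buffer case stops with no requests completed, because the buffer fills with requests and leaves no entry for a response. The case with independent buffers completes all six requests.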
In addition to deadlock, another concern that may arise when implementing a packet-based system using virtual channels is starvation. Starvation may occur if one interface (e.g., an interface to another node or to a device, like a memory controller, that is internal to a node) in a node is unable to obtain a share of the available bandwidth in a particular virtual channel. System performance is another concern. Generally, it may be preferable to route “older” packets (i.e., those that have been waiting to be routed for a longer amount of time) before “newer” packets. Physically routing the interconnections between the interfaces and devices within a node may present an additional problem. In general, it may be preferable to have fewer and shorter interconnections. However, each interface within a node may need to be able to send packets in each virtual channel to, and receive packets in each virtual channel from, each other interface in the node. Providing this capability may lead to complex physical interconnections within the node.
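One possible way to favor older packets, shown purely as a sketch, is to compare the arrival time of the packet at the head of each interface's queue for a given virtual channel and route the oldest one first. The function `pick_oldest`, the interface names, and the `arrival` field are hypothetical.

```python
from collections import deque


def pick_oldest(interface_queues):
    """Remove and return (interface, packet) for the longest-waiting head packet.

    interface_queues maps an interface name to a deque of packets, each a dict
    with an 'arrival' timestamp. Returns None when every queue is empty.
    Ties are broken by interface name.
    """
    candidates = [(queue[0]["arrival"], name)
                  for name, queue in interface_queues.items() if queue]
    if not candidates:
        return None
    _, name = min(candidates)
    return name, interface_queues[name].popleft()
```

Because the selection depends on waiting time rather than on a fixed interface priority, a packet waiting at any interface eventually becomes the oldest candidate, which may also help limit starvation in this simplified model.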