1. Technical Field of the Invention
The present invention relates generally to data communications and more particularly to high-speed wired data communications.
2. Description of Related Art
As is known, communication technologies that link electronic devices are many and varied, servicing communications via both physical media and wirelessly. Some communication technologies interface a pair of devices, other communication technologies interface small groups of devices, and still other communication technologies interface large groups of devices.
Examples of communication technologies that couple small groups of devices include buses within digital computers, e.g., PCI (peripheral component interface) bus, ISA (industry standard architecture) bus, an USB (universal serial bus), SPI (system packet interface) among others. One relatively new communication technology for coupling relatively small groups of devices is the HyperTransport (HT) technology, previously known as the Lightning Data Transport (LDT) technology (HyperTransport I/O Link Specification “HT Standard”). The HT Standard sets forth definitions for a high-speed, low-latency protocol that can interface with today's buses like AGP, PCI, SPI, 1394, USB 2.0, and 1 Gbit Ethernet as well as next generation buses including AGP 8×, Infiniband, PCI-X, PCI 3.0, and 10 Gbit Ethernet. HT interconnects provide high-speed data links between coupled devices. Most HT enabled devices include at least a pair of HT ports so that HT enabled devices may be daisy-chained. In an HT chain or fabric, each coupled device may communicate with each other coupled device using appropriate addressing and control. Examples of devices that may be HT chained include packet data routers, server computers, data storage devices, and other computer peripheral devices, among others.
Of these devices that may be HT chained together, many require significant processing capability and significant memory capacity. Thus, these devices typically include multiple processors and have a large amount of memory. While a device or group of devices having a large amount of memory and significant processing resources may be capable of performing a large number of tasks, significant operational difficulties exist in coordinating the operation of multiple processors. While each processor may be capable of executing a large number operations in a given time period, the operation of the processors must be coordinated and memory must be managed to assure coherency of cached copies. In a typical multi-processor installation, each processor typically includes a Level 1 (L1) cache coupled to a group of processors via a processor bus. The processor bus is most likely contained upon a printed circuit board. A Level 2 (L2) cache and a memory controller (that also couples to memory) also typically couples to the processor bus. Thus, each of the processors has access to the shared L2 cache and the memory controller and can snoop the processor bus for its cache coherency purposes. This multiprocessor installation (node) is generally accepted and functions well in many environments.
However, network switches and web servers often times require more processing and storage capacity than can be provided by a single small group of processors sharing a processor bus. Thus, in some installations, a plurality processor/memory groups (nodes) is sometimes contained in a single device. In these instances, the nodes may be rack mounted and may be coupled via a back plane of the rack. Unfortunately, while the sharing of memory by processors within a single node is a fairly straightforward task, the sharing of memory between nodes is a daunting task. Memory accesses between nodes are slow and severely degrade the performance of the installation. Many other shortcomings in the operation of multiple node systems also exist. These shortcomings relate to cache coherency operations, interrupt service operations, etc.
While HT links provide high-speed connectivity for the above-mentioned devices and in other applications, they are inherently inefficient in some ways. For example, in a “legal” HT chain, one HT enabled device serves as a host bridge while other HT enabled devices serve as dual link tunnels and a single HT enabled device sits at the end of the HT chain and serves as an end-of-chain device (also referred to as an HT “cave”). According to the HT Standard, all communications must flow through the host bridge, even if the communication is between two adjacent devices in the HT chain. Thus, if an end-of-chain HT device desires to communicate with an adjacent HT tunnel, its transmitted communications flow first upstream to the host bridge and then flow downstream from the host bridge to the adjacent destination device. Such communication routing, while allowing the HT chain to be well managed, reduces the overall throughput achievable by the HT chain, increases latency of operations, and reduces concurrency of transactions.
Applications, including the above-mentioned devices, that otherwise benefit from the speed advantages of the HT chain are hampered by the inherent delays and transaction routing limitations of current HT chain operations. Because all transactions are serviced by the host bridge and the host a limited number of transactions it can process at a given time, transaction latency is a significant issue for devices on the HT chain, particularly so for those devices residing at the far end of the HT chain, i.e., at or near the end-of-chain device. Further, because all communications serviced by the HT chain, both upstream and downstream, must share the bandwidth provided by the HT chain, the HT chain may have insufficient total capacity to simultaneously service all required transactions at their required bandwidth(s). Moreover, a limited number of transactions may be addressed at any time by any one device such as the host, e.g., 32 transactions (2**5). The host bridge is therefore limited in the number of transactions that it may have outstanding at any time and the host bridge may be unable to service all required transactions satisfactorily. Each of these operational limitations affects the ability of an HT chain to service the communications requirements of coupled devices.
Therefore, a need exists for smart routing of packets to facilitate direct peer-to-peer communications in a point-to-point link based system.