1. Field of the Invention
The present invention is directed in general to data communications. In one aspect, the present invention relates to a method and system for packet routing in high-speed data communication systems.
2. Related Art
As is known, communication technologies that link electronic devices are many and varied, servicing communications over both physical media and wireless links. Some communication technologies interface a pair of devices, other communication technologies interface small groups of devices, and still other communication technologies interface large groups of devices.
Examples of communication technologies that couple small groups of devices include buses within digital computers, e.g., PCI (peripheral component interconnect) bus, ISA (industry standard architecture) bus, USB (universal serial bus), and SPI (system packet interface). One relatively new communication technology for coupling relatively small groups of devices is the HyperTransport (HT) technology, previously known as the Lightning Data Transport technology (see the HyperTransport I/O Link Specification, the “HT Standard”). The HT Standard sets forth definitions for a high-speed, low-latency protocol that can interface with today's buses, such as AGP, PCI, SPI, 1394, USB 2.0, and 1 Gbit Ethernet, as well as next-generation buses, including AGP 8x, InfiniBand, PCI-X, PCI 3.0, and 10 Gbit Ethernet. HT interconnects provide high-speed data links between coupled devices. Most HT enabled devices include at least a pair of HT ports so that HT enabled devices may be daisy-chained. In an HT chain or fabric, each coupled device may communicate with each other coupled device using appropriate addressing and control. Examples of devices that may be HT chained include packet data routers, server computers, data storage devices, and other computer peripheral devices, among others.
Of the devices that may be HT chained together, many require significant processing capability and significant memory capacity. While a device or group of devices having a large amount of memory and significant processing resources may be capable of performing a large number of tasks, significant operational difficulties exist in coordinating the operation of multiple processors. For example, while each processor may be capable of executing a large number of operations in a given time period, the operation of the processors must be coordinated and memory must be managed to assure coherency of cached copies. In a typical multi-processor installation, each processor includes a Level 1 (L1) cache and is coupled to the other processors of the group via a processor bus. The processor bus is most likely contained upon a printed circuit board. A Level 2 (L2) cache and a memory controller (that also couples to memory) also typically couple to the processor bus. Thus, each of the processors has access to the shared L2 cache and the memory controller and can snoop the processor bus for its cache coherency purposes. This multi-processor installation (node) is generally accepted and functions well in many environments.
Because network switches and web servers often require more processing and storage capacity than can be provided by a single small group of processors sharing a processor bus, in some installations multiple processor/memory groups (nodes) are contained in a single device. In these instances, the nodes may be rack mounted and may be coupled via a back plane of the rack. Unfortunately, while the sharing of memory by processors within a single node is a fairly straightforward task, the sharing of memory between nodes is a daunting task. Memory accesses between nodes are slow and severely degrade the performance of the installation. Many other shortcomings in the operation of multiple node systems also exist. These shortcomings relate to cache coherency operations, interrupt service operations, etc.
An additional challenge for multiprocessor configurations is the routing of packet data within the multiprocessor devices. For example, routing information for an incoming packet must be calculated upon reception to determine whether the packet destination is within that device or whether the packet is to be transmitted to another node coupled thereto. Conventional approaches for making routing calculations have required hardwired ASIC circuits, or have been implemented as regular network processors that require the (local) processor(s) to make state machine type determinations for every packet routing decision. In addition to consuming processor resources, the state machine approach can also impose significant buffer storage requirements to hold the packet while the routing decision is being made, especially where a subsequent state relies on a prior packet bit.
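The state machine approach described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only and is not taken from the HT Standard or from any particular network processor: the function name `route_packet`, the `RouteDecision` type, and the hypothetical header layout (byte 0 is a header type; for type 0 the destination node identifier is byte 1; for any other type, byte 1 gives an extension length and the destination follows the extension). The sketch shows why the position of the destination field can depend on an earlier byte, and how many bytes must be held in a buffer before the local-versus-forward decision can be made.

```python
from typing import NamedTuple, Optional


class RouteDecision(NamedTuple):
    is_local: bool              # True if the packet terminates at this node
    forward_to: Optional[int]   # destination node id when not local
    bytes_buffered: int         # bytes held before the decision was possible


def route_packet(packet: bytes, local_node_id: int) -> RouteDecision:
    """Walk a hypothetical header one byte at a time, as a state machine would."""
    state = "READ_TYPE"
    dest_offset = 0
    buffered = 0
    for index, byte in enumerate(packet):
        buffered += 1  # every byte seen so far must be stored until routing is known
        if state == "READ_TYPE":
            if byte == 0:
                dest_offset = 1          # short header: destination is the next byte
                state = "SEEK_DEST"
            else:
                state = "READ_EXT_LEN"   # long header: next byte is extension length
        elif state == "READ_EXT_LEN":
            dest_offset = 2 + byte       # destination follows the extension bytes
            state = "SEEK_DEST"
        elif state == "SEEK_DEST" and index == dest_offset:
            if byte == local_node_id:
                return RouteDecision(True, None, buffered)
            return RouteDecision(False, byte, buffered)
    raise ValueError("packet ended before the destination field was seen")


# Usage: node 2 receives one short-header and one long-header packet.
short = route_packet(bytes([0, 2, 0xAA]), local_node_id=2)
long_ = route_packet(bytes([1, 3, 9, 9, 9, 7, 0xAA]), local_node_id=2)
```

In this sketch the long-header packet forces six bytes to be buffered before the decision is available, whereas the short-header packet needs only two, which mirrors the buffering cost noted above when a later state depends on an earlier packet field.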
In addition to the foregoing challenges, an HT enabled device that is incorporated into a system (e.g., an HT enabled server, router, etc. that is incorporated into a circuit-switched system or packet-switched system) must often interface with a legacy device that uses an older communication protocol. For example, if a line card were developed with HT ports, the line card would need to communicate with legacy line cards that include SPI ports. Also, where multiple HT enabled nodes are connected through an external HT switch, the routing function can be impeded where the switch disregards packet information.
Therefore, a need exists for methods and/or apparatuses for interfacing devices with an efficient routing scheme while overcoming the bandwidth limitations, latency limitations, limited concurrency, and other limitations associated with the use of a high-speed chain of linked nodes. Further limitations and disadvantages of conventional systems will become apparent to one of skill in the art after reviewing the remainder of the present application with reference to the drawings and detailed description which follow.