There has been explosive growth in Internet traffic due to the increased number of Internet users, the varied service demands of those users, the implementation of new services, such as voice-over-IP (VoIP) and streaming applications, and the development of mobile Internet. Conventional routers, which act as relaying nodes connected to sub-networks or other routers, performed their roles well as long as the time required to process packets, determine their destinations, and forward the packets to those destinations was usually smaller than the transmission time on network paths. More recently, however, the packet transmission capabilities of high-bandwidth network paths and the increases in Internet traffic have combined to outpace the processing capacities of conventional routers.
This has led to the development of massively parallel, distributed architecture routers. A distributed architecture router typically comprises a large number of routing nodes that are coupled to each other via a plurality of switch fabric modules and an optional crossbar switch. Each routing node has its own routing (or forwarding) table for forwarding data packets via other routing nodes to a destination address.
The Applicants have filed a number of patent applications related to a massively parallel, distributed architecture router in which each of the multiple routing nodes uses two processors—an inbound network processor and an outbound network processor—to forward data packets. The inbound network processor receives data packets from external devices and forwards the received data packets to other routing nodes via the switch fabric and crossbar switch. The outbound network processor receives data packets from the switch fabric and crossbar switch and forwards the received data packets to an external device.
The disclosed inbound and outbound network processors comprise multiple microengines that perform route searches in a shared forwarding table. In an exemplary embodiment, each inbound or outbound network processor comprises a control plane processor (e.g., XScale core processor (XCP)) operating in the control plane and sixteen (16) microengines that route data packets in the data plane. In such an embodiment, the control plane processors of the inbound and outbound network processors perform control plane communications primarily using Local Processor Communications (LPC) over a PCI bus. Also, mechanisms are available inside each network processor to provide internal communications among microengines and control plane processors inside the same network processor.
However, as the data plane functionality becomes distributed between inbound and outbound network processors using shared resources (e.g., a shared forwarding table), problems arise with respect to coordinating the allocation and use of these shared resources, as well as synchronizing or coordinating processing across processor boundaries. This situation is further complicated by the fact that there is no mechanism that allows microengines in different network processors to communicate directly with each other.
There are two indirect methods for providing some amount of communication between microengines in different network processors. In one method, an originating microengine sends a message to the control plane processor in the same network processor for delivery via LPC to the control plane processor of the destination network processor and subsequent delivery to the terminating microengine. In the other method, an originating microengine packetizes the message and hairpins the packetized message to the destination network processor in the data plane. A microengine in the destination network processor recognizes the message as local and delivers it to the destination microengine. However, both of these methods are highly inefficient, because each message must be relayed through intermediate processors or carried as a data packet rather than passed directly between the two microengines.
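The two indirect methods can be compared by listing the stages a message passes through. The following is only an illustrative sketch: the function names, the stage labels, and the network processor names are hypothetical, and the stage lists simply restate the description above rather than any actual implementation.

```python
def control_plane_path(src_np: str, dst_np: str) -> list[str]:
    """Stages for the first method: relay through both control plane processors."""
    return [
        f"{src_np}: originating microengine",
        f"{src_np}: control plane processor",
        "LPC over PCI bus",
        f"{dst_np}: control plane processor",
        f"{dst_np}: terminating microengine",
    ]


def data_plane_hairpin_path(src_np: str, dst_np: str) -> list[str]:
    """Stages for the second method: packetize the message and hairpin it."""
    return [
        f"{src_np}: originating microengine (packetizes the message)",
        "data plane hairpin",
        f"{dst_np}: receiving microengine (recognizes the message as local)",
        f"{dst_np}: destination microengine",
    ]


def intermediate_stages(path: list[str]) -> int:
    """Count every stage between the originating and destination microengines."""
    return len(path) - 2


# Hypothetical names for the two network processors of one routing node.
cp_path = control_plane_path("inbound NP", "outbound NP")
dp_path = data_plane_hairpin_path("inbound NP", "outbound NP")
cp_overhead = intermediate_stages(cp_path)
dp_overhead = intermediate_stages(dp_path)
```

In this sketch neither path has zero intermediate stages, which is the inefficiency the text describes: no route connects the two microengines directly.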
Moreover, in the previous patent applications filed by the Applicants, the disclosed inbound network processor had only receive interfaces on the external network side and only transmit interfaces on the switch fabric side. Similarly, in the previous patent applications filed by the Applicants, the disclosed outbound network processor had only transmit interfaces on the external network side and only receive interfaces on the switch fabric side.
U.S. patent application Ser. No. 10/665,832, filed on Sep. 19, 2003, entitled “Apparatus and Method for Hairpinning Data Packets in an Ethernet MAC Chip”, disclosed a mechanism for transferring a data packet from an inbound network processor directly to the outbound network processor within the same routing node without using the switch fabric and crossbar switch. If a routing node receives a data packet from an external source device, and both the source device and the destination device are coupled to the same routing node, there is no need to transfer the data packet through the switch fabric and crossbar switch. Instead, the microengines of the network processor simply transmit the received data packet back out to the external network (i.e., like a “hairpin” turn) without using the switch fabric.
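The hairpinning decision described above reduces to a comparison between the routing node that received the packet and the routing node that owns the destination. A minimal sketch follows, assuming a hypothetical `Route` record and a hypothetical `forward_inbound` function; neither name comes from the referenced application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical result of a forwarding table lookup."""
    node_id: int  # routing node coupled to the destination device
    port: int     # external port on that routing node


def forward_inbound(local_node_id: int, route: Route) -> str:
    """Choose the path for a packet received from the external network."""
    if route.node_id == local_node_id:
        # Source and destination share this routing node, so the packet
        # turns around locally and never enters the switch fabric.
        return f"hairpin:port{route.port}"
    # Otherwise the packet crosses the switch fabric to another routing node.
    return f"fabric:node{route.node_id}"


local_result = forward_inbound(3, Route(node_id=3, port=1))
remote_result = forward_inbound(3, Route(node_id=5, port=0))
```

Here `local_result` takes the hairpin path because the destination is on routing node 3 itself, while `remote_result` is sent to the switch fabric.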
However, U.S. patent application Ser. No. 10/665,832 did not disclose a mechanism for performing a “reverse” hairpinning operation whenever a data packet is improperly received from the switch fabric by the outbound network processor. The routers disclosed by the Applicants in previous patent applications sometimes implement aggressive route summarization. Such route summarization was disclosed in U.S. patent application Ser. No. 10/832,010, filed on Apr. 26, 2004, entitled “Apparatus and Method for Route Summarization and Distribution in a Massively Parallel Router.” This route summarization sometimes leads to misrouting, so that the switch fabric may incorrectly deliver a data packet to the outbound network processor of the wrong routing node.
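How summarization can cause misrouting can be illustrated with a generic longest-prefix-match lookup. The sketch below is an assumption-laden illustration, not the summarization scheme of the referenced application: the prefixes, the node names, and the `lookup` helper are all hypothetical.

```python
import ipaddress

# Hypothetical exact routes: three /24 prefixes on node A, one on node B.
exact_routes = {
    ipaddress.ip_network("10.1.0.0/24"): "node A",
    ipaddress.ip_network("10.1.1.0/24"): "node A",
    ipaddress.ip_network("10.1.2.0/24"): "node A",
    ipaddress.ip_network("10.1.3.0/24"): "node B",
}

# Hypothetical aggressive summary: the whole /22 is advertised as node A,
# even though one of its four /24 prefixes actually belongs to node B.
summarized_routes = {
    ipaddress.ip_network("10.1.0.0/22"): "node A",
}


def lookup(table, address):
    """Longest-prefix match; returns None if no entry covers the address."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in table if addr in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]


destination = "10.1.3.7"
correct_node = lookup(exact_routes, destination)
summarized_node = lookup(summarized_routes, destination)
misrouted = correct_node != summarized_node
```

With the exact table the destination resolves to node B, but the summarized table sends it to node A, so the packet arrives at the outbound network processor of the wrong routing node.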
However, because the outbound network processor has no transmit interface back to the switch fabric modules and crossbar switch, the hairpinning mechanism of U.S. patent application Ser. No. 10/665,832 cannot be used to return a misrouted data packet to the switch fabric. Moreover, using the control plane processors to transfer the misrouted data packet back to the switch fabric is not feasible because of the high data rate (i.e., 10 Gbps) involved.
Therefore, there is a need in the art for an improved high-speed router that provides direct communications between the microengines of the inbound and outbound network processors of a routing node. There is also a need for a router that enables a misrouted data packet received by the outbound network processor to be transferred directly from the outbound network processor to the inbound network processor for subsequent forwarding back to the switch fabric.