Packet-switched networks, such as the Internet, divide a message or a data stream transmitted by a source into discrete packets prior to transmission. Upon receipt by the recipient, the packets are reassembled to form the original message or data stream. As a packet-switched network, the Internet comprises various physical connections between computing devices, servers, routers, sub-networks, and other devices which are distributed throughout the network.
Routers connect networks, and each router has multiple inputs and multiple outputs coupled to independent network devices such as servers or other routers, the connections being made through communications links such as optical fibers or copper wires or the like.
Routers receive the packets being sent over the network and determine the next hop or segment of the network to which each packet should be sent through one of the ports of the router. When the router passes the packet to the next destination in the network, the packet is one step closer to its final destination. Each packet includes header information indicating the final destination address of the packet.
Conventionally, routers include memories and microprocessors therein for processing the packets received by the routers, as well as for performing other functions required of the router. Typically, routers contain one or more route processors, one or more forwarding engines, and a switch fabric. The route processor is a dedicated embedded subsystem which is responsible for communicating with the neighboring routers in the network to obtain current and ever-changing information about the network conditions. The route processor forms a routing table which is downloaded into the forwarding engine(s) and subsequently accessed by the forwarding engine(s) for forwarding packets.
The forwarding engine of the router is responsible for determining the destination address and output port within the router to which to direct the received packet, this determination conventionally being made by accessing a routing table containing routing information for the entire network and performing a look-up operation.
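By way of illustration only, the look-up operation described above can be sketched in software as a longest-prefix match over a small table. The table entries, port numbers, and the function name `lookup_output_port` are hypothetical and are not taken from any particular router; the text above does not specify the matching rule, and longest-prefix matching is assumed here because it is the usual rule for IP forwarding. A linear scan is used for clarity, not for speed.

```python
import ipaddress

# Hypothetical forwarding table of (prefix, output port) pairs.
# The prefixes and port numbers are illustrative assumptions only.
FORWARDING_TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), 0),       # default route
    (ipaddress.ip_network("10.0.0.0/8"), 1),
    (ipaddress.ip_network("10.1.0.0/16"), 2),
    (ipaddress.ip_network("192.168.5.0/24"), 3),
]


def lookup_output_port(destination, table=FORWARDING_TABLE):
    """Return the output port of the longest matching prefix, or None."""
    address = ipaddress.ip_address(destination)
    best_port = None
    best_length = -1
    for network, port in table:
        # Keep the match with the most specific (longest) prefix.
        if address in network and network.prefixlen > best_length:
            best_port = port
            best_length = network.prefixlen
    return best_port
```

In this sketch, a packet addressed to 10.1.2.3 matches three entries (the default route, 10.0.0.0/8, and 10.1.0.0/16) and is directed to port 2, the port of the most specific match.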
One example of a conventional forwarding engine for a router is shown in FIG. 1, wherein a plurality of general-purpose CPUs 20 are provided in the architecture for the forwarding engine 22. Each CPU is a separate integrated circuit and receives packet data, and each CPU processes individual packets by performing a forwarding or lookup operation using an external SRAM 24 having a forwarding lookup table stored therein. As packets are received from the network, they are stored in a very large input buffer 26 on the front end of the forwarding engine for temporary storage until a CPU can remove a packet from the buffer and perform the forwarding/lookup operation. Such a system is commonly referred to as being “input striped,” wherein the packets are written into the input buffer sequentially as they are received, but may be processed in a non-sequential order as the CPUs become available for processing.
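The non-sequential processing order of such an input-striped arrangement can be illustrated with a minimal simulation. This is a sketch under simplifying assumptions that are not part of FIG. 1: all packets are taken to be already present in the input buffer at time zero, each packet has a known look-up duration in arbitrary time units, and the function name `simulate_input_striped` is hypothetical.

```python
import heapq
from collections import deque


def simulate_input_striped(processing_times, cpu_count):
    """Return packet indices in the order their look-ups complete."""
    # Packets sit in the input buffer in arrival order.
    input_buffer = deque(enumerate(processing_times))
    # Min-heap of (time the CPU becomes free, CPU id); all CPUs start idle.
    free_cpus = [(0, cpu) for cpu in range(cpu_count)]
    heapq.heapify(free_cpus)
    completions = []
    while input_buffer:
        packet, duration = input_buffer.popleft()
        # The CPU that becomes available first takes the next packet.
        free_at, cpu = heapq.heappop(free_cpus)
        finish = free_at + duration
        completions.append((finish, packet))
        heapq.heappush(free_cpus, (finish, cpu))
    # Sort by completion time to obtain the order in which packets finish.
    return [packet for _, packet in sorted(completions)]
```

With two CPUs and look-up durations of 5, 1, 1, and 1 time units, `simulate_input_striped([5, 1, 1, 1], 2)` returns `[1, 2, 3, 0]`: the first packet to arrive is the last to complete, because the second CPU finishes the three short packets while the first CPU is still occupied.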
Conventionally, determining the destination port within the router to which to send the received packet is a computationally intensive process, particularly in view of the high data rates of the network (known as the “line rate”), such as 10 gigabits/second. At this line rate, a forwarding engine within a router must make the destination port determination for approximately 30 million minimum-sized IP packets per second per port. Accordingly, as the router receives multiple packets, a conventional forwarding engine utilizes the large buffer memory 26 on its front end, as shown in FIG. 1, to temporarily store a number of packets until the forwarding engine has determined the path of the packet presently being processed.
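The figure of approximately 30 million packets per second can be checked with a short calculation. The sketch below assumes that a minimum-sized IP packet is 40 bytes (a 20-byte IPv4 header plus a 20-byte TCP header with no payload) and ignores link-layer framing overhead; both are assumptions made for illustration and are not stated in the text above.

```python
def packets_per_second(line_rate_bps, packet_bytes):
    """Packets per second that a link carries at the given packet size."""
    return line_rate_bps / (packet_bytes * 8)


LINE_RATE_BPS = 10e9        # 10 gigabits/second line rate from the text
MIN_IP_PACKET_BYTES = 40    # assumption: 20-byte IPv4 header + 20-byte TCP header

rate = packets_per_second(LINE_RATE_BPS, MIN_IP_PACKET_BYTES)
time_budget_ns = 1e9 / rate  # time available per look-up, in nanoseconds

print(f"{rate / 1e6:.2f} million packets/s, {time_budget_ns:.0f} ns per packet")
```

Under these assumptions the rate is 31.25 million packets per second per port, which leaves about 32 nanoseconds for each destination port determination.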
As such, conventional forwarding engines for routers can be susceptible to performance degradation if the network traffic directed at the router is high, particularly when the router receives a plurality of packets having short lengths, thereby requiring that the look-up operations be performed quickly. Further, the increasing demand for IP-centric services over the Internet, such as voice over IP, streaming video, and data transfers to wireless devices with unique IP addresses, has increased the demand for data handling by the forwarding engines, as well as the size of the forwarding table.
Also, in such a conventional arrangement as shown in FIG. 1, the CPUs 20 each contend for access to the external forwarding table SRAM 24 to perform the lookup operation, which can be problematic in that contention for the external SRAM can create a bottleneck which limits the system's performance. Conventional routers have a forwarding engine/CPU with an off-chip forwarding table, which is typically implemented using DRAM and may be 30 megabytes in size, a substantial amount of memory. Conventionally, it may take many cycles, such as 20 cycles, to look up an address for a packet.
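The effect of such contention can be illustrated with a rough estimate. The sketch below assumes that the shared table memory serves only one look-up at a time, that the memory runs at a hypothetical 200 MHz clock, and that a minimum-sized packet is 40 bytes; none of these values is given in the text above, and only the 20-cycle look-up time is taken from it.

```python
def serialized_lookup_rate(clock_hz, cycles_per_lookup):
    """Look-ups per second when the shared memory serves one look-up at a time."""
    return clock_hz / cycles_per_lookup


ASSUMED_CLOCK_HZ = 200e6          # hypothetical memory clock, not from the text
CYCLES_PER_LOOKUP = 20            # example cycle count from the text
REQUIRED_RATE = 10e9 / (40 * 8)   # packets/s at 10 Gb/s, assuming 40-byte packets

achievable = serialized_lookup_rate(ASSUMED_CLOCK_HZ, CYCLES_PER_LOOKUP)
shortfall_factor = REQUIRED_RATE / achievable

print(f"achievable: {achievable / 1e6:.1f} million look-ups/s")
print(f"required:   {REQUIRED_RATE / 1e6:.2f} million look-ups/s")
print(f"shortfall:  {shortfall_factor:.3f}x")
```

Under these assumptions the memory sustains 10 million look-ups per second against a requirement of 31.25 million, a shortfall of roughly three times. Adding further CPUs does not close the gap so long as every look-up is serialized through the same memory, which is why the contention acts as a bottleneck.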
As recognized by the present inventors, what is needed is a cross-bar apparatus or circuit for permitting access by various stages of a forwarding engine to the forwarding table memory so that lookup operations can occur efficiently. It is against this background that various embodiments of the present invention were developed.