The explosive growth of Internet traffic has been caused by the increased number of Internet users, various service demands from those users, the implementation of new services, such as voice-over-IP (VoIP) or streaming applications, and the development of mobile Internet. Conventional routers, which act as relaying nodes connected to subnetworks or other routers, have accomplished their roles well in situations in which the time required to process packets, determine their destinations, and forward the packets to the destinations is usually shorter than the transmission time on network paths. More recently, however, the packet transmission capabilities of high-bandwidth network paths and the increases in Internet traffic have combined to outpace the processing capacities of conventional routers. Thus, routers are increasingly blamed for major bottlenecks in the Internet.
Early routers were implemented on a computer host so that the CPU of the host performed all managerial tasks, such as packet forwarding via a shared bus and routing table computation. This plain architecture proved to be inefficient, due to the concentrated overhead of the CPU and the existence of congestion on the bus. As a result, router vendors developed distributed router architectures that provide efficient packet processing compared to a centralized architecture. In a distributed router architecture, many of the functions previously performed by the centralized CPU are distributed to the line cards and the shared bus is replaced by a high-speed crossbar switch.
FIG. 1 illustrates distributed router 100 according to an exemplary embodiment of the prior art. Distributed router 100 interfaces with different types of networks, including optical networks (OC-192), asynchronous transfer mode (ATM) networks, and Gigabit Ethernet, among others. Distributed router 100 comprises line card modules (LCMs) 111-113, switch fabric 130, routing processor 140, and line card modules (LCMs) 151-153. LCM 111, LCM 112, and LCM 113 contain forwarding table (FT) 121, forwarding table (FT) 122, and forwarding table (FT) 123, respectively. Similarly, LCM 151, LCM 152, and LCM 153 contain forwarding table (FT) 161, forwarding table (FT) 162, and forwarding table (FT) 163, respectively.
Packets coming from adjacent router(s) or subnetworks are received by line card modules 111-113 and line card modules 151-153 and sent to switch fabric 130. Switch fabric 130 switches packets coming from or going to line card modules 111-113 and 151-153 and plays an essential role in relaying packets.
Routing processor 140 builds routing table 141 and maintains the current status of routing table 141 by updating changed routes immediately. Routing processor 140 maintains routing table 141 by running a routing protocol, such as Routing Information Protocol (RIP), Open Shortest Path First (OSPF), or Border Gateway Protocol (BGP). Forwarding tables 121-123 and forwarding tables 161-163 support an efficient lookup in each line card and are downloaded from routing table 141 of routing processor 140. If an incoming packet from a line card module cannot find its destination path from the forwarding table, the corresponding packet may be passed through switch fabric 130 toward a pre-defined default route, or may be silently discarded at the line card.
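The forwarding-table lookup described above can be sketched as follows. This is a minimal illustration only, assuming a longest-prefix-match lookup over a simple sorted list; the function names `build_forwarding_table` and `lookup`, the example prefixes, and the use of LCM labels as output ports are hypothetical and are not taken from distributed router 100.

```python
import ipaddress


def build_forwarding_table(routes):
    """Convert (prefix, output port) pairs into a list sorted longest prefix first.

    `routes` stands in for the entries downloaded from the routing table
    (hypothetical format chosen for this sketch).
    """
    table = [(ipaddress.ip_network(prefix), port) for prefix, port in routes]
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table


def lookup(table, destination, default_port=None):
    """Return the output port for `destination`.

    Falls back to `default_port` when no entry matches; a result of None
    models the case in which the packet is discarded at the line card.
    """
    address = ipaddress.ip_address(destination)
    for network, port in table:
        if address in network:
            # The table is sorted, so the first hit is the longest match.
            return port
    return default_port


# Hypothetical forwarding table held by one line card module.
ft = build_forwarding_table([
    ("10.0.0.0/8", "LCM 151"),
    ("10.1.0.0/16", "LCM 152"),
])
```

In this sketch, a destination covered by both prefixes resolves to the more specific entry, and a destination covered by neither is either sent to the caller-supplied default port or dropped. A linear scan is used only for clarity; it is an assumption of the sketch, not a statement about how a line card performs the lookup.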
The main reason router manufacturers favor this distributed architecture is the simplicity of using a single centralized routing processor to manage one routing table in a consistent way. On the other hand, although the separation of routing and forwarding functions enables high-speed packet processing, the introduction of QoS-capable routing service and the route delays caused by network instability demand even greater packet processing capacity, thereby resulting in additional overhead for the routing processor or instability in the router itself.
A large number of small routers can operate in concert (i.e., in parallel) if an efficient set of interoperability rules is established. The industry has avoided this coordination problem by using a single routing server to handle the routing problems. This approach, however, bounds both the scale of the router and its maximum performance to the scale of available microprocessor processing capacity. Another approach to the problem uses a massively parallel router with a distributed architecture that implements an efficient packet routing protocol without bounding the scale of the router and its maximum performance to the scale of available microprocessor processing capacity.
A massively parallel router comprises a plurality of input-output processor units. An input-output processor unit is an example of a device that comprises a plurality of individual processors coupled together in a processor array. In order to operate an input-output processor unit efficiently, it is necessary to synchronize the operation of each of the individual processors within the processor array.
Prior art methods of synchronizing operations between individual processors in a processor array typically have involved the steps of exchanging state information and utilizing specialized multiprocessor locking mechanisms. A major disadvantage of the prior art techniques is the complexity that they introduce into the operating software.
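The lock-based style of synchronization referred to above can be illustrated with a short sketch. It assumes that threads stand in for the individual processors of a processor array and that a single mutual-exclusion lock guards the shared state; the names `SharedState` and `run_workers` are hypothetical and do not describe any particular prior art system.

```python
import threading


class SharedState:
    """State shared by all workers, protected by one lock (assumed design)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.packets_handled = 0

    def record(self, count):
        # Every update must acquire the lock, so each worker's code path
        # has to be written with the locking discipline in mind.
        with self._lock:
            self.packets_handled += count


def run_workers(num_workers, packets_per_worker):
    """Run the workers to completion and return the shared packet total."""
    state = SharedState()

    def worker():
        for _ in range(packets_per_worker):
            state.record(1)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state.packets_handled
```

Even in this small example, every access to the shared counter has to pass through the lock, which suggests how such mechanisms can complicate software as the amount of shared state grows.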
Therefore, there is a need in the art for an improved system and method for synchronizing the operation of a plurality of processors within a processor array. In particular, there is a need in the art for an improved system and method for synchronizing the operation of a plurality of processors within an input-output processor unit of a parallel router system.