This invention relates to a data packet processing system and method for a router, and more particularly, to a data packet processing system and method for a router which uses a packet switching software for routing data packets between data networks.
Generally, a data packet system consists of a number of interconnected data networks that provide for communication among computers using standard protocols. For example, all internet transport protocols use the Internet Protocol (IP) to transport data from a source computer to a destination computer. As each computer may be located on a different network, the internet model defines an IP router that interconnects the constituent networks by providing IP datagram forwarding facilities between networks. Forwarding of the IP datagram requires the router to look up the destination address in a forwarding table to identify the interface to which the IP datagram is to be forwarded.
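The forwarding table lookup described above can be illustrated with a minimal sketch. Longest-prefix matching is assumed here as the lookup rule, since the text does not specify one, and the table entries and interface names (`if0`, `if1`, `if2`) are hypothetical.

```python
import ipaddress

# Hypothetical forwarding table: (destination prefix, outgoing interface).
FORWARDING_TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "if0"),    # default route
    (ipaddress.ip_network("10.0.0.0/8"), "if1"),
    (ipaddress.ip_network("10.1.0.0/16"), "if2"),
]


def lookup_interface(destination):
    """Return the interface of the longest prefix containing destination.

    Longest-prefix matching is an assumption of this sketch.
    Returns None when no table entry matches.
    """
    address = ipaddress.ip_address(destination)
    best = None
    for network, interface in FORWARDING_TABLE:
        if address in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, interface)
    return best[1] if best else None
```

A linear scan is used only for brevity; a router operating at wire rate would need a faster lookup structure.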
The forwarding algorithms used by the router have usually been implemented with packet switching software that typically executes on a general purpose Central Processing Unit (CPU). However, as the rapid growth in IP datagram traffic has demanded higher throughput, special purpose hardware has become increasingly common, with a resulting decrease in the flexibility to implement algorithmic changes.
Therefore, it is desirable to provide a system which restores algorithmic flexibility.
FIG. 1 shows an example of a router having forwarding engines at network interfaces. The router 10 consists of a plurality of network interfaces 12 that are interconnected through a switching fabric 14 that is controlled through a network processor 16. Each interface 12 may be provisioned with a forwarding engine 18 that processes incoming packets in order to determine which outgoing network interface 12 needs to be accessed through the interconnect structure of the switching fabric 14. By locating the forwarding algorithms at the interface 12, the forwarding engine 18 is required to operate at the wire rate defined by the associated transmission interface. To allow for processing suitable for transmission facilities having wire speeds of OC-12, OC-48, and OC-192, a local version of the forwarding algorithms and data tables normally found in the network processor is downloaded from the network processor 16 to the forwarding engine 18 of the network interface 12. The forwarding engine 18 is optimized for forwarding speed, whereas the routing table data management operates at the lower rate at which changes occur to the network configuration. Since the network processor 16 is no longer required to make routing decisions for a data packet, the packet throughput is capable of being scaled with the number of physical interfaces 12. The single point of congestion as formerly experienced with the centralization of the routing algorithms is replaced by multiple processors 12.
Since the forwarding engine 18 interfaces the external trunks to the interconnection structure of the switching fabric 14, two sets of algorithms each serving the respective interface category are required.
FIG. 2 shows an example of the forwarding engine 18. With the forwarding engine 18 residing between the network interface, e.g. a Synchronous Optical Network (SONET) network interface 20, and the switching fabric interface 22, the computing resource that provides the forwarding function for IP requires the flexibility to deal with algorithmic tasks arising from either interface. These tasks require equable sharing of resources amongst input/output (I/O) units 26 in order to process a packet subjected to integrity checks, address lookup, route selection, option processing, flow classification, scheduling, congestion control, and performance monitoring. A simplified representation is illustrated in FIG. 2 where the forwarding program consists of a main program which provides a sequence of instructional calls and decision branches that describe the high level flow of tasks that are required to be executed; and a suite of “procedural calls” that are used by the main program to execute specific algorithms including, e.g. translation, classification and scheduling procedures. The forwarding program runs exclusively on a single central processor unit (CPU) 24 with the packet throughput being limited to the computational throughput of a single processor.
To increase the processing throughput, additional processors are added to a single processor forwarding engine, so as to create a multiprocessor array that uses either asymmetrical or symmetrical program partitioning.
FIG. 3 shows an example of an asymmetrical multiprocessing system 30 using three main processors 32 and two I/O units 34. The asymmetrical multiprocessing system 30 assigns each main processor 32 a particular task, for example, translation processing, classification or scheduling processing. The duration of each task varies statistically as the respective database is searched; consequently, if the load profile for any processor is not well understood, the main processors 32 are typically not uniformly loaded. The technique has been successfully used in workstations whereby one main processor performs I/O processing, another main processor performs display processing, while a third main processor is used for application level programs. The programming complexity is relatively straightforward as each processor has a cohesive set of tasks with simple communications protocols to transfer information to and from the central processor. Consequently, standard software development methodology is quite suitable. However, it still suffers from load locality problems among the main processors since it is difficult to assign loads equably among the main processors.
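The uneven loading described above can be made concrete with a simple steady-state model. The sketch assumes, as a simplification not stated in the text, that the task-specific processors operate as pipelined stages, so the slowest task sets the packet rate and every other processor idles for part of each cycle. The task durations in the usage note are hypothetical.

```python
def pipeline_utilization(task_durations):
    """Return the busy fraction of each task-specific processor.

    task_durations maps a task name to its mean processing time per
    packet. Under the pipelined-stage assumption of this sketch, the
    longest task is the bottleneck and runs at full utilization.
    """
    bottleneck = max(task_durations.values())
    return {task: duration / bottleneck
            for task, duration in task_durations.items()}
```

With hypothetical mean durations of 2.0, 4.0 and 1.0 time units for translation, classification and scheduling, the classification processor is fully busy while the translation and scheduling processors are busy only half and a quarter of the time, respectively.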
The alternative approach is to use multiple processors without attempting to create a cohesive subdivision of responsibility. FIG. 4 shows an example of a symmetrical multiprocessing system 40 using three main processors 42. In this system, each processor 42 is loaded with the full suite of algorithms, e.g. translation, classification and scheduling processing. Thus, each processor 42 is equally capable of executing any or all functions that are required. The challenge is to determine how the group of processors can equably allocate the processing tasks amongst themselves. Usually the volume of communication required between processors in subdividing and passing results creates communication congestion that prevents the processing capacity from scaling. Typically only 5% to 10% of the processors within an array are effectively increasing the computational capacity beyond a single processor. The level of computational and communication complexity experienced directly translates into software development complexity that requires both the software developer and the compilers to understand parallelism within the application program. Consequently most parallel processing computers have associated high yearly maintenance costs due to the need to have highly qualified programmers available to adapt an application program to the architecture of the computing system. Symmetrical multiprocessing is limited by system level problems that have not successfully resolved the issues of interprocessor communication bandwidth, the lack of data locality, and lack of design methodology.
The issue of data locality has partially been overcome by using increased bandwidth and distributed shared memory in order to eliminate congestion and to provide global access to data. Both of these factors have been addressed by the IEEE standards: P1596-1992, Scalable Coherent Interface; and P1394.2, Serial Express.
The Serial Express standard has evolved from the Scalable Coherent Interface requirements to become an effective interconnect technology that allows the computing network to scale upwards both physically and logically while extending the CPU bus domain to the I/O system. High bandwidth and low latency performance provides for scalability when interconnecting processors whether organized as a ring based grouping or as a switch based clustering of computers. The Serial Express specification IEEE P1394.2 defines a comprehensive set of signals and protocols for high bandwidth, low latency shared memory access that is specifically designed for multiprocessor and I/O attachments. The technology is capable of supporting an infrastructure that provides for multi-gigabit per second connections capable of achieving a higher bandwidth capacity than would normally be associated with a ring topology. The bandwidth expansion is achieved through the ability to transfer different messages on those portions of the ring where collisions would not occur thereby resulting in spatial reuse. With spatial reuse the Serial Express standard has increased the available system bandwidth by a factor of four when compared to the Scalable Coherent Interface use of unidirectional ring topology. FIG. 5 shows an example of the Serial Express in a Ringlet structure 50 having four ring segments 52 using Serial Express interconnect protocols. FIG. 6 shows an example of Serial Express network topologies 60 having a Ringlet 62 and a Cluster 64 which is bridged to the Ringlet 62.
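The idea of spatial reuse on a ring can be sketched with a toy model. This is not the Serial Express protocol itself: the sketch assumes a unidirectional ring in which segment i links node i to node (i + 1) modulo the ring size, and it treats two transfers as compatible whenever they occupy disjoint sets of segments. The function names are hypothetical.

```python
def segments_used(src, dst, ring_size):
    """Return the set of segments traversed from src to dst.

    Assumes a unidirectional ring where segment i connects node i
    to node (i + 1) % ring_size.
    """
    segments = set()
    node = src
    while node != dst:
        segments.add(node)
        node = (node + 1) % ring_size
    return segments


def can_run_concurrently(transfers, ring_size):
    """Return True if no two (src, dst) transfers share a ring segment."""
    used = set()
    for src, dst in transfers:
        needed = segments_used(src, dst, ring_size)
        if used & needed:
            return False  # two transfers would collide on a segment
        used |= needed
    return True
```

In this model, four neighbor-to-neighbor transfers on a four-segment ring occupy one segment each and can proceed at the same time, whereas two transfers that each span two overlapping segments cannot.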
However, neither standard provides a solution to achieving high processor utilization.
It is therefore desirable to provide a system which better overcomes the issues of data locality and low processor utilization.
An object of the present invention is to provide a data packet processing system and method for a router which achieves high processor utilization and has algorithmic flexibility and fewer data locality problems.
To this end, the present invention uses a data packet processing system for a router having a multiprocessor architecture comprising a master node and a processor array of multiple slave nodes. A packet switching software of the router is partitioned into a main forwarding program which is loaded in the master node, and a set of procedures which is loaded into the slave nodes. The system assigns to each data packet a program counter which defines a sequence of procedural calls in the main forwarding program. By stepping through the program counter in the master node, each procedural call is forwarded to and executed by one of the slave nodes.
In accordance with an aspect of the present invention, there is provided a data packet processing system for a router which uses a packet switching software for routing data packets between data networks. The data packet processing system comprises a master node and a processor array of multiple slave nodes. The master node has a memory for storing a main forwarding program of the packet switching software, and an input/output unit for receiving and transmitting data packets. A processing unit is provided in the master node for assigning a program counter to each data packet when the data packet is received, so that the program counter defines a sequence of procedural calls in the main forwarding program. A transmitter and a receiver are also provided in the master node for transmitting each procedural call to one of the slave nodes in accordance with the program counter, and for receiving responses from the slave nodes. Each slave node has a memory for storing a set of procedures of the packet switching software, and a receiver for receiving procedural calls destined to the slave node. Each slave node is also provided with a processing unit for executing the received procedural calls using the set of procedures loaded in the slave node memory to generate a response to each received procedural call. A transmitter is provided in each slave node for returning the responses to the master node.
In accordance with another aspect of the present invention, there is provided a method for data packet processing for a router which uses a packet switching software for routing data packets between data networks. The method starts by providing a multiprocessor system comprising a master node and a processor array of multiple slave nodes in the router. The packet switching software is partitioned into a main forwarding program and a set of procedures, and the main forwarding program is stored in the master node and the set of procedures is loaded in each slave node. The master node receives data packets, and assigns a program counter to each data packet when the data packet is received so that the program counter defines a sequence of procedural calls in the main forwarding program. Each procedural call is forwarded to one of the slave nodes in accordance with the program counter. The one of the slave nodes receives and executes the procedural call using the set of procedures loaded therein to generate a response, which is returned to the master node. The above steps for processing the program counter are repeated until the program counter reaches its end for each data packet.
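The method steps above can be sketched as a sequential simulation. Everything in the sketch is an illustrative assumption rather than the claimed implementation: the procedure bodies, the packet fields, the class names, and in particular the round-robin choice of slave node, since the text says only that each call is forwarded to one of the slave nodes.

```python
from itertools import cycle


# Hypothetical procedures; the same set is loaded into every slave node.
def translate(packet):
    packet["next_hop"] = "if" + str(packet["dst"] % 2)
    return packet


def classify(packet):
    packet["flow_class"] = "bulk" if packet["length"] > 1000 else "interactive"
    return packet


def schedule(packet):
    packet["queue"] = 0 if packet["flow_class"] == "interactive" else 1
    return packet


PROCEDURES = {"translate": translate, "classify": classify, "schedule": schedule}


class SlaveNode:
    """Executes a received procedural call and returns the response."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.calls_handled = 0

    def execute(self, procedure_name, packet):
        self.calls_handled += 1
        return PROCEDURES[procedure_name](packet)


class MasterNode:
    """Holds the main forwarding program and steps a per-packet counter."""

    # Hypothetical main forwarding program: an ordered list of calls.
    MAIN_PROGRAM = ["translate", "classify", "schedule"]

    def __init__(self, slaves):
        # Round-robin slave selection is an assumption of this sketch.
        self._next_slave = cycle(slaves)

    def process(self, packet):
        program_counter = 0  # assigned when the packet is received
        while program_counter < len(self.MAIN_PROGRAM):
            slave = next(self._next_slave)
            packet = slave.execute(self.MAIN_PROGRAM[program_counter], packet)
            program_counter += 1  # advance after the response returns
        return packet
```

For example, `MasterNode([SlaveNode(0), SlaveNode(1)]).process({"dst": 3, "length": 1500})` steps the packet through all three calls, alternating between the two slave nodes. The sketch processes one packet at a time, so it shows only the control flow and not the concurrency among slave nodes.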
The invention will be further understood from the following description with reference to the drawings in which:
FIG. 1 is a block diagram showing a router with forwarding engines at network interfaces;
FIG. 2 is a block diagram showing a forwarding engine having a single processor;
FIG. 3 is a block diagram showing an asymmetrical multiprocessing system;
FIG. 4 is a block diagram showing a symmetrical multiprocessing system;
FIG. 5 is a block diagram showing Serial Express in a Ringlet;
FIG. 6 is a block diagram showing Serial Express Network Topologies;
FIG. 7 is a block diagram showing an embodiment of a data packet processing system in accordance with the present invention;
FIG. 8 is a block diagram showing a transmission interface which may be used in a computing node shown in FIG. 7;
FIG. 9 is a block diagram showing a computing node which may be used in a computing node shown in FIG. 7;
FIG. 10 is a block diagram showing another embodiment of a data packet processing system in accordance with the present invention;
FIG. 11 is a block diagram showing operation of the data packet processing system shown in FIG. 10;
FIG. 12 is a block diagram showing operation of the data packet processing system shown in FIG. 10;
FIG. 13 is a block diagram showing operation of the data packet processing system shown in FIG. 10;
FIG. 14 is a block diagram showing operation of the data packet processing system shown in FIG. 10;
FIG. 15 is a block diagram showing operation of the data packet processing system shown in FIG. 10;
FIG. 16 is a block diagram showing operation of the data packet processing system shown in FIG. 10;
FIG. 17 is a block diagram showing operation of the data packet processing system shown in FIG. 10;
FIG. 18 is a block diagram showing operation of the data packet processing system shown in FIG. 10;
FIG. 19 is a block diagram showing operation of the data packet processing system shown in FIG. 10;
FIG. 20 is a block diagram showing operation of the data packet processing system shown in FIG. 10;
FIG. 21 is a flowchart showing the operation of the data packet processing system shown in FIG. 10;
FIG. 22 is a block diagram showing another embodiment of a data packet processing system in accordance with the present invention using a cluster topology; and
FIG. 23 is a block diagram showing another embodiment of a data packet processing system in accordance with the present invention suitably used for a wire speed at OC-48.