The present invention relates to systems and methods for load-balancing when distributing a sequence of incoming data packets emanating from a high speed communication line to a plurality of processing means (including electronic processors as well as electronic data processing units comprising a plurality of processors) e.g. for sending data packets from one processor means for transmission to another processor means, such as in a network that interconnects a multiplicity of processors.
Load balancing and hashing are two classic topics in computer science. Prior-art channel striping (also known as load sharing or inverse multiplexing) is frequently used in networking because of transmission or processing bottlenecks, or simply because of its price/performance ratio. In that scheme, a round-robin algorithm or a load-sharing algorithm stripes the packets belonging to a stream across multiple channels.
A major problem with striping is that packets may be received out of order due to different delays on different channels and due to different packet sizes. Three types of solutions for reordering are known:
i) keeping each flow on only one physical link and accepting that a single flow cannot use more than this bandwidth link,
ii) reordering the received packets and accepting a waste of processing bandwidth, and
iii) splitting packets up into fixed transfer units.
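The prior-art round-robin striping scheme discussed above can be sketched in a few lines; the function name and packet representation below are illustrative assumptions, not taken from any prior-art implementation. The sketch also makes the ordering problem visible: consecutive packets of one stream land on different channels.

```python
from itertools import cycle

def round_robin_stripe(packets, num_channels):
    """Assign each packet of a stream to the next channel in turn
    (prior-art striping). Packets of a single stream end up on
    different channels and may be received out of order."""
    chan = cycle(range(num_channels))
    return [(next(chan), pkt) for pkt in packets]
```

For four packets over two channels the assignment alternates between channel 0 and channel 1, so every second packet of the stream travels a different path.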
Dynamic load balancing, on the other hand, is one of the classic topics in the field of parallel processing, dealing with three general computing entities: computations, tasks and data. Prior-art approaches relate to specific algorithms for parallel/distributed processor systems. In these cases, load balancing tries to find the mapping of computations, tasks or data to computers that results in each computer having an approximately equal amount of work, in order to reduce run time and increase the overall efficiency of a computation.
On the other hand, constant and massive increases in bandwidth demand have pushed very high speed networks into the multigigabit-data-rate domain of operation. Performing real-time Layer 2 network processing at that speed already presents a challenge for prior-art technology: an ATM cell, for example, lasts approximately 420 ns at 1 Gbit/s. The term "Layer" as used herein refers to the ISO reference model.
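The cell-duration figure follows from simple arithmetic: a 53-byte ATM cell carries 53 × 8 = 424 bits, which occupy a 1 Gbit/s line for 424 ns, i.e. roughly the 420 ns cited above. A one-line check (illustrative only; the default values are the standard ATM cell size and the line rate from the text):

```python
def cell_duration_ns(cell_bytes=53, link_rate_bps=1_000_000_000):
    """Time one cell occupies the line: bits divided by line rate, in ns."""
    return cell_bytes * 8 / link_rate_bps * 1e9
```

This is the per-packet time budget available to a Layer 2 processing engine at that rate.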
However, the transformation of the Internet into an important and ubiquitous commercial infrastructure has not only created rising bandwidth demands but also tends to significantly change consumer expectations in terms of performance, security and services.
These requirements translate into a large amount of additional network processing, as Layer 3 and Layer 4 information must be processed as well. This double trend of an increasing demand for bandwidth and an increasing demand for differentiated services (as defined by IETF standards) has put severe strains on traditional network processor architectures of currently available routers, and architecture designers are moving more and more towards pipelined and parallelized solutions in order to satisfy the processing demands imposed by very high speed networks.
Along with differentiated-services support, a concept of traffic "flows" has emerged within the IP community over the past few years as well. A flow is usually defined as a sequence of packets sent from a particular source to a particular (unicast or multicast) destination for which the source desires special handling by the intervening routers, cf. Deering, S., Hinden, R., "Internet Protocol Version 6 (IPv6) Specification", RFC 1883. Further, providing special handling with a concept of flows, such as non-default quality of service or real-time service, is accompanied by a set of new difficulties and constraints.
Accordingly, it is a primary object of the present invention to provide for a system and method capable of scaling the capacity or speed rate of a network processor by distributing the traffic flows of an aggregate high speed link onto a multiplicity of independent network processing entities.
Further objects and advantages of the invention will become apparent as this specification proceeds.
The above objects and further advantages will be achieved according to a first embodiment of the invention by a load balancing system, also termed "load balancer" herein for brevity, as defined in claim 1.
According to a first general embodiment, the invention provides a real-time load-balancing system for distributing a sequence of incoming data packets emanating from a high speed communication line to a plurality of processing means, each operating at a capacity that is lower than the capacity of the high speed communication line, said system comprising: a parser means capable of extracting a configurable set of classifier bits from the incoming packets for feeding into a compression means;
said compression means being capable of reducing a bit pattern of length K to a bit pattern having a length L which is a fraction of K; a pipeline block for delaying incoming packets until a load balancing decision is found, and an inverse demultiplexer for receiving a port identifier output from said compression means as selector and for directing pipelined packets to the appropriate output port.
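The data path described above (parser means, compression means, lookup and inverse demultiplexer) can be sketched in software. The classifier field names, the choice of hash and the number of output ports below are illustrative assumptions, not taken from the specification; an actual load balancer would implement the same stages in hardware at line rate.

```python
import hashlib

NUM_PORTS = 4  # number of lower-rate processing means (assumption)

def parse_classifier(packet):
    """Parser means: extract a configurable set of classifier bits;
    here the IP 5-tuple, with illustrative field names."""
    return (packet["src"], packet["dst"], packet["proto"],
            packet["sport"], packet["dport"])

def compress(classifier, l_bits=10):
    """Compression means: reduce the K-bit classifier pattern to an
    L-bit index (L a fraction of K) via a nonlinear hash."""
    digest = hashlib.sha256(repr(classifier).encode()).digest()
    return int.from_bytes(digest[:4], "big") % (1 << l_bits)

def balance(packet, lookup_memory):
    """Map the compressed index to an output-port identifier; the
    inverse demultiplexer would use this identifier as selector."""
    idx = compress(parse_classifier(packet))
    return lookup_memory.setdefault(idx, idx % NUM_PORTS)
```

Because the port identifier is a pure function of the classifier bits, all packets of one flow are directed to the same output port, i.e. the distribution is flow-preserving.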
The term "processing means" as used herein includes processors as well as switching means or switching systems.
According to a second general embodiment, the invention provides for a method
for distributing a sequence of incoming data packets emanating from a high speed communication line to a plurality of processing means (A,B; A,B,C; A,B,C,D; A,B, . . .) operating at a capacity that is lower than the capacity of the high speed communication line, said method comprising providing: a parser means for extracting a configurable set of classifier bits from the incoming packets for feeding into a compression means; said compression means reducing a bit pattern of length K to a bit pattern having a length L which is a fraction of K;
a pipeline block for delaying incoming packets until a load balancing decision is found, and an inverse demultiplexer for receiving a port identifier output from said compression means as selector and for directing pipelined packets to the appropriate output port.
According to a preferred embodiment, the system according to the invention additionally comprises a lookup memory for outputting a port identifier based on an index computed by said compression means, wherein each entry of said lookup memory can be dynamically assigned to any output port of said system, so that a large number of flows may map to each lookup memory entry, and wherein a timeout per memory entry allows lookup memory locations that are no longer in use to be reclaimed, providing the flexibility to perform load balancing.
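A minimal software sketch of such a lookup memory with a per-entry timeout follows. The round-robin port chooser and the timeout value are illustrative assumptions, not part of the claimed system; the point is that an expired entry becomes free to be assigned to a different, possibly less loaded, output port.

```python
class LookupMemory:
    """Index-to-port table whose entries expire, so that locations no
    longer in use can be reclaimed and rebalanced (timeout per entry)."""

    def __init__(self, num_ports, timeout_s=30.0):
        self.num_ports = num_ports
        self.timeout_s = timeout_s
        self.table = {}      # index -> (port, last_seen timestamp)
        self.next_port = 0   # illustrative round-robin port chooser

    def lookup(self, index, now):
        entry = self.table.get(index)
        if entry is None or now - entry[1] > self.timeout_s:
            # Entry absent or expired: assign a (possibly new) port.
            port = self.next_port
            self.next_port = (self.next_port + 1) % self.num_ports
        else:
            port = entry[0]
        self.table[index] = (port, now)  # refresh the timeout
        return port
```

As long as a flow stays active its entry is refreshed and the port assignment remains stable; once the flow goes idle for longer than the timeout, the entry can be reassigned.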
According to a further preferred embodiment, the system according to the invention additionally comprises an extended classifier means.
Further, it is preferred for many embodiments of the invention if the system additionally comprises a feed-back loop for controlling load distribution and for allocating data packets to the data processing means. Generally, a preferred embodiment of the compression means is a nonlinear hashing function.
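The role of the feedback loop can be hinted at with a simple sketch: a newly arriving flow is directed to the currently least-loaded processing means. The load metric (e.g. queue occupancy per port) and the function name are hypothetical, chosen only to illustrate how feedback information would steer the allocation.

```python
def pick_port_feedback(port_loads):
    """Feedback-driven allocation: choose the output port whose
    processing means currently reports the lowest load."""
    return min(range(len(port_loads)), key=lambda p: port_loads[p])
```

For example, with per-port loads of 10, 3 and 7 units, a new flow would be assigned to port 1.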
The invention makes it possible that the more complex tasks, e.g. route lookup, buffer management, scheduling and policing, are performed by network processors, each running at a fraction of the aggregate link rate, while, from an external perspective, the overall system acts as a single network processor operating at the full link rate.
The invention is based on a real-time, deterministic load balancing architecture that enables the dynamic distribution of traffic on a high speed (multigigabit/s) link over multiple lower rate processing entities in a flow-preserving manner. Now, according to the art, splitting and sharing of a flow over a multiplicity of processing elements is considered to constitute a complex task which tends to be avoided if the number of flows is high. A flow or connection is characterized by a concept of state that links successive packets or cells together.
A recent study of an optical carrier (OC-3) trunk within Internet MCI's backbone indicates an average presence of up to 120,000 flows per second during peak hours.