A network device can be used to facilitate a flow of information through a network. For example, a network device may receive a packet of information (e.g., from another network device or from the packet's original source) and select one of a number of output ports through which the packet will be transmitted (e.g., to another network device or to the packet's ultimate destination). One goal of the network device may be to balance the load between these output ports. That is, the network device may attempt to avoid routing significantly more packets through a first port as compared to a second port (e.g., to prevent unnecessary congestion at the first port).
To achieve such a result, the network device could simply assign packets to output ports in a “round-robin” fashion (e.g., a first packet would be assigned to a first port and a second packet would be assigned to a second port). With this approach, however, two different packets can be assigned to two different ports even if both of the packets are associated with a single stream of information (e.g., a stream of associated packets traveling from the same source address to the same destination address). This can be undesirable because different packets in the same stream might experience different delays (e.g., a second packet could arrive at a destination address before a first packet), which can degrade the performance of the network.
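The round-robin behavior described above can be sketched as follows. This is only an illustration: the function name round_robin_assign and the representation of a packet as a (source, destination) pair are assumptions made for the example and do not come from the text.

```python
from itertools import cycle


def round_robin_assign(packets, num_ports):
    """Pair each packet with the next port in a repeating 0..num_ports-1 cycle.

    `packets` is a list of hypothetical (source, destination) tuples.
    """
    ports = cycle(range(num_ports))
    return [(packet, next(ports)) for packet in packets]


if __name__ == "__main__":
    # Two packets belonging to the same stream (same source and destination).
    stream = [("10.0.0.1", "10.0.0.2"), ("10.0.0.1", "10.0.0.2")]
    for packet, port in round_robin_assign(stream, num_ports=12):
        print(packet, "-> port", port)
```

Because the port choice ignores the packet contents, the two packets of the single stream in this sketch are sent through different ports, which is the reordering risk noted above.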
Another approach is to apply a hash function to a portion of the packet (e.g., to the source and destination addresses), resulting in a vector R having r bits. One of K output ports associated with a network device (i.e., ports numbered 0 through K−1) is then selected using R modulo K (i.e., the remainder of R/K). For example, a network device with 12 output ports might apply a hash function and generate a six-bit R having a value of 43 for a particular packet of information. In this case, R modulo K (i.e., the remainder of 43/12) is 7, and output port number 7 is selected for that particular packet.
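A minimal sketch of this hash-and-modulo selection is given below. The text does not name a particular hash function, so CRC-32 (zlib.crc32) is used purely as a stand-in; the function names and the string form of the addresses are likewise assumptions made for the example.

```python
import zlib


def port_from_r(r, num_ports):
    """Select a port as R modulo K (the remainder of R/K)."""
    return r % num_ports


def select_port(src, dst, num_ports, r_bits=6):
    """Hash the source and destination addresses to an r-bit R, then take R mod K.

    CRC-32 is an assumed stand-in for the unspecified hash function.
    """
    key = f"{src}->{dst}".encode()
    r = zlib.crc32(key) & ((1 << r_bits) - 1)  # keep only the low r bits
    return port_from_r(r, num_ports)


if __name__ == "__main__":
    # The worked example from the text: R = 43, K = 12.
    print(port_from_r(43, 12))
    # Packets of the same stream always map to the same port.
    print(select_port("10.0.0.1", "10.0.0.2", 12) ==
          select_port("10.0.0.1", "10.0.0.2", 12))
```

Unlike round-robin assignment, the selected port depends only on the hashed fields, so every packet of a given stream is sent through the same port.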
This approach, however, also has a number of disadvantages. For example, the logic circuits needed to perform the division (e.g., dividing by 12 and determining the remainder) can occupy a relatively large area in the processing hardware, and the operation can consume a significant amount of time, especially in a high-speed network.
Moreover, the approach will result in a load balancing error. Consider, for example, a network device with 12 output ports that applies a hash function to generate a five-bit R (i.e., having a value between 0 and 31). In this case, assuming a randomly distributed R, the network device will be more likely to select some ports as compared to others. In particular, ports 0 through 7 can each be selected by three unique R values (e.g., port 2 will be selected if R equals 2, 14, or 26) while ports 8 through 11 can only be selected by two unique R values (e.g., port 10 will only be selected if R equals 10 or 22). In general, the percent of load balancing error will be 100/(2^r div K), where “div” represents an integer division operation.
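The imbalance can be checked by enumerating every possible R value and counting how many of them map to each port. The sketch below does so and also evaluates the error expression given above; the function names are hypothetical, and the code assumes that 2^r is at least K so that the integer division does not yield zero.

```python
from collections import Counter


def r_values_per_port(r_bits, num_ports):
    """Count how many of the 2**r_bits possible R values select each port."""
    counts = Counter(r % num_ports for r in range(1 << r_bits))
    return [counts[port] for port in range(num_ports)]


def load_balancing_error_percent(r_bits, num_ports):
    """Evaluate 100 / (2**r div K); assumes 2**r_bits >= num_ports."""
    return 100 / ((1 << r_bits) // num_ports)


if __name__ == "__main__":
    # Five-bit R (values 0 through 31) and 12 output ports.
    for port, count in enumerate(r_values_per_port(5, 12)):
        print(f"port {port}: {count} R values")
    print(load_balancing_error_percent(5, 12), "percent")
```

For a five-bit R and 12 ports, the expression evaluates to 100/(32 div 12) = 100/2, or 50 percent.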