Receive-Side Scaling (RSS) was introduced by Microsoft® to resolve the single-processor bottleneck. It allows the receive-side network load from a network adapter to be shared across many processors, thereby enabling packet receive-processing to scale. This is particularly beneficial with modern multi-core processor architectures, whether integrated or distributed. The features of RSS are described in a paper published by Microsoft® in November 2008, entitled “Receive-Side Scaling Enhancements in Windows™ Server 2008”, hereinafter referred to as the Specification, and included herein by reference. According to RSS, the hardware unit puts each received packet through a hash function, and the result determines, by means of a lookup table, which RX queue handles the packet. The hash function is fixed and fully described by Microsoft®, and it further takes a ‘secret key’. The secret key is a number chosen at random, i.e., it differs at each boot of the hardware. This makes the hash result unpredictable to an external entity. This secrecy is useful because, if the hash were known to an external entity, that entity could maliciously craft multiple different packets that are targeted to arrive at the same queue, thus flooding a single queue instead of having the packets spread over all available queues.
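By way of illustration only, the hash-then-lookup step described above can be sketched as follows. The sketch assumes that the hash function of the Specification is a Toeplitz hash over the source address, destination address, source port and destination port of a TCP/IPv4 packet, and that the low-order bits of the hash index the lookup table. The function names, the table size of 128 entries and the count of four queues are hypothetical and are chosen only for the example.

```python
import ipaddress
import secrets

KEY_LEN = 40      # assumed secret-key length in bytes
TABLE_SIZE = 128  # assumed lookup-table size (hypothetical)
NUM_QUEUES = 4    # assumed number of RX queues (hypothetical)


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return a 32-bit Toeplitz hash of `data` under `key`.

    For every set bit of the input, the 32-bit window of the key that
    starts at the same bit offset is XORed into the result.
    """
    key_bits = len(key) * 8
    if key_bits < 32 + len(data) * 8:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for bit_index in range(len(data) * 8):
        byte = data[bit_index // 8]
        if byte & (0x80 >> (bit_index % 8)):
            shift = key_bits - 32 - bit_index
            result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def tcp4_input(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Concatenate the tuple fields in network byte order (assumed field order)."""
    return (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + src_port.to_bytes(2, "big")
        + dst_port.to_bytes(2, "big")
    )


def make_lookup_table(num_queues: int = NUM_QUEUES, size: int = TABLE_SIZE) -> list:
    """Spread the queues round-robin over the lookup-table entries."""
    return [i % num_queues for i in range(size)]


def select_queue(key: bytes, table: list, data: bytes) -> int:
    """Hash the tuple and use the low-order bits to pick a table entry."""
    return table[toeplitz_hash(key, data) % len(table)]


if __name__ == "__main__":
    secret_key = secrets.token_bytes(KEY_LEN)  # fresh random key, e.g. per boot
    table = make_lookup_table()
    packet = tcp4_input("10.0.0.1", "192.0.2.7", 40000, 443)
    print("RX queue:", select_queue(secret_key, table, packet))
```

Because the key is drawn at random, the queue printed for a given tuple changes from run to run, which is the unpredictability that the secret key is meant to provide.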
This RSS technique works well for end-machines, which only receive packets coming from the other side of a connection. However, a problem arises for a network entity that receives both sides of a connection, e.g., a firewall, also referred to as a bump-in-the-network device: the RSS solution does not guarantee that packets having the same tuple will be mapped to the same queue regardless of the direction from which the packets came. FIG. 1 shows a plurality of nodes, N1 110-1, N2 110-2 and N3 110-3, on one side of a network device 120 and a plurality of nodes, Ni 110-i, Nj 110-j and Nk 110-k, on the other side of the network device 120. The network device 120 comprises an RSS router 122 to which the nodes N1 110-1, N2 110-2 and N3 110-3 are connected. The network device 120 further comprises an RSS router 126 to which the nodes Ni 110-i, Nj 110-j and Nk 110-k are connected. Each of the RSS router 122 and the RSS router 126 is implemented according to the Specification. Each of the RSS routers 122 and 126 may be an egress or an ingress router, as may be applicable, and this role may change dynamically. The network device 120 further comprises three queues, Q1 124-1, Q2 124-2 and Q3 124-3. In the case where node N1 is to communicate with node Ni, node N2 with node Nj, and node N3 with node Nk, packets will move through the network device 120. However, due to the different operation of the RSS router 122 and the RSS router 126, the hash will result in a different queue allocation even though the same tuple value is provided. For example, a packet traveling from node N1 may be routed to Q1 while the response packet from node Ni may be routed to Q2, and so on.
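The direction dependence described above can be illustrated with a short sketch. It assumes a Toeplitz hash, a single key shared by both directions, four queues selected by the hash modulo the queue count, and two hypothetical endpoints; none of these values is taken from the Specification. The sketch hashes the forward tuple and the reversed tuple under many random keys and counts how often the two directions land on different queues. A seeded pseudo-random generator is used only to make the run repeatable; an actual secret key would come from a cryptographic source.

```python
import ipaddress
import random


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash: XOR a sliding 32-bit key window per set input bit."""
    key_bits = len(key) * 8
    if key_bits < 32 + len(data) * 8:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for bit_index in range(len(data) * 8):
        if data[bit_index // 8] & (0x80 >> (bit_index % 8)):
            result ^= (key_int >> (key_bits - 32 - bit_index)) & 0xFFFFFFFF
    return result


def tuple_bytes(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bytes:
    """Hash input as seen by the receiver: addresses first, then ports."""
    return (
        ipaddress.IPv4Address(src_ip).packed
        + ipaddress.IPv4Address(dst_ip).packed
        + src_port.to_bytes(2, "big")
        + dst_port.to_bytes(2, "big")
    )


def count_direction_mismatches(trials: int = 1000, num_queues: int = 4, seed: int = 0) -> int:
    """Count keys for which the two directions of one connection pick different queues."""
    rng = random.Random(seed)      # seeded only for repeatability of the sketch
    node_a = ("10.0.0.1", 40000)   # hypothetical endpoint, e.g. node N1
    node_b = ("192.0.2.7", 443)    # hypothetical endpoint, e.g. node Ni
    forward = tuple_bytes(*node_a, *node_b)
    reverse = tuple_bytes(*node_b, *node_a)
    mismatches = 0
    for _ in range(trials):
        key = bytes(rng.getrandbits(8) for _ in range(40))
        q_forward = toeplitz_hash(key, forward) % num_queues
        q_reverse = toeplitz_hash(key, reverse) % num_queues
        if q_forward != q_reverse:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print("keys with mismatched queues:", count_direction_mismatches(), "of 1000")
```

Under these assumptions the two directions agree only by chance, so a device that must see both halves of a connection on one queue cannot rely on the unmodified hash.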
This issue is further described in US patent application 20080077792, entitled “Bidirectional Receive Side Scaling”, by Eric K. Mann. Mann solves the problem of mapping the tuples of both sides of a connection to the same RX queue by changing the hash function, proposing two methods for creating a hash function that yields the same output both for (A1,A2) and for (A2,A1), for any pair of network addresses A1 and A2 that correspond, for example, to N1 and Ni respectively. Mann refers to this as a commutative hash function. However, commercially available network cards have been manufactured to meet the Microsoft® RSS specification, and most of them contain only the hash function specified by Microsoft®. While the RSS specification allows additional hash functions to be offered, in practice it is likely that existing commodity hardware supports only the Microsoft® hash function.
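The commutative property can be illustrated by a minimal sketch. It is not asserted to be either of the two methods proposed by Mann; it merely orders the two endpoints canonically before hashing, so that (A1,A2) and (A2,A1) present identical input to the hash. The function name is hypothetical, and CRC-32 is used only as a convenient stand-in for a hash function; it is unkeyed and is not the hash of the Specification.

```python
import ipaddress
import zlib


def symmetric_flow_hash(a: tuple, b: tuple) -> int:
    """Hash two (packed_ip, port) endpoints so that argument order is irrelevant.

    The endpoints are sorted before serialization, so both directions of a
    connection produce exactly the same byte string and therefore the same hash.
    """
    low, high = sorted((a, b))
    data = (
        low[0] + low[1].to_bytes(2, "big")
        + high[0] + high[1].to_bytes(2, "big")
    )
    return zlib.crc32(data)  # stand-in hash; unsigned 32-bit result


if __name__ == "__main__":
    a1 = (ipaddress.IPv4Address("10.0.0.1").packed, 40000)  # hypothetical A1
    a2 = (ipaddress.IPv4Address("192.0.2.7").packed, 443)   # hypothetical A2
    print(symmetric_flow_hash(a1, a2) == symmetric_flow_hash(a2, a1))
```

Such a function must replace the hash computed by the hardware, which is the limitation noted above for commodity network cards that implement only the Microsoft® hash function.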
It would be advantageous to overcome the deficiencies of the prior art, and especially the asymmetric behavior of the RSS technique when it is used in a network device that is a bump-in-the-network device. It would be further advantageous if the solution avoided security breaches by not exposing the secret key. It would be further advantageous if the solution could be implemented on readily available hardware, preferably without the need to change the hash functions embedded therein.