As demand for network content increases, the need for greater network bandwidth to handle the demand also continues to increase. Currently, many network-based systems provide multiple processors that can process in parallel to handle the increasing demands for network bandwidth. For example, applications on network devices (such as computer systems connected over a network) can create connections among each other over which they can exchange streams of data in the form of data packets. A data packet is a unit of information transmitted as a discrete entity between devices over the network. To achieve high-speed, high-performance data packet processing, it is common to parallelize the processing so that network devices can execute more than one thread (e.g., a separate stream of packet execution that takes place simultaneously with and independently from other processing) at a time on a multi-processing platform.
Multi-processing is useful when a single task takes a long time to complete and processing packets serially (i.e., one at a time) would slow down the overall packet throughput. In a multi-processing system, packets can be queued to network groups (e.g., data structures that queue data packets that belong to the same network connection) for further processing according to an assignment algorithm. As a result, data packets that belong to a single connection (for example, a Transmission Control Protocol (TCP) connection) are queued to a single network group, and thus are processed by a single network thread. Data packets that belong to different network connections may be processed by different network threads.
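The queuing of packets to network groups can be illustrated with a minimal sketch. The text does not specify the assignment algorithm, so the example below assumes one common choice: hashing the connection's address and port fields and taking the result modulo the number of groups. The names `Packet`, `group_index`, and `enqueue` are hypothetical and used only for illustration.

```python
import zlib
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; only the fields needed for the sketch."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload: bytes


def group_index(pkt: Packet, num_groups: int) -> int:
    # Assumption: the connection is identified by its address/port 4-tuple,
    # and a CRC32 hash of that tuple selects the network group.
    key = f"{pkt.src_ip}:{pkt.src_port}->{pkt.dst_ip}:{pkt.dst_port}".encode()
    return zlib.crc32(key) % num_groups


def enqueue(packets, num_groups: int):
    # Each network group is modeled as a list that preserves arrival order.
    groups = defaultdict(list)
    for pkt in packets:
        groups[group_index(pkt, num_groups)].append(pkt)
    return groups
```

Because the group index depends only on the connection's identifying fields, every packet of a given connection lands in the same group and keeps its arrival order, while packets of other connections may land in other groups.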
However, one problem associated with parallel processors is parallelism efficiency, that is, the performance improvement obtained relative to the number of processors used. The law of diminishing returns dictates that, as the number of processors increases, the incremental gain in performance from each additional processor decreases. Therefore, keeping parallelism efficiency high has been a constant challenge in multiprocessor research.
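The diminishing return can be made concrete with a small calculation. The sketch below assumes Amdahl's law as the performance model, which the text does not name; it defines efficiency as speedup divided by the number of processors. The function names and the 90% parallel fraction in the usage note are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    # Assumed model (Amdahl's law): the serial part of the work does not
    # speed up, and the parallel part is divided evenly among the processors.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def parallel_efficiency(parallel_fraction: float, processors: int) -> float:
    # Efficiency: achieved speedup relative to the number of processors.
    return amdahl_speedup(parallel_fraction, processors) / processors
```

Under this model, with a hypothetical workload that is 90% parallelizable, calling `parallel_efficiency(0.9, n)` for n = 1, 2, 4, 8 yields a steadily decreasing value, which is the effect described above.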
A recent attempt to improve multiprocessor efficiency for network applications focuses on the framework of connectional parallelism. In connectional parallelism, implementations of connection-parallel stacks map operations to groups of connections and permit concurrent processing on independent connection groups, thus treating a group of connections as a unit of concurrency. In particular, each independent connection group is serviced on an independent kernel thread, and each kernel thread may be executed on any one of multiple processors.
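The arrangement described above, in which each connection group is a unit of concurrency serviced by its own thread, can be sketched at user level. This is only an illustration under stated assumptions: it uses Python `threading` threads and `queue.Queue` objects in place of kernel threads, and the function name `run_connection_groups` and the `None` end-of-input marker are hypothetical.

```python
import queue
import threading


def run_connection_groups(groups):
    """groups: dict mapping a group id to the list of packets queued to it."""
    # Each worker appends only to its own result list, so no locking is needed.
    results = {gid: [] for gid in groups}
    queues = {gid: queue.Queue() for gid in groups}

    def worker(gid, q):
        while True:
            pkt = q.get()
            if pkt is None:  # hypothetical end-of-input marker
                break
            results[gid].append(pkt)  # stand-in for real packet processing

    threads = [
        threading.Thread(target=worker, args=(gid, q))
        for gid, q in queues.items()
    ]
    for t in threads:
        t.start()
    for gid, pkts in groups.items():
        for pkt in pkts:
            queues[gid].put(pkt)
        queues[gid].put(None)
    for t in threads:
        t.join()
    return results
```

Packets within a group are processed in order by a single thread, while different groups proceed concurrently and independently of one another.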
Currently, connections are assigned to independent threads of execution either randomly or sequentially (e.g., in round-robin fashion). However, a sequential or random assignment policy may not utilize parallelism efficiently, because network traffic is unpredictable and the resultant traffic load distribution across the available processors may be unpredictable and/or non-uniform.
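A small simulation shows how load-unaware assignment can produce a non-uniform distribution. The sketch below assumes each connection carries a fixed, known traffic load (an illustrative simplification, since real traffic is not known in advance); the function names and the load values in the usage note are hypothetical.

```python
import random


def assign_round_robin(loads, num_threads):
    # Connection i goes to thread i mod num_threads, regardless of its load.
    per_thread = [0] * num_threads
    for i, load in enumerate(loads):
        per_thread[i % num_threads] += load
    return per_thread


def assign_random(loads, num_threads, seed=0):
    # Each connection goes to a uniformly chosen thread, regardless of its load.
    rng = random.Random(seed)
    per_thread = [0] * num_threads
    for load in loads:
        per_thread[rng.randrange(num_threads)] += load
    return per_thread
```

For example, with hypothetical loads `[100, 1, 100, 1]` and two threads, `assign_round_robin` returns `[200, 2]`: both heavy connections land on the same thread while the other thread is nearly idle, even though each thread received the same number of connections.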