As is known in the art, data forwarding devices, such as routers, process incoming packets at relatively high line rates, e.g., OC-192 (10 Gbps). Data forwarding devices can include network processors, such as the multi-core, single-die IXP 1200 network processor by Intel Corporation. In network processors having multiple processing elements, header information for a received packet is sent to a processing thread that classifies the packet and modifies the network state according to various algorithms. These algorithms operate on data structures that are shared by packets in the same flow, and the shared data structures should be accessed in packet arrival order. It can therefore be difficult to efficiently transfer control and data from the thread processing one packet to the thread processing the next packet belonging to the same flow. For example, a network processor may include sixteen processing elements that must exchange control and/or data.
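By way of illustration only, the ordering constraint described above can be sketched in software. The following Python sketch assumes each packet of a flow is tagged with a sequence number on arrival and that a thread may update the shared per-flow state only when its packet's number is the next one expected. The names `OrderedFlowState`, `update`, and `run_demo` are hypothetical and are not taken from any network processor's programming interface; a condition variable stands in for whatever signaling mechanism the hardware provides.

```python
import threading


class OrderedFlowState:
    """Hypothetical per-flow state that admits updates strictly in arrival order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_seq = 0   # sequence number allowed to update next
        self.byte_count = 0  # example of state shared by packets in the flow
        self.order = []      # records the order in which updates were applied

    def update(self, seq, length):
        with self._cond:
            # Block until every earlier packet of this flow has been applied.
            while seq != self._next_seq:
                self._cond.wait()
            self.byte_count += length
            self.order.append(seq)
            # Pass control to the thread holding the next packet of the flow.
            self._next_seq += 1
            self._cond.notify_all()


def run_demo(num_packets=16):
    flow = OrderedFlowState()
    # Start threads in reverse order so they tend to reach the shared state
    # out of arrival order; the sequence check restores the correct order.
    threads = [
        threading.Thread(target=flow.update, args=(seq, 64))
        for seq in reversed(range(num_packets))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return flow
```

In this sketch every waiting thread is woken on each hand-off and all but one go back to waiting, which hints at why an efficient transfer of control and data between, for example, sixteen processing elements is difficult to achieve.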