A large integrated packet processing device, such as a network flow processor integrated circuit, may receive a packet, store a first part of the packet (for example, the header) in a first memory on the integrated circuit, and store a second part of the packet (for example, the payload) in a second memory. Most analysis and decision-making is done on the header portion of the packet, so the second part of the packet may often be advantageously stored in external memory. When a decision is made to output the packet from the network flow processor integrated circuit, the first part of the packet (the header) can be moved to an egress processing circuit. Similarly, the second part of the packet (the payload) may be moved from external memory to the egress processing circuit. The combined packet can then be output from the network flow processor integrated circuit. If, however, the packet is to pass through the network flow processor more quickly, then the payload is stored in another on-chip memory rather than in external memory. When the packet is to be output from the integrated circuit, the first and second parts of the packet are read from the on-chip memories that stored them, combined in the egress processing circuit, and output from the integrated circuit. In other situations, it may be advantageous to store the various parts of the packet in other ways and places. Techniques and circuits are sought for facilitating the efficient receiving, splitting, storing, processing, reassembling, and outputting of such packets.
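The split-store-reassemble flow described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not a description of any actual circuit: the names `Memory`, `FlowProcessorModel`, `ingress`, `egress`, and the `fast_path` flag are hypothetical, and the 64-byte split point `HEADER_LEN` is an arbitrary value chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

HEADER_LEN = 64  # assumed split point in bytes (hypothetical, for illustration)


@dataclass
class Memory:
    """Toy model of a memory that holds packet parts keyed by packet id."""
    name: str
    cells: Dict[int, bytes] = field(default_factory=dict)

    def write(self, pkt_id: int, data: bytes) -> None:
        self.cells[pkt_id] = data

    def read(self, pkt_id: int) -> bytes:
        # Reading removes the entry, modeling the buffer being freed on egress.
        return self.cells.pop(pkt_id)


class FlowProcessorModel:
    """Hypothetical model: header always on-chip, payload on-chip or external."""

    def __init__(self) -> None:
        self.header_mem = Memory("on-chip header memory")
        self.onchip_payload_mem = Memory("on-chip payload memory")
        self.external_mem = Memory("external memory")
        self.payload_location: Dict[int, Memory] = {}
        self.next_id = 0

    def ingress(self, packet: bytes, fast_path: bool = False) -> int:
        """Split the packet and store each part; return a packet id."""
        pkt_id = self.next_id
        self.next_id += 1
        header, payload = packet[:HEADER_LEN], packet[HEADER_LEN:]
        self.header_mem.write(pkt_id, header)
        # Faster handling keeps the payload on-chip; otherwise it goes external.
        target = self.onchip_payload_mem if fast_path else self.external_mem
        target.write(pkt_id, payload)
        self.payload_location[pkt_id] = target
        return pkt_id

    def egress(self, pkt_id: int) -> bytes:
        """Fetch both parts from wherever they were stored and recombine them."""
        header = self.header_mem.read(pkt_id)
        payload = self.payload_location.pop(pkt_id).read(pkt_id)
        return header + payload


if __name__ == "__main__":
    nfp = FlowProcessorModel()
    pkt = bytes(range(200))
    slow_id = nfp.ingress(pkt)                  # payload held in external memory
    fast_id = nfp.ingress(pkt, fast_path=True)  # payload held in on-chip memory
    assert nfp.egress(slow_id) == pkt
    assert nfp.egress(fast_id) == pkt
```

In this model the egress step does not need to know in advance where the payload was placed; it looks up the recorded location per packet, which loosely mirrors the idea that the storage choice can vary from packet to packet.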