For maximum efficiency, network processors may use multi-threading to process packet data. Packet data processing typically involves writes to and reads from external memory, and the latency of these accesses can leave execution resources idle. Multi-threading can hide the latency of one thread's external memory reference behind the execution cycles of other threads, but only when the total execution time of those other threads is at least as great as the latency of the external memory reference. Quite often, however, the external memory access time exceeds the total execution time of the other threads. Because execution time shrinks relative to external memory latency as network processor clock rates increase, this problem will only worsen in the future.
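
By way of illustration only, the latency-hiding condition described above can be sketched in Python. The function names, the thread count, and all cycle and clock figures are hypothetical and are not taken from any particular network processor. The model also assumes that each of the other threads runs for a fixed number of cycles before yielding and that the external memory latency is constant in nanoseconds.

```python
def latency_in_cycles(latency_ns: int, clock_mhz: int) -> int:
    """Convert a fixed memory latency in nanoseconds to processor cycles.

    Assumption: the latency in nanoseconds does not change with clock rate,
    so the same access costs more cycles on a faster processor.
    """
    return latency_ns * clock_mhz // 1000


def stall_cycles(exec_cycles_per_thread: int, num_threads: int,
                 mem_latency_cycles: int) -> int:
    """Return the cycles one memory reference leaves unhidden.

    While one thread waits on external memory, the remaining
    (num_threads - 1) threads each execute for exec_cycles_per_thread
    cycles. The latency is fully hidden only if their combined execution
    time is at least as great as the memory latency.
    """
    cover = exec_cycles_per_thread * (num_threads - 1)
    return max(0, mem_latency_cycles - cover)


# Hypothetical figures: 8 threads, 20 execution cycles each, 100 ns memory.
for clock_mhz in (1000, 3000):
    latency = latency_in_cycles(100, clock_mhz)
    print(clock_mhz, "MHz:", latency, "cycle latency,",
          stall_cycles(20, 8, latency), "stall cycles")
```

Under these assumed figures, the seven other threads supply 140 cycles of work, which fully covers a 100-cycle reference at 1000 MHz but leaves 160 cycles exposed when the same 100 ns access costs 300 cycles at 3000 MHz.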