An instruction tracing system (ITS) of a processor provides debug features, including a control flow trace that can log the instructions executed by the processor. To make use of such trace information, a trace decoder decodes the trace output and maps the trace events to the code that was executing on the processor. To simplify this process, trace packets are typically emitted in program order, so that a packet generated by a particular instruction follows any packets generated by older instructions and precedes any packets generated by younger instructions. On modern out-of-order (OoO) microarchitectures, a straightforward method of producing packets in program order is to generate the packets at retirement time. Though instructions often execute out of order, they still retire in order, and hence packet generation at retirement time ensures that packets are emitted in program order. In some OoO microarchitectures, however, data accesses, such as loads and stores, may not complete, or even begin, until after retirement time. This complicates trace packet ordering for trace capabilities that attempt to expose information about data accesses, such as the address or data value of an access.
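
The ordering property of retirement-time packet generation can be illustrated with a minimal software sketch. It is not a description of any particular hardware: the reorder buffer model, the function name `trace_at_retirement`, and the instruction labels are hypothetical and chosen only for illustration. The sketch assumes that an instruction may retire only when it is the oldest instruction still in flight and has finished executing, and that one packet is generated per retired instruction.

```python
from collections import deque


def trace_at_retirement(program, completion_order):
    """Return the packet order when packets are generated at retirement.

    program: instruction labels in program order (hypothetical names).
    completion_order: the same labels in the order they finish executing.
    """
    rob = deque(program)  # simplified reorder buffer, oldest entry at the left
    done = set()          # instructions that have finished executing
    packets = []
    for name in completion_order:
        done.add(name)
        # Retire only from the head: a younger instruction that finished
        # early waits until every older instruction has retired.
        while rob and rob[0] in done:
            packets.append(rob.popleft())
    return packets


program = ["i0", "i1", "i2", "i3"]
completion_order = ["i2", "i0", "i3", "i1"]  # execution finishes out of order
packets = trace_at_retirement(program, completion_order)
print(packets)
```

In this sketch the packet list matches program order even though execution completes out of order, because generation is tied to the in-order retirement step. The sketch does not model a data access that completes after retirement, which is the case that the retirement-time approach does not cover by itself.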