Memory performance studies often employ address traces, generated during an execution of a program, to analyze cache behavior and its effect on program execution time. Address traces capture the order in which memory locations are accessed during execution; however, these traces typically carry no direct information about the control flow of the program. Architectural studies, on the other hand, use instruction traces, which capture the control flow of a program but contain no address information. Machine simulators often execute or interpret the instructions to obtain the addresses of the locations referenced in the program.
In general, when traces grow too large, storage space is at a premium. If compression and decompression are done off-line (i.e., producing a compressed trace from a given uncompressed trace and vice versa), the space problem is further accentuated, since the uncompressed trace must itself be stored. Furthermore, compressed traces often cannot be segmented, so individual segments cannot be examined or processed concurrently.
When memory traces are compressed, the compression mechanism can capture certain repeated sequences of addresses and fold them into compact representations. Often, however, the mechanism breaks when the sequence is interspersed with occasional references outside the recognized pattern; that is, the memory trace cannot be compressed effectively at these breaks. Such references may be due to conditionals in the program or to loops whose bodies contain a mixture of strided and non-strided references.
Traditionally, the entire program trace has been compressed as a whole, making it extremely difficult to relate values in the compressed trace to the structural components of the program, such as its blocks. Trace analysis thus becomes cumbersome.