1. Technical Field
The present invention relates generally to an improved data processing system. In particular, the present invention provides a method and apparatus for obtaining performance data in a data processing system. Still more particularly, the present invention provides a method and apparatus for hardware assistance to software tools in obtaining performance data in a data processing system.
2. Description of Related Art
In analyzing and enhancing performance of a data processing system and the applications executing within the data processing system, it is helpful to know which software modules within a data processing system are using system resources. Effective management and enhancement of data processing systems requires knowing how and when various system resources are being used. Performance tools are used to monitor and examine a data processing system to determine resource consumption as various software applications are executing within the data processing system. For example, a performance tool may identify the most frequently executed modules and instructions in a data processing system, or may identify those modules which allocate the largest amount of memory or perform the most I/O requests. Hardware performance tools may be built into the system or added at a later point in time.
Instruction and data address traces are invaluable for workload characterization, evaluation of new architectures, program optimizations, and performance tuning. Two major trace issues are trace collection and storage. Although some current and emerging architectures include hardware support for trace collection, trace compression in hardware is nonexistent or rudimentary. For example, one of the Advanced RISC Machines (ARM) processor cores includes a trace module for tracing the complete pipeline information, and there is an ARM emulator that compresses these traces by replacing each sequence of identical records with a single record and its repetition count.
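The repetition-count idea mentioned above can be illustrated with a minimal run-length sketch. The record format and the function names `rle_encode` and `rle_decode` are assumptions made for illustration only; the actual format used by the ARM emulator is not described here.

```python
from itertools import groupby
from typing import Hashable, List, Tuple


def rle_encode(records: List[Hashable]) -> List[Tuple[Hashable, int]]:
    """Replace each run of identical consecutive records with (record, count)."""
    return [(rec, sum(1 for _ in run)) for rec, run in groupby(records)]


def rle_decode(pairs: List[Tuple[Hashable, int]]) -> List[Hashable]:
    """Expand (record, count) pairs back into the original record sequence."""
    return [rec for rec, count in pairs for _ in range(count)]


if __name__ == "__main__":
    trace = ["A", "A", "A", "B", "A"]
    packed = rle_encode(trace)
    print(packed)
    print(rle_decode(packed) == trace)
```

Only consecutive repeats are merged, so a trace with few identical neighboring records gains little from this scheme.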
Currently, the simplest way to reduce the size of an address trace is to replace an address with the offset from the last address of the same type, such as instruction reference, data read, or data write reference. The Packed Differential Address and Time Stamp (PDATS) algorithm takes this approach one step further. PDATS also stores address offsets between successive references of the same type, but the records in the trace of offsets can have variable lengths, specified in a one-byte record header, and can carry an optional repetition count. The compression overhead is very small, but because the underlying structure of the executed program is not taken into account, the achieved compression is modest.
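The offset-based approach can be sketched as follows. This is a simplified illustration, not the PDATS record format: the reference types "I", "R", and "W", the tuple layout, and the function names `delta_encode` and `delta_decode` are all assumptions, and the one-byte header and variable-length packing are omitted.

```python
from typing import Dict, List, Tuple

Ref = Tuple[str, int]          # (reference type, address); types are hypothetical
Rec = Tuple[str, int, int]     # (reference type, offset, repetition count)


def delta_encode(trace: List[Ref]) -> List[Rec]:
    """Store each address as an offset from the last address of the same type."""
    last: Dict[str, int] = {}
    out: List[Rec] = []
    for kind, addr in trace:
        offset = addr - last.get(kind, 0)   # first reference is offset from 0
        last[kind] = addr
        # Merge with the previous record when type and offset both repeat.
        if out and out[-1][0] == kind and out[-1][1] == offset:
            out[-1] = (kind, offset, out[-1][2] + 1)
        else:
            out.append((kind, offset, 1))
    return out


def delta_decode(records: List[Rec]) -> List[Ref]:
    """Rebuild the original trace by accumulating offsets per reference type."""
    last: Dict[str, int] = {}
    trace: List[Ref] = []
    for kind, offset, count in records:
        for _ in range(count):
            addr = last.get(kind, 0) + offset
            last[kind] = addr
            trace.append((kind, addr))
    return trace


if __name__ == "__main__":
    trace = [("I", 0x1000), ("I", 0x1004), ("I", 0x1008), ("R", 0x2000), ("R", 0x2008)]
    packed = delta_encode(trace)
    print(packed)
    print(delta_decode(packed) == trace)
```

Sequential instruction fetches collapse into one small offset with a repetition count, but the encoder knows nothing about loops or instruction blocks, which is consistent with the modest compression noted above.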
Information about the data addresses may be linked to a corresponding loop, but this approach requires two passes through the trace or code instrumentation. Another currently available approach is to link information about data addresses to an instruction block. One such technique records possible data offsets and numbers of repetitions for each memory referencing instruction in an instruction block. This technique may have very large memory requirements because information about all possible data address offsets for one load or store instruction is kept in a linked list. Hence, it is not suitable for hardware implementation. Our previous approach, stream-based compression (SBC), uses a first-in-first-out (FIFO) buffer of limited size for data address compression, but keeps information about all instruction streams in an unbounded stream table. Because the size of this table is application dependent, this algorithm is also not suitable for hardware implementation.
The size of the structures used for compression can be limited if the compression technique employs a cache-like table for storage. One such solution has been implemented, but it keeps only the last data address together with the corresponding memory referencing instruction, so compression of data addresses is achieved only when the last address is repeated.
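The limitation of keeping only the last data address can be seen in a minimal sketch of a bounded, cache-like table. The class name `LastAddressTable`, the direct-mapped organization, the 4-byte instruction alignment, and the "HIT"/"MISS" output records are assumptions for illustration and do not describe any particular implementation.

```python
from typing import List, Optional, Tuple, Union

Record = Union[Tuple[str], Tuple[str, int]]


class LastAddressTable:
    """Fixed-size table: one entry per slot, holding an instruction
    address (tag) and the last data address that instruction referenced."""

    def __init__(self, num_entries: int = 4) -> None:
        self.num_entries = num_entries
        self.tags: List[Optional[int]] = [None] * num_entries
        self.last_addr: List[Optional[int]] = [None] * num_entries

    def compress(self, pc: int, data_addr: int) -> Record:
        # Assumed direct-mapped indexing with 4-byte aligned instructions.
        idx = (pc >> 2) % self.num_entries
        hit = self.tags[idx] == pc and self.last_addr[idx] == data_addr
        # Always record the most recent instruction and address in the slot.
        self.tags[idx] = pc
        self.last_addr[idx] = data_addr
        # A hit needs no address in the output; a miss must emit it in full.
        return ("HIT",) if hit else ("MISS", data_addr)


if __name__ == "__main__":
    table = LastAddressTable(num_entries=4)
    print(table.compress(0x100, 0x2000))  # first reference: address emitted
    print(table.compress(0x100, 0x2000))  # same address repeated: compressed
    print(table.compress(0x100, 0x2004))  # strided access: address emitted again
```

Because the table size is fixed, the storage cost is bounded regardless of the application, but a load or store that walks through memory with a constant stride never hits, since only an exact repeat of the last address is recognized.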
Therefore, it would be advantageous to have an improved method, apparatus, and computer instructions for compressing data in traces.