1. Field of the Invention
This invention relates to apparatus and methods for recording trace data in computer systems, and more particularly to apparatus and methods for conserving CPU cache resources when generating and recording trace data.
2. Description of the Related Art
Computer programs or other executables may be designed to generate and store trace data in computer memory or other storage devices. Trace data may include information about significant events that occur in the course of executing a computer program. For example, trace data may identify or include the content of memory addresses, instructions, registers, branches, exceptions, or other similar events occurring during program execution. This information is often helpful for debugging or improving program code, as well as for determining system behavior while a program is executing.
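By way of illustration only, a trace entry of the kind described above might be laid out as a small fixed-size record. The following is a minimal sketch, not a format taken from this disclosure: the 16-byte layout, the event codes, and the names `EventKind`, `encode_record`, and `decode_record` are all hypothetical.

```python
import struct
from enum import IntEnum


class EventKind(IntEnum):
    """Hypothetical event codes; the text does not define any."""
    BRANCH = 1
    EXCEPTION = 2
    REGISTER = 3
    MEMORY = 4


# Assumed 16-byte little-endian layout (illustrative only):
# 4-byte sequence number, 2-byte event kind, 2 reserved bytes, 8-byte payload.
RECORD = struct.Struct("<IHHQ")


def encode_record(seq, kind, payload):
    """Pack one trace entry into its fixed-size binary form."""
    return RECORD.pack(seq, int(kind), 0, payload)


def decode_record(raw):
    """Unpack a trace entry, e.g. when examining trace data after an error."""
    seq, kind, _reserved, payload = RECORD.unpack(raw)
    return seq, EventKind(kind), payload
```

A record this small illustrates why each individual trace write is cheap in isolation; the cost discussed below arises from where the many writes land in memory.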
Although trace data is frequently written to memory, the data is typically not read unless an event such as an error occurs. Upon occurrence of an event, the trace data may be used to determine the state of the computing environment when the event occurred or what other events occurred either before or after the event of interest. Thus, trace data is updated often but seldom read. Furthermore, although the amount of trace data stored at any specific memory location is often small, trace data is often stored at many different locations in memory.
In certain situations, hardware may be used to provide a fixed number of buffers or other mechanisms for storing trace data. Each time an event occurs, trace data corresponding to the event may simply be added to previously gathered trace data in the buffer. This trace data may be periodically flushed from the buffer or other storage mechanism to a long-term storage device.
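The append-and-flush behavior described above can be modeled in software as follows. This is a simplified sketch under stated assumptions: the class name `TraceBuffer` is hypothetical, a Python list stands in for the long-term storage device, and the flush is triggered when the buffer fills, whereas the text says only that flushing is periodic.

```python
class TraceBuffer:
    """Software model of one fixed-capacity trace buffer (illustrative only)."""

    def __init__(self, capacity, sink):
        self.capacity = capacity  # fixed number of entries the buffer holds
        self.sink = sink          # stand-in for long-term storage (a list)
        self.entries = []

    def record(self, entry):
        """Add trace data for one event to the previously gathered data."""
        self.entries.append(entry)
        # Assumption: flush when full; a timer-driven flush would also fit the text.
        if len(self.entries) >= self.capacity:
            self.flush()

    def flush(self):
        """Move all buffered trace data to long-term storage."""
        self.sink.extend(self.entries)
        self.entries.clear()
```

For example, recording seven events into a buffer of capacity three causes two flushes and leaves one entry still buffered.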
Nevertheless, a fixed number of hardware buffers may limit the ability to store and process trace data. Furthermore, providing additional buffers is expensive and is not necessarily an effective way to process trace data. For example, some software may include control structures of a few hundred to a few thousand bytes in length. In a storage system or communication system, there may be thousands of these structures, and hundreds or even thousands may be active concurrently. Each structure may generate some trace or other data which is almost never read.
The trace data generated by these structures has the undesirable effect of filling the L1, L2, or even L3 cache with data that is unlikely to be read. This data must normally age out like other data in the cache. The consequence is lower L1 and L2 hit ratios and substantially reduced processor performance. Furthermore, performing these writes with cache-inhibited mechanisms is also unacceptable because standard microprocessors will perform such operations one word at a time on the external bus, thereby increasing system overhead significantly.
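The scale of the cache pollution described above can be estimated with simple arithmetic. The sketch below is illustrative only: the 64-byte cache line and 32 KiB L1 cache are assumed values not given in the text, the function name `trace_cache_footprint` is hypothetical, and the model assumes each active structure's trace write touches one distinct cache line.

```python
def trace_cache_footprint(active_structures, line_size=64, cache_size=32 * 1024):
    """Estimate cache space consumed by trace writes.

    Returns (bytes of cache lines touched, that amount as a fraction of the cache).
    Assumes one distinct cache line is touched per active structure.
    """
    # A write of even a few bytes occupies a whole cache line.
    footprint = active_structures * line_size
    return footprint, footprint / cache_size


# 1000 concurrently active structures, one trace write each:
footprint, fraction = trace_cache_footprint(1000)
print(footprint, fraction)  # 64000 1.953125
```

Under these assumptions, the trace writes of 1000 active structures touch 64,000 bytes of cache lines, nearly twice the capacity of the assumed L1 cache, even though the trace data itself may amount to only a few bytes per structure.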
In view of the foregoing, what are needed are improved apparatus and methods for recording trace data in computer systems. Specifically, apparatus and methods are needed for conserving the resources of a CPU's cache when generating and recording trace data.