To increase the processing throughput of microprocessors, processors may hold instructions in a series of caches. One recent form of cache is the trace cache. Rather than storing macro-instructions, as other caches do, the trace cache contains sequences of previously decoded micro-operations (micro-ops) of macro-instructions. The sequence of micro-ops may be stored in a sequence of set and way locations in the trace cache called a trace, where the micro-ops at a given set and way location may be called a traceline or trace element. When the particular macro-instruction is subsequently executed, decoding is not necessary, and the sequence of micro-ops may be accessed from the corresponding trace in the trace cache.
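The arrangement described above may be illustrated with a minimal sketch. It is not taken from any actual processor design: the names `TraceLine`, `TraceCache`, `Decoder`, and `issue`, the choice of two micro-ops per trace line, and the use of a dictionary keyed by (set, way) are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Location = Tuple[int, int]  # hypothetical (set, way) index into the array


@dataclass
class TraceLine:
    """Micro-ops held at one (set, way) location (a trace element)."""
    micro_ops: List[str]
    next_loc: Optional[Location] = None  # None marks the last line of the trace


class TraceCache:
    def __init__(self) -> None:
        self.array: Dict[Location, TraceLine] = {}
        self.heads: Dict[int, Location] = {}  # instruction address -> head location

    def build_trace(self, address: int,
                    lines: List[Tuple[Location, List[str]]]) -> None:
        """Store decoded micro-ops as a chain of trace lines."""
        for i, (loc, ops) in enumerate(lines):
            next_loc = lines[i + 1][0] if i + 1 < len(lines) else None
            self.array[loc] = TraceLine(list(ops), next_loc)
        self.heads[address] = lines[0][0]

    def fetch(self, address: int) -> Optional[List[str]]:
        """Return the stored micro-op sequence, or None on a miss."""
        loc = self.heads.get(address)
        if loc is None:
            return None
        ops: List[str] = []
        while loc is not None:
            line = self.array[loc]
            ops.extend(line.micro_ops)
            loc = line.next_loc
        return ops


class Decoder:
    """Stand-in decoder that counts how often decoding is performed."""
    def __init__(self) -> None:
        self.calls = 0

    def decode(self, macro: str) -> List[str]:
        self.calls += 1
        return [f"{macro}.uop{i}" for i in range(4)]  # assumed 4 micro-ops


def issue(cache: TraceCache, decoder: Decoder, address: int, macro: str,
          free_locs: List[Location]) -> List[str]:
    """Use the trace on a hit; decode and build a trace on a miss."""
    ops = cache.fetch(address)
    if ops is None:
        ops = decoder.decode(macro)
        # Assumed capacity: two micro-ops per trace line.
        lines = [(free_locs.pop(0), ops[i:i + 2]) for i in range(0, len(ops), 2)]
        cache.build_trace(address, lines)
    return ops
```

In this sketch, calling `issue` twice for the same address invokes the decoder only once; the second call reads the micro-ops directly from the stored trace.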
One example of a trace cache is disclosed by Krick et al. in U.S. Pat. No. 6,018,786. This patent discloses how traces in a trace cache may be constructed so that each element of a trace contains pointer data for finding the next address within the trace, and how each tail of a trace may contain pointer data for finding the address of the head of the next trace to be executed. However, there is still room for improvement in such a trace cache implementation. To determine the address of the head of the next trace, the pointer data in the current tail must first be read from the trace cache array. This may cause a time delay called a head lookup penalty. The head lookup penalty may, among other things, require several stall cycles to be placed into the pipeline. These stall cycles represent a waste of resources that could otherwise be used to execute micro-ops.
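The head lookup penalty may be illustrated with a simplified model. The sketch below is an assumption-laden illustration, not the implementation of the cited patent: the names `TraceElement` and `run_traces` are hypothetical, and the value of `HEAD_LOOKUP_STALL_CYCLES` is an arbitrary placeholder, since the text states only that several stall cycles may be required.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Location = Tuple[int, int]  # hypothetical (set, way) index into the array

# Assumed placeholder; the actual number of stall cycles is not specified.
HEAD_LOOKUP_STALL_CYCLES = 2


@dataclass
class TraceElement:
    micro_ops: List[str]
    next_in_trace: Optional[Location] = None  # pointer to next element, if any
    next_head: Optional[Location] = None      # used only in a tail element


def run_traces(array: Dict[Location, TraceElement], start: Location,
               max_traces: int) -> Tuple[List[str], int]:
    """Walk linked traces; return (micro-ops issued, stall cycles incurred)."""
    issued: List[str] = []
    stalls = 0
    traces_done = 0
    loc: Optional[Location] = start
    while loc is not None:
        element = array[loc]
        issued.extend(element.micro_ops)
        if element.next_in_trace is not None:
            loc = element.next_in_trace  # stay within the current trace
            continue
        # Tail reached: the next head is known only after the tail's
        # pointer data has been read from the array.
        traces_done += 1
        if element.next_head is None or traces_done >= max_traces:
            break
        stalls += HEAD_LOOKUP_STALL_CYCLES
        loc = element.next_head
    return issued, stalls


# Two traces: A occupies (0, 0) -> (1, 0); B occupies (2, 1) -> (3, 1).
array = {
    (0, 0): TraceElement(["a0"], next_in_trace=(1, 0)),
    (1, 0): TraceElement(["a1"], next_head=(2, 1)),   # tail of trace A
    (2, 1): TraceElement(["b0"], next_in_trace=(3, 1)),
    (3, 1): TraceElement(["b1"]),                     # tail of trace B
}
issued, stalls = run_traces(array, (0, 0), max_traces=10)
```

Under these assumptions, each transition from a tail to the head of the next trace adds stall cycles during which no micro-ops are issued, while transitions between elements of the same trace are modeled as free.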