1. Field of the Invention
The present invention relates to storing instructions in a cache for access by a data processing apparatus. More particularly, this invention relates to identifying particular sequences of instructions being stored in such a cache.
2. Description of the Prior Art
A common goal in the development of data processing apparatuses is to improve performance with respect to older data processing apparatuses. There are various ways in which this can be achieved. One known technique for improving the performance of a data processing apparatus is to introduce one or more new instructions into the instruction set of the data processing apparatus. In particular, a dedicated instruction may be motivated by the intended application of the data processing apparatus, for example an instruction for carrying out a particular mathematical function in a data processing apparatus that must repeatedly carry out that mathematical function. The new data processing apparatus must of course then be configured to understand the new instruction and respond appropriately. Whilst this is generally possible, because an instruction set will often have many “spare” encodings that are unused, this approach has the disadvantage that new software written using the new instruction is not “backwards-compatible” with older data processing apparatuses, which were created before the introduction of the new instruction and are therefore unable to understand it.
Conversely, whilst it may be possible to run software written before the introduction of the new instruction on the new data processing apparatus, this software will not take advantage of the new functionality that the new instruction provides, and hence the performance benefit will not be realised for such “legacy” software. In order to make use of the new functionality it will then be necessary to re-write the old software, substituting the new instruction where appropriate.
Thus, it would be desirable to provide a technique for improving the performance of a new data processing apparatus, without creating backward compatibility problems for new software and without having to rewrite old software to take advantage of the performance improvement.
In the x86 architecture it is known to decode CISC instructions into RISC micro-ops specific to the micro-architecture of the processor core, once the CISC instructions have been fetched from the instruction cache. Similarly, the IBM PowerPC 970 creates internal operations (IOPs) specific to the micro-architecture from the instructions fetched from the instruction cache. Intel's Pentium-4 stores micro-ops in a level-1 cache known as a trace cache. Intel's Pentium-M uses an approach called micro-op fusion after the instruction cache, which allows the two micro-ops into which a single x86 instruction is usually broken to proceed through the evaluation stages as if they were one, to be dispatched separately and finally to be recombined at the retire stage.