It is known that computer systems (e.g., mainframes, personal computers, microprocessors) may be designed to execute instructions from one or more instruction sets. In a computer system designed to execute instructions from more than one instruction set, a first instruction set might, for example, be optimized for fast execution on a target system. However, instructions from this first set might have a relatively wide format (e.g., 32 or 64 bits in width) and therefore use a relatively large amount of memory space for storage. Hence, a second instruction set could be made available that is optimized for using less memory space through the use of a narrower instruction width format (e.g., 8 or 16 bits in width). Routines built from such instructions may execute more slowly than those built from the first instruction set (because more, and possibly different, instructions are required to carry out the same function), but the narrower format contributes to a potential reduction in overall memory space required.
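The trade-off between instruction width and memory space can be illustrated with a short calculation. The following is a minimal sketch only: the function name code_size_bytes and the instruction counts (100 wide instructions versus 130 narrow instructions for the same routine) are assumptions chosen for illustration and are not taken from any particular instruction set.

```python
def code_size_bytes(instruction_count, width_bits):
    """Storage needed for a routine made of fixed-width instructions."""
    return instruction_count * width_bits // 8

# Assumed figures for one routine, chosen only for illustration.
wide_size = code_size_bytes(100, 32)    # 100 instructions, 32 bits each
narrow_size = code_size_bytes(130, 16)  # 130 instructions, 16 bits each

print(wide_size, narrow_size)  # 400 260
```

Under these assumed figures the narrow version needs 30 more instructions yet occupies less memory, which is the kind of reduction the narrower format is intended to provide.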
Additionally, a third instruction set could be made available to provide backward compatibility with earlier-generation machines that, again, may utilize instruction width formats of differing size (e.g., older 16-bit machines). Moreover, a fourth (or further) instruction set could be made available to provide upward compatibility with new developments in instruction sets that may also require different instruction width formats (e.g., 8-bit Java bytecodes). The foregoing examples, of course, are not exhaustive.
In order for a single computer system to support different instruction sets as described above, the system must be able to accommodate instruction sets having potentially different instruction width formats. One way that such capability has been achieved in the past is by mapping one instruction set onto another, which allows a single decoder to be used for the different instruction width formats. Such mapping is possible, for example, where one instruction set is a subset of the other. However, this approach is significantly limited because most instruction sets are not so related.
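Mapping one instruction set onto another so that a single decoder can be used may be sketched as follows. This is a minimal illustration under stated assumptions: both bit layouts, the opcode table NARROW_TO_WIDE_OPCODE, and the functions map_narrow_to_wide and decode_wide are hypothetical and do not correspond to any real instruction set.

```python
# Hypothetical encodings, invented for illustration only.
# Wide (32-bit) format:   opcode[31:24] rd[23:16] rs[15:8] imm[7:0]
# Narrow (16-bit) format: opcode[15:12] rd[11:8]  rs[7:4]  imm[3:0]

# Every narrow opcode has a wide counterpart, i.e. the narrow set is
# a subset of the wide set (assumed opcodes: ADD, SUB, LOAD).
NARROW_TO_WIDE_OPCODE = {0x1: 0x10, 0x2: 0x20, 0x3: 0x30}

def map_narrow_to_wide(narrow):
    """Expand a 16-bit instruction into its 32-bit equivalent."""
    opcode = (narrow >> 12) & 0xF
    rd = (narrow >> 8) & 0xF
    rs = (narrow >> 4) & 0xF
    imm = narrow & 0xF
    if opcode not in NARROW_TO_WIDE_OPCODE:
        raise ValueError("narrow opcode has no wide equivalent")
    return (NARROW_TO_WIDE_OPCODE[opcode] << 24) | (rd << 16) | (rs << 8) | imm

def decode_wide(wide):
    """The single decoder: it only understands the wide format."""
    return {
        "opcode": (wide >> 24) & 0xFF,
        "rd": (wide >> 16) & 0xFF,
        "rs": (wide >> 8) & 0xFF,
        "imm": wide & 0xFF,
    }

wide = map_narrow_to_wide(0x1234)
print(hex(wide))  # 0x10020304
```

The ValueError branch marks the limitation noted above: an instruction with no counterpart in the other set cannot be mapped, so the single-decoder approach only works when the two sets are suitably related.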
Moreover, this issue is made more complex in computer systems that simultaneously fetch a plurality of instructions for processing. Mapping may be achieved in such a system through a series of operations carried out in one or more pipeline stages of a pipelined processor. These operations include reading a plurality of instructions from a cache memory, processing such instructions by comparing the tags of each instruction, selecting a desired instruction from the plurality (based on the tag comparison), and then mapping the desired instruction. However, because these operations are performed one after another in such a serial mapping method, the processing of these instructions results in an increased branch penalty and/or cycle time.
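The serial sequence of operations described above (read, tag compare, select, map) may be modeled in software as follows. This is a simplified sketch, not a description of any actual hardware: the representation of the fetched instructions as (tag, instruction) pairs, the function fetch_serial, and the placeholder mapper expand are all assumptions made for illustration.

```python
def expand(narrow):
    """Placeholder mapper (hypothetical): widen a 16-bit word to 32 bits."""
    return (narrow & 0xFFFF) << 16

def fetch_serial(cache_entries, wanted_tag, mapper):
    """Read, tag-compare, select, then map, strictly one step after another."""
    # Step 1: read the plurality of instructions from the cache.
    entries = list(cache_entries)
    # Step 2: compare the tag of each instruction with the wanted tag.
    hits = [tag == wanted_tag for tag, _ in entries]
    # Step 3: select the desired instruction based on the tag comparison.
    if True not in hits:
        return None  # cache miss
    selected = entries[hits.index(True)][1]
    # Step 4: map only the selected instruction.
    return mapper(selected)

entries = [(0xA, 0x1111), (0xB, 0x2222)]
print(hex(fetch_serial(entries, 0xB, expand)))  # 0x22220000
```

In this model the mapping in step 4 cannot begin until the selection in step 3 has completed, so the delays of all four steps add up along a single path, which corresponds to the increased branch penalty and/or cycle time noted above.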
Therefore, what is needed is a more efficient way of processing instructions for execution by a processor of a computer system.