Registers are employed by a processor or execution unit to store various data intended for manipulation. Registers are preferred over, for example, system memory for data manipulation in many respects. For example, registers can typically be designated by fewer bits in instructions than locations in system memory require for addressing. In addition, registers have higher bandwidth and shorter access time than most system memories. Furthermore, registers are relatively straightforward to design and test. Thus, modern processor architectures tend to have a relatively large number of registers.
Although performance of a processor/execution unit can generally be improved by increasing the number of registers within the processor, a large number of registers can also present problems. One of these problems is register addressability. If a processor includes a large number of addressable registers, each instruction having one or more register designations would require many bits to be allocated solely for the purpose of addressing registers. For example, if a processor has 32 registers, five bits are needed to select any one of the 32 registers, so a total of 20 bits are required to designate four registers within an instruction. Thus, the maximum number of registers that can be directly accessed within a processor architecture is effectively constrained.
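The arithmetic behind this constraint can be sketched as follows. This is a minimal illustration only: the function names `register_field_bits` and `operand_bits` are hypothetical, and the 32-bit instruction width used for comparison is an assumption chosen to match a fixed-width format such as that of the PowerPC™ architecture.

```python
def register_field_bits(num_registers: int) -> int:
    """Bits needed to name any one of num_registers registers directly."""
    if num_registers < 1:
        raise ValueError("num_registers must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for every n >= 1.
    return (num_registers - 1).bit_length()


def operand_bits(num_registers: int, num_operands: int) -> int:
    """Total instruction bits consumed by the register designations alone."""
    return num_operands * register_field_bits(num_registers)


# Assumed fixed instruction width, used only for comparison.
INSTRUCTION_WIDTH = 32

for regs in (32, 128, 1024):
    used = operand_bits(regs, 4)
    left = INSTRUCTION_WIDTH - used
    print(f"{regs} registers, 4 operands: {used} bits used, {left} bits left")
```

With 32 registers the four designations consume 20 bits, as stated above. With 128 registers they would consume 28 bits, and with 1024 registers they would consume 40 bits, more than the assumed 32-bit instruction provides.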
Indirection is a technique that has been used to access large register files. An indirection mechanism useful for extending an architecture such as the PowerPC™ processor marketed by International Business Machines Corporation should accommodate very large register files and satisfy the following objectives:
  Compatibility with the standard PowerPC™ instruction format;
  Support for existing code without recompilation;
  Sufficient flexibility to support loop unrolling, software pipelining, and related software techniques used to mitigate the effects of long pipeline latencies; and
  Sufficient flexibility to support software techniques for maintaining appropriately large subsets of the working data set in the register file within inner loops.
Prior art indirection mechanisms for accessing large register files fail to meet one or more of the above-mentioned objectives. These prior art indirection mechanisms include:
  Itanium™: employs a technique referred to as “rotating registers” to provide indirect access to contiguous sets of registers from the upper 96 registers in register files with 128 registers. This technique is useful for loop unrolling but not for taking advantage of the large register files in more general ways. (“Intel Itanium™ Architecture Software Developer's Manual”, October 2002.)
  “Register Queues”: similar in some respects to rotating registers, with apparently increased flexibility in defining and establishing access to the contiguous register sets. Because the indirect access is still constrained to sets of contiguous registers, there is insufficient flexibility. (Tyson et al., IEEE Trans. Computers, August 2001.)
  “Register Connection”: appears to be a more general and thus more flexible mechanism for indirect access of large register files than rotating registers and register queues. However, it is limited in that, if used with the PowerPC™ architecture, only 32 registers would be accessible by the instructions issued in any particular cycle, due to the mechanism used to map register names coded in an instruction to actual physical registers in the register file. (Kiyohara et al., in Proc., 1993, ISCA.)
Consequently, it would be desirable to provide an improved apparatus for increasing the ability of a processor to address registers.