A register is a high-speed temporary memory device used to receive, hold, and transfer data (usually a computer word) to be operated upon by a processing unit. Registers provide multiple operands and accept multiple results during each machine cycle. The number of registers defined by a particular architecture is limited by the access time and by the amount of hardware required to support high-speed multi-port access. The number of registers is further constrained by the number of bits in each instruction that are dedicated to register specifiers. Current architectures with 32-bit instructions typically contain three to four 5-bit register specifiers and provide access to 32 registers.
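The relationship between specifier width and register count can be illustrated with a minimal sketch. The field layout used here (a 6-bit opcode followed by three 5-bit specifiers named rd, ra, and rb) is a hypothetical encoding chosen for illustration only and does not correspond to any particular architecture.

```python
REG_BITS = 5                  # width of one register specifier
NUM_REGS = 1 << REG_BITS      # 2**5 = 32 addressable registers
REG_MASK = NUM_REGS - 1       # 0b11111

# Hypothetical field positions: bit offset of each specifier's lowest bit.
OPCODE_SHIFT, RD_SHIFT, RA_SHIFT, RB_SHIFT = 26, 21, 16, 11


def encode(opcode, rd, ra, rb):
    """Pack a 6-bit opcode and three 5-bit register specifiers into one word."""
    for reg in (rd, ra, rb):
        if not 0 <= reg < NUM_REGS:
            raise ValueError("register specifier does not fit in 5 bits")
    return (((opcode & 0x3F) << OPCODE_SHIFT)
            | (rd << RD_SHIFT) | (ra << RA_SHIFT) | (rb << RB_SHIFT))


def decode_registers(word):
    """Extract the three register specifiers from a 32-bit instruction word."""
    return ((word >> RD_SHIFT) & REG_MASK,
            (word >> RA_SHIFT) & REG_MASK,
            (word >> RB_SHIFT) & REG_MASK)
```

Three specifiers consume 15 of the 32 instruction bits in this layout, so widening each specifier by one bit to reach 64 registers would cost three further bits of every instruction.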
The dynamic flow of instructions in a computer can be thought of as a thread of control. The thread "lives" within an address space that contains instructions and data. Historically, a process has had only a single thread of control in each address space. However, to support parallelism, computers are beginning to support multithreading and to allow simultaneous sharing of both instructions and data among different threads.
In multithreading, the cost of communication between threads, as well as the overhead of creating, scheduling, and terminating threads, are critical issues. One system, the Tera Multithreaded Architecture system (Tera MTA), a scalable, shared memory, general purpose parallel computer developed by the Tera Computer Company of Seattle, Washington, provides 128 thread contexts in each processor that share the same computation engine. Each thread context contains its own set of 32 registers. The processor physically contains 4096 registers (128 contexts of 32 registers each) that are indexed by a combination of the register specifier and the identifier of the requesting thread. Creating a new thread involves allocating one of these register sets and starting a parallel (in time) execution at the beginning of the new thread.
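The indexing of a per-thread register file of this kind can be sketched as follows. The text states only that the thread identifier and the register specifier are combined; the concatenation used here (thread identifier in the upper bits, specifier in the lower bits) is an assumption made for illustration, and the function name is hypothetical.

```python
NUM_CONTEXTS = 128        # thread contexts per processor
REGS_PER_CONTEXT = 32     # registers visible to each thread
NUM_PHYSICAL = NUM_CONTEXTS * REGS_PER_CONTEXT   # 4096 physical registers


def physical_index(thread_id, reg_specifier):
    """Map (thread, register specifier) to a physical register index.

    Assumed scheme: the 7-bit thread id forms the upper bits and the
    5-bit specifier the lower bits of a 12-bit physical index.
    """
    if not 0 <= thread_id < NUM_CONTEXTS:
        raise ValueError("thread id out of range")
    if not 0 <= reg_specifier < REGS_PER_CONTEXT:
        raise ValueError("register specifier out of range")
    return thread_id * REGS_PER_CONTEXT + reg_specifier
```

Under this assumption every thread sees registers 0 through 31, while no two threads ever address the same physical register.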
Register renaming is a technique that allows multiple outstanding writes to the same register. This is useful in pipelined machines where different instructions have different execution latencies and can complete out of order. With register renaming, data dependencies can be maintained by hardware and instructions can be issued more aggressively. There are typically 20-25% more physical registers than the architecture exposes to the user. The IBM Power1 has 38 physical registers and 32 architectural registers.
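A simplified model of renaming is sketched below. It borrows only the 32 architectural and 38 physical register counts from the text; the map-plus-free-list mechanism and all names are illustrative assumptions and do not describe the actual Power1 implementation.

```python
from collections import deque


class RenameTable:
    """Toy rename map: architectural register -> current physical register."""

    def __init__(self, num_arch=32, num_phys=38):
        # Initially architectural register i lives in physical register i.
        self.mapping = list(range(num_arch))
        # The surplus physical registers start out free.
        self.free = deque(range(num_arch, num_phys))

    def read(self, arch_reg):
        """Return the physical register holding the latest value."""
        return self.mapping[arch_reg]

    def write(self, arch_reg):
        """Allocate a fresh physical register for a new write.

        Returns (new_phys, old_phys); old_phys may be released once no
        in-flight instruction still needs the previous value.
        """
        if not self.free:
            raise RuntimeError("no free physical register: issue must stall")
        new_phys = self.free.popleft()
        old_phys = self.mapping[arch_reg]
        self.mapping[arch_reg] = new_phys
        return new_phys, old_phys

    def release(self, phys_reg):
        """Return a physical register to the free list."""
        self.free.append(phys_reg)
```

Two back-to-back writes to the same architectural register receive different physical registers in this model, so the second write need not wait for the first to complete.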
In an article entitled "Register Relocation: Flexible Contexts For Multithreading" by Carl Waldspurger et al., published in 1993 in the International Symposium on Computer Architecture, a simple method of providing register renaming utilizing a Register Relocation Mask (RRM) is disclosed. Instruction operands specify context-relative register numbers. The context-relative register numbers are dynamically combined with the RRM to yield an absolute register number, which is used in instruction execution. However, the RRM is not transparent and must be taken into account by users. Typically an RRM restricts the address base and limits the number of registers available to the user. The number of available registers is further limited to a power of 2, and the RRM must be initialized when required.
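The relocation step and its restrictions can be sketched as follows. The text says only that the two values are "dynamically combined"; the bitwise OR used here is an assumption, and the function names are hypothetical. Under that assumption the context size must be a power of 2 and the base must be aligned to that size, which corresponds to the limitations noted above.

```python
def make_rrm(base, size):
    """Build a relocation mask for a context of `size` registers at `base`.

    Assumed constraints for OR-based relocation: size is a power of 2 and
    base is a multiple of size, so the mask and the offset share no bits.
    """
    if size <= 0 or size & (size - 1):
        raise ValueError("context size must be a power of 2")
    if base % size:
        raise ValueError("context base must be aligned to its size")
    return base


def relocate(rrm, context_reg, size):
    """Combine a context-relative register number with the mask (assumed OR)."""
    if not 0 <= context_reg < size:
        raise ValueError("register number outside the context")
    return rrm | context_reg
```

For example, a 16-register context based at register 64 maps context-relative register 5 to absolute register 69, whereas a 12-register context cannot be expressed at all and would have to be rounded up to 16.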
Therefore, there is a need to transparently optimize register set access.