The invention relates generally to methods and apparatus for computer register memory and more particularly to mapping registers.
Computer systems use many different types of memory for storing information. Magnetic disks, optical compact disk read-only memory (“CD-ROM”), electronic ROM, and random-access memory (“RAM”) are a few examples of types of computer memory that are relatively inexpensive, in terms of the cost per bit, and capable of storing a relatively large amount of memory. However, computer systems typically have other types of memory that might not be apparent to a casual user.
A processor chip (“processor”) might have memory integrated with the logic processor to enhance operational speed. For example, the processor chip might have one or more levels of cache memory and a register file. The register file is a set of registers that is integrated with the logic processor on the central processing unit (“CPU”) section of the processor chip.
Generally, one level of memory is fed by another. For example, an application might call input that is loaded to the cache memory from the main memory (e.g. off-chip ROM or RAM), and then to the register file. Typically, each level provides faster access than the prior level, the fastest being the register files, which can work at the full internal processor speed. Unfortunately, this speed comes at a price.
Register memory has a relatively large “footprint” compared to main memory, for example, and consumes a relatively large amount of power, so the physical size of the register file is limited. Similarly, how an application can access registers is defined by the processor hardware, and usually results in a fixed relationship between registers. Processor register architectures become fixed when they are designed. This limits software to accessing a fixed number of registers in hardware. However, it is not only the number of registers that are fixed, but the methods of accessing the registers are also fixed. This contributes to an aging of the architecture over time.
This aging can arise from the evolution of software applications and how applications access registers. The growth of modular software, increasing call-chain depths, and decreasing function size are examples of how software has been changing. At the time a processor architecture is defined, certain assumptions are made about the size and number of the registers and how they will be used, such as how many bits in an instruction string will be used to identify a register address (i.e. the “register number”). Using an N-bit register number as a direct index to a register file limits software to accessing 2N registers.
Various techniques, such as register windows, have been developed for accessing other numbers of registers, but these techniques often place adders or non-power-of-two modulo operations in the logic path for physical register index generation. This can create a critical timing path issue for future hardware implementations, or impose additional pipeline stage(s) on future hardware implementations. Similarly, register access methods that attempt to access other numbers of registers with an N-bit register number system by creating multiple register files (of up to 2N registers each), for example, split integer and floating-point register files, typically require additional instruction encodings to address the different register files. Experimenting with other register models, for example to test compilers generating software to run on future hardware implementations, with a fixed set of registers and access methods can be inefficient and difficult. Similar problems or limitations can arise when simulating or emulating register models for other processor architectures, for example, when executing software generated for a “foreign” architecture. Fixed-size register windows have been used in the SPARC™ architecture and variable-size register windows have been used in the IA-64 architecture.
One microprocessor implemented a “frame pointer” register. A fixed block of general-purpose registers was stored in a block of memory starting at a location indicated by the frame pointer, instead of putting the block of registers on-chip. The frame pointer mapped all register references to a block of memory addresses. This was a degenerate case of a register map because the frame pointer contained only one entry, instead of a separate map entry for each register. The frame pointer technique partially addressed the problems of addressing a fixed number of registers in hardware and N-bit register numbers limiting software to 2N registers, but left other problems unresolved. Unfortunately, a frame pointer technique would probably provide unacceptably poor performance in current microprocessors, where memory access is typically orders of magnitude slower than on-chip register access.
Other processor architectures, such as the scalable processor architecture (“SPARC”™) version 8 (“V8”), IA-64, IA-32, and 680X0, have addressed the problem of being limited to 2N registers when using N-bit direct index register numbers by creating separate namespaces for different types of register, for example, integer/general purpose and floating-point name spaces. Given N-bit register number fields in instructions, more than one set of 2N registers can be addressed by restricting register number fields in each instruction to only address one set of 2N registers (i.e. only one register name space at a time). However, this approach still suffers from architecture aging, not allowing applications to optimize register access, and requiring additional instruction encodings to address all the register namespaces.
The hardware technique of register re-naming, as is typically used in out-of-order processor implementations, partially addresses the problems of addressing a fixed number of registers in hardware and N-bit register numbers limiting software to 2N registers, but leaves other problems unresolved. Register re-naming is a hardware technique that is invisible to software and involves use of a register map. SPARC64™, PA-8000™, MIPS R10000™ and PENTIUM™ processors include examples of processors using hardware register re-naming.
In summary, most processor architectures have provided a fixed-size array of general-purpose registers, accessible by a single access method in instructions, namely, direct index into the register array. At least one architecture has added an additional rotating register method of accessing those registers. However, even using rotating registers leaves several problems unresolved.