The present invention relates to data processors and, more particularly, to register mapping schemes for data processors. A major objective of the present invention is to enhance data processor performance through improved register management.
Much of modern technological progress is associated with the increasing functionality and speed of modern data processors. Data processors perform calculations on numerical data in accordance with program instructions. The instructions, the data on which the instructions are to operate, and the intermediate and final results of the operations are stored in registers within the data processor.
Registers are the fastest memory available to a data processor. Calculation throughput is enhanced every time required data is fetched from a register rather than from slower memories. Instructions and series of instructions which use several data values iteratively can tax the capacity of a data processor's registers. Once register capacity is exceeded, execution is delayed while some values are transferred out of the registers to cache or main memory. Execution is further delayed when those values must be returned to the registers.
To minimize external memory operations, large numbers of registers can be used. Prior constraints on the numbers of registers, most notably the scarcity of integrated circuit area, have been ameliorated by advances in semiconductor processing technologies that permit more functionality per unit area. In addition, the trend toward reduced instruction set computing (RISC) has relaxed competition for integrated circuit area by reducing the area required for decoding and executing instructions. Several RISC architectures are described in the textbook VLSI RISC Architecture and Organization by Stephen B. Furber, Marcel Dekker, Inc., New York, N.Y., 1989.
While constraints on the number of registers have been ameliorated, there remain constraints on the number of registers that can be addressed at a time. Larger numbers of registers to be addressed require more instruction bits to be reserved for addressing. This reduces the space in an instruction available for other purposes. Although it is possible to increase instruction size to hold additional address bits, this change adversely affects program length and execution time. The effect of a longer address on a program is amplified by the large number of instructions that require register addresses.
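The addressing cost described above can be made concrete with a little arithmetic. The following is a minimal sketch, not drawn from any particular architecture: the helper names and the three-operand instruction format are assumptions chosen for illustration.

```python
def address_bits(num_registers: int) -> int:
    """Bits needed to give each of num_registers registers a distinct address."""
    if num_registers < 2:
        raise ValueError("need at least two registers to address")
    return (num_registers - 1).bit_length()


def operand_field_bits(num_registers: int, operands: int = 3) -> int:
    """Instruction bits consumed by register addresses.

    Assumes a hypothetical instruction format with `operands` register fields.
    """
    return operands * address_bits(num_registers)


print(operand_field_bits(16))   # 12 bits for three 4-bit register fields
print(operand_field_bits(128))  # 21 bits for three 7-bit register fields
```

Under these assumptions, moving from 16 to 128 directly addressable registers would take 9 more bits out of every three-operand instruction, which is the pressure on instruction space that the paragraph above describes.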
Several schemes for managing registers have been developed so that a large number of registers can be accessed using a short address code. The simplest of these management schemes is bank switching. Registers are arranged in two or more mutually exclusive banks. Only one bank is available for addressing at a time. Bank-switching instructions, which are issued infrequently, select the bank to which register addresses refer. The values in the prior bank are preserved and can be retrieved by switching back to the prior bank. Thus, ready access can be provided to more registers than can be addressed at one time.
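Bank switching as described above can be modeled in a few lines. This is an illustrative sketch only: the class name, the method names, and the default of two 16-register banks are assumptions, not features of any specific processor.

```python
class BankedRegisterFile:
    """Toy model of bank switching: one bank is addressable at a time."""

    def __init__(self, num_banks: int = 2, bank_size: int = 16):
        self.num_banks = num_banks
        self.bank_size = bank_size
        self.regs = [0] * (num_banks * bank_size)  # all physical registers
        self.bank = 0                              # currently selected bank

    def switch_bank(self, bank: int) -> None:
        """Select which bank the short register addresses refer to."""
        if not 0 <= bank < self.num_banks:
            raise ValueError("no such bank")
        self.bank = bank

    def _physical(self, addr: int) -> int:
        """Translate a short address into a physical register index."""
        if not 0 <= addr < self.bank_size:
            raise IndexError("address outside the bank")
        return self.bank * self.bank_size + addr

    def read(self, addr: int) -> int:
        return self.regs[self._physical(addr)]

    def write(self, addr: int, value: int) -> None:
        self.regs[self._physical(addr)] = value


rf = BankedRegisterFile()
rf.write(3, 111)      # address 3 in bank 0
rf.switch_bank(1)
rf.write(3, 222)      # same address, different physical register
rf.switch_bank(0)
print(rf.read(3))     # 111: the value in the prior bank was preserved
```

In this model, 32 registers are reachable with 4-bit addresses, but a value in bank 0 cannot be named while bank 1 is selected.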
Bank switching faces a problem when some of the values in the first bank need to be accessed at the same time as some values in the second bank. In these situations, data must be copied or moved to the second bank. Such data transfers require careful data tracking by the programmer and consume considerable execution time. In some specialized cases, sections of a bank can be switched independently. However, the situations in which this reduces the need for data transfers between banks are very limited.
Windowing is an alternative to banking. The main difference between a window and a bank is that successive windows can overlap, while banks typically do not. In a windowing scheme, the registers are arranged serially and the window is located using a window pointer. If 4-bit addressing is used, 16 registers can be addressed. If the window pointer is incremented by one, the second window shares 15 registers with the first window. If the window pointer is incremented by 8, 8 registers are shared. If it is incremented by 15, 1 register is shared. If it is incremented by 16 or more, no registers are shared.
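The overlap figures given above follow from simple arithmetic on the window pointer. The sketch below checks them by counting shared physical registers; the function names are hypothetical, and the 16-register window size comes from the 4-bit addressing example.

```python
def window(pointer: int, size: int = 16) -> range:
    """Physical registers visible when the window starts at `pointer`."""
    return range(pointer, pointer + size)


def shared_registers(step: int, size: int = 16) -> int:
    """Registers common to a window and the window `step` positions later."""
    return len(set(window(0, size)) & set(window(step, size)))


for step in (1, 8, 15, 16):
    print(step, shared_registers(step))
# 1 15
# 8 8
# 15 1
# 16 0
```

For a non-negative step, the count equals `max(0, size - step)`, which matches the cases listed in the text.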
The overlap flexibility provided by windowing greatly reduces the need to transfer data between groups of registers. However, when a window is moved, the ordinal position of each register remaining in the window changes. Accordingly, the register addresses change, and such changes can require tracking. Furthermore, to accommodate instructions that expect certain data to be at certain addresses, some intrawindow swapping can be required. Once again, this swapping impairs execution throughput.
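The change in register addresses when a window moves can be illustrated as follows. This is a minimal sketch assuming that a register's address is simply its offset from the window pointer; the function name and the 16-register window are illustrative assumptions.

```python
from typing import Optional


def window_address(physical: int, pointer: int, size: int = 16) -> Optional[int]:
    """Address of a physical register within the current window.

    Returns None when the register lies outside the window.
    """
    offset = physical - pointer
    return offset if 0 <= offset < size else None


print(window_address(10, pointer=0))  # 10
print(window_address(10, pointer=8))  # 2: same register, new address
print(window_address(3, pointer=8))   # None: no longer in the window
```

After the pointer advances by 8, the value that was at address 10 is found at address 2, so any instruction that expects it at address 10 needs either updated operands or an intrawindow swap.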
Another scheme that enables a large number of registers to be mapped using a small address is register renaming. Register renaming provides for a group of registers, each of which can be given any register address (name). This technique requires an additional set of map registers to associate an address with each register, greatly increasing the amount of hardware needed. Since the registers are not dedicated to a particular address, the tracking required can become quite difficult. The amount of time required for a program to manage the mapping easily negates the speed advantages of having the additional registers. Thus, the map registers are typically only accessible to the machine, and this scheme is typically only used for recovery from exceptions.
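The role of the map registers can be sketched with a small model in which each short address (name) is looked up in a map before a physical register is accessed. The class, its methods, and the sizes used are assumptions for illustration; actual renaming hardware is considerably more involved.

```python
class RenamedRegisterFile:
    """Toy model of register renaming through a set of map registers."""

    def __init__(self, num_physical: int = 64, num_names: int = 16):
        self.regs = [0] * num_physical
        # Map registers: entry i holds the physical register named i.
        self.map = list(range(num_names))

    def rename(self, name: int, physical: int) -> None:
        """Give the address `name` to a different physical register."""
        if not 0 <= physical < len(self.regs):
            raise IndexError("no such physical register")
        self.map[name] = physical

    def read(self, name: int) -> int:
        return self.regs[self.map[name]]

    def write(self, name: int, value: int) -> None:
        self.regs[self.map[name]] = value


rf = RenamedRegisterFile()
rf.write(2, 7)     # name 2 initially refers to physical register 2
rf.rename(2, 40)   # name 2 now refers to physical register 40
rf.write(2, 9)
rf.rename(2, 2)    # restore the earlier mapping
print(rf.read(2))  # 7: the earlier value was never overwritten
```

The model shows both properties noted above: 64 registers are reachable through 4-bit names, but only by maintaining one map entry per name, and the current meaning of a name depends on the whole history of `rename` calls.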
Thus, available register management schemes frequently require movement of data between registers and complicated tracking, degrading data processor performance. What is needed is a data processor that incorporates a register management scheme that is flexible enough to permit register data to be combined in various ways while inter-register transfers and tracking requirements are minimized.