1. Field of the Invention
The invention relates to register usage in a processor, and in particular, the expansion of a register set through the use of stacks and/or queues.
2. Description of the Related Art
Computer systems typically include, amongst other things, a memory system and one or more processors and/or execution units. The memory system serves as a repository of information, while a processor reads information from the memory system, operates on the information, and stores results to the memory system.
Processors have a large number of internal registers, with the objective of providing enough registers that most program data can be supplied from this high-speed, local storage. Register usage is an important resource allocation issue for compilers. A compiler is responsible for translating a high-level-language program into code that can be efficiently executed by the processor. This requires that the compiler allocate registers to program variables to reduce the communication with the memory system. In general, the goals of register allocation and of software scheduling are at odds with one another. The register allocator wants to allocate as few registers to as much data as possible to decrease the possibility that there will not be enough registers. On the other hand, the scheduler wants to maintain as many independent computations as possible, meaning that additional registers are needed to store the intermediate results of parallel computations.
When internal registers are full, operands and results typically stored locally must be stored in the memory system. However, memory access is much slower than register-to-register operations. Computer performance can be greatly enhanced if unnecessary memory accesses can be eliminated and faster internal register operations can be utilized.
Processor speeds and parallelism continue to increase, also causing local storage requirements to increase. An efficient compiler will produce more parallel operations to keep the processor at optimum performance. However, each of these parallel operations requires storage for operands and results. Again, when internal registers are full, operands and results typically stored locally must be stored in the memory system.
A simple solution to enhance computer performance would be to add additional internal registers. Unfortunately, the number of internal registers available for local storage is often limited by the instruction set. An instruction typically includes an opcode to identify the instruction, several register identification fields for identifying registers to supply operands and store results, and occasionally an immediate value field to supply a constant value as an operand. Typically, register identification fields are limited to a small finite number of bits limiting the overall number of unique register identifiers. For example, a single 5 bit register identifier field in an instruction used to identify a specific internal register limits the architecture to a maximum of 32 internal registers. Modification of the instruction set to expand the number of bits in the register identifier field could be performed, but this solution would break backward compatibility with older versions of software. In other words, the expanded register identifier field would result in previous generation code that could not be executed on new processors. Work-arounds are available, but often involve an operating system to trap on certain conditions, introducing significant overhead in processing time and memory space.
Another possible solution would be to utilize new opcodes that identify additional internal registers. By using several values of the opcode, bits of the opcode can be utilized to identify each new register. However, opcodes are limited to a certain number of bits, limiting the total number of available values and therefore instructions. Using opcode space to address new internal registers is an undesirable solution because a large portion of the limited instruction encoding values must be used.
A solution is needed to provide additional internal registers to a processor architecture without breaking backward compatibility and without utilizing large amounts of opcode space.