(1) Field of the Invention
The present invention relates to the field of microprocessors. More specifically, the present invention relates to register stacks in microprocessors.
(2) Description of Related Art
A register stack architecture allows multiple procedures to efficiently share a large register file, by stacking procedure call frames in registers. Registers may be used by procedures for holding intermediate results, address indexing, passing parameters between calling and called procedures such as subroutines, etc.
In most modern microprocessors with "non-stacked" register architectures, the overhead of saving and restoring registers on procedure calls and returns limits the performance of the microprocessor or computer system. Since the call/return patterns of typical applications exhibit high call/return frequencies with small amplitudes, the hysteresis of a stacked register file significantly reduces the number of stores at procedure calls (register spills) and loads at procedure returns (register fills). Processor frequency is increasing faster than the access time to random access memory (RAM) is decreasing, so reducing the number of memory accesses performed by a program will result in a performance improvement in most computer systems.
While register stacking reduces the number of register spill/fill operations, programs with deep procedure call chains may exhaust the available registers: procedure calls may cause register stack overflows, while returns may cause underflows. Traditional processor architectures define overflow/underflow traps that vector to software overflow/underflow handlers, which spill or fill registers in order to make room in the register stack. However, these techniques may slow down execution, because the program must stop while the overflow/underflow handlers do their job.
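The trap-based approach described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the class name `TrapBasedRegisterStack`, its methods, and the choice to spill and fill whole frames (rather than individual registers) are hypothetical and are not taken from any particular architecture.

```python
class TrapBasedRegisterStack:
    """Toy model of a stacked register file with software spill/fill handlers.

    Hypothetical simplification: registers are moved one whole frame at a time.
    """

    def __init__(self, capacity):
        self.capacity = capacity   # number of physical stacked registers
        self.resident = []         # frames currently held in registers, oldest first
        self.memory = []           # frames spilled to the backing store (LIFO)
        self.handler_moves = 0     # frames moved by overflow/underflow handlers

    def _used(self):
        return sum(len(frame) for frame in self.resident)

    def call(self, frame):
        """Allocate a frame for a called procedure, spilling on overflow."""
        if len(frame) > self.capacity:
            raise ValueError("frame larger than the register file")
        while self._used() + len(frame) > self.capacity:
            # Overflow: the program stalls while the oldest frame is spilled.
            self.memory.append(self.resident.pop(0))
            self.handler_moves += 1
        self.resident.append(list(frame))

    def ret(self):
        """Release the current frame, filling the caller's frame on underflow."""
        self.resident.pop()
        if not self.resident and self.memory:
            # Underflow: the caller's frame must be filled back from memory.
            self.resident.append(self.memory.pop())
            self.handler_moves += 1
```

In this model every handler move happens while the program is stopped, which is the cost the passage above attributes to trap-based handling.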
Consequently, it is desirable to provide an apparatus and a method that use excess processor memory bandwidth to dynamically spill/fill registers from the stacked register file to a backing store in memory concurrently with program execution, such that spilling or filling may operate in parallel with the processor's execution of instructions. In such an environment, it is desirable to provide a way of "synchronizing" the spilling and filling of registers with the processor's execution of instructions when a switch from a source context to a target context is required, so that it is possible to return to the source context and resume operation there as if no context switch had occurred. It is also desirable to provide a way of efficiently saving and restoring the contents of the stacked registers of the stacked register file upon interrupt and return from interrupt, respectively.
The present invention provides a processor configured to execute a programmed flow of instructions. The processor includes a register stack (RS). The RS has a portion allocated for dirty registers. The processor also includes a register stack engine (RSE) to exchange information between the RS and a storage area, in one of an instruction execution dependent mode and an instruction execution independent mode. The processor also includes a flush control circuit to generate a signal to the RSE, dependent on instruction execution, in response to which the RSE spills all dirty registers to the storage area.
The present invention also provides a computer implemented method in a processor. The processor includes a register stack (RS) device that includes a portion allocated for dirty registers. The portion is defined by first and second physical register numbers. The processor further includes a register stack engine (RSE) to exchange information between a storage area and the RS, in one of an instruction execution dependent mode and an instruction execution independent mode. The storage area is defined by first and second pointers. At step a, it is determined whether the first and second physical register numbers have a predetermined logical relationship relative to each other. At step b, if the first and second physical register numbers have the predetermined logical relationship, the RSE stores a register of the portion of the RS to a first location in the storage area corresponding to the first pointer. At step c, the first pointer is caused to point to a next location in the storage area and the first physical register number is incremented.
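Steps a through c can be sketched in a few lines. The following is a minimal illustration, not the claimed implementation. It assumes that the "predetermined logical relationship" is simple inequality of the two physical register numbers, that register numbers wrap around a circular register file, and that the storage area is addressed by an integer pointer. The class `RegisterStackModel` and all of its field and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RegisterStackModel:
    """Toy model of the dirty portion of a register stack (hypothetical names)."""
    regs: list                                  # physical stacked registers
    first: int                                  # first physical register number (oldest dirty)
    second: int                                 # second physical register number (end of dirty portion)
    store: dict = field(default_factory=dict)   # storage area: location -> value
    store_ptr: int = 0                          # first pointer into the storage area

    def has_dirty(self):
        # Step a: test the relationship (assumed here to be "not equal").
        return self.first != self.second

    def spill_one(self):
        """Spill a single dirty register; return False if none remain."""
        if not self.has_dirty():
            return False
        # Step b: store the register at the location given by the first pointer.
        self.store[self.store_ptr] = self.regs[self.first]
        # Step c: advance the pointer and increment the register number.
        self.store_ptr += 1
        self.first = (self.first + 1) % len(self.regs)   # wrap-around is an assumption
        return True

    def flush(self):
        """Repeat steps a to c until no dirty registers remain."""
        count = 0
        while self.spill_one():
            count += 1
        return count
```

For example, with four physical registers holding 10, 11, 12 and 13, `first=2` and `second=1`, calling `flush()` spills registers 2, 3 and 0 in that order and leaves the two register numbers equal. Repeating the steps in this way is one possible reading of how all dirty registers could be spilled in response to a flush signal.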