1. Field of the Invention
The field of the invention relates to data processing and in particular to processing in a processing pipeline that uses register renaming.
2. Description of the Prior Art
In order to improve the power consumption and operational performance of a microelectronic system, systems have been developed that increasingly try to use reduced operational voltage levels to reduce their power consumption, and/or increased clocking frequency to increase their speed.
If the operational voltage is reduced by too much or the clocking speed is increased by too much then errors can occur, in that a signal may not reach the output of the processing stage during the required clock cycle(s) and thus the previous value is output rather than the current value. In order to prevent a system from being over-tuned by reducing the voltage and/or increasing the clocking speed by too much, a processor will have optimal operating conditions that are judged to be safe in that errors are avoided.
An alternative approach is a razor-based system produced by ARM® Limited of Cambridge England. This is a system that is designed to operate at a point beyond this estimated absolute safe limit. The system has an error detection and recovery means to recover from cases where the signal does not reach the output in time. This system has a speculation region at the end of the clock cycle during which the output signal is measured to see if it is stable. Thus, provided any output signal attains its final value within this region, this will be detected, and if this final value is not the value at the end of the clock cycle, this can be determined and corrected. As it has this error detecting and correcting capability, the system does not need to put safety margins into the clock frequency and operational voltage. In effect it can tune its operational voltage and/or frequency to be in a range where errors are unlikely but may occur. A disadvantage of such a system is that as the error signal is initially metastable it needs to be clocked twice before it can be read. This may require the pipeline to be increased by two clocked stages, so that an error is detected before the erroneous instruction completes and writes data back. This is expensive in area and power consumption, requiring paths to send signals to stall the pipeline.
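By way of illustration only, the late-arrival detection described above can be sketched as follows. This is a simplified behavioural model and not the circuit of any actual razor-based system: it assumes that the output is sampled once at the end of the clock cycle and once more at the end of the speculation region, and that an error is flagged when the two samples differ. The function names and the parameters `arrival`, `period` and `window` are hypothetical.

```python
def sample(old, new, arrival, t):
    """Value seen at time t for a signal that settles to `new` at `arrival`."""
    return new if t >= arrival else old


def razor_capture(old, new, arrival, period, window):
    """Return (value used, error flag) for one stage output.

    Assumption: a main sample is taken at the clock edge (t = period) and
    a delayed sample at the end of the speculation region
    (t = period + window). A mismatch means the signal arrived late, and
    the delayed sample is taken as the corrected value.
    """
    main = sample(old, new, arrival, period)
    delayed = sample(old, new, arrival, period + window)
    error = main != delayed
    return (delayed if error else main), error


on_time = razor_capture(old=0, new=1, arrival=0.8, period=1.0, window=0.2)
late = razor_capture(old=0, new=1, arrival=1.1, period=1.0, window=0.2)
too_late = razor_capture(old=0, new=1, arrival=1.5, period=1.0, window=0.2)
```

In this model a signal that settles after the speculation region (`too_late`) is not detected at all, which is why the operating point is still tuned so that errors are unlikely.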
Another way to try to increase performance is to speculatively execute instructions, such that rather than waiting to issue these instructions until it is known that they are to be executed and that they can complete, these instructions are issued early when it is speculated that they will need to be executed and that they will complete. Although such speculation can increase the performance of a processor, in order to be able to recover from instructions that do not complete normally, some sort of history of execution needs to be saved to enable the processor to wind back to the point at which the incorrect speculation started. In some processors information regarding the processing of pending instructions is stored in an exception table, and when it is certain that an exception instruction will not generate an exception it can be retired from the table.
It is also known to provide processors which process instructions from an instruction set specifying an architectural set of registers using a physical set of registers that is larger than the architectural set. This is a technique that has been developed to try to avoid resource conflicts due to instructions executing out of order in the processor. In order to have compact instruction encodings most processor instruction sets have a small set of register locations that can be directly named. These are often referred to as the architecture registers and in many ARM® (registered trade mark of ARM Ltd Cambridge UK) RISC instruction sets there will be 32 architecture registers.
When instructions are processed, different instructions take different amounts of time. In order to speed up execution times, processors may have multiple execution units, or may perform out of order execution. This can cause problems if the data used by these instructions is stored in a very limited register set, as a value stored in one register may be overwritten before it is used by another instruction. This leads to errors. In order to address this problem it is known for some processing cores to perform processing using more registers than are specified in the instruction set. Thus, for example, a core may have 56 physical registers to process an instruction set having 32 architecture registers. This enables a core to store values in more registers than is specified by the instruction set and can enable a value needed by an instruction that takes a long time to be executed to be stored in a register not used by other neighbouring instructions. In order to be able to do this the core needs to “rename” the registers referred to in the instruction so that they refer to the physical registers in the core. In other words an architectural register referred to in the instruction is remapped onto a physical register that is actually present on the core. Details of known ways of doing this can be found in the Wikipedia article “Register renaming” at http://en.wikipedia.org/wiki/Register_renaming.
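The remapping described above may be illustrated by the following minimal sketch, which uses the example figures of 32 architecture registers and 56 physical registers. The class `RenameTable`, its free list and the identity mapping at start-up are assumptions made for illustration; the sketch never returns physical registers to the free list, which a real core would have to do when instructions retire.

```python
from collections import deque


class RenameTable:
    """Maps architecture register numbers onto physical register numbers."""

    def __init__(self, num_arch=32, num_phys=56):
        # Assumption: architecture register i starts in physical register i.
        self.mapping = list(range(num_arch))
        # The remaining physical registers are available for renaming.
        self.free = deque(range(num_arch, num_phys))

    def rename(self, dest, srcs):
        """Rename one instruction; return (physical dest, physical sources)."""
        # Sources are read through the current mapping first.
        phys_srcs = [self.mapping[s] for s in srcs]
        if not self.free:
            raise RuntimeError("no free physical register; a core would stall")
        # The destination always receives a fresh physical register.
        phys_dest = self.free.popleft()
        self.mapping[dest] = phys_dest
        return phys_dest, phys_srcs


table = RenameTable()
slow = table.rename(dest=1, srcs=[2, 3])  # r1 = r2 op r3, long latency
fast = table.rename(dest=2, srcs=[4, 5])  # r2 = r4 op r5, may finish first
```

Here the later instruction writes architecture register 2 into physical register 33, so the value that the slower instruction still needs from physical register 2 is not overwritten even if the two complete out of order.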
Renaming of the registers is generally done using a renaming table which maps registers from the architecture set of registers to registers in the physical set for a particular instruction. As the remapping is dependent on the decoded instruction being executed and the renaming occurs early in the processing, a problem can arise if an exception occurs during processing of the decoded instruction. In some processors the renaming table is referred to as the future table as it remaps decoded instructions that are yet to be processed. Thus, a register renaming core has to take special care when speculating over potential exception points such as branch instructions or memory access instructions, as if processing of the instruction creates an exception the remapping information for that decoded instruction and for subsequent decoded instructions is no longer available. In the case of such an exception or misprediction the core has to be able to recover the architectural state of its register bank.
This problem has been addressed in a number of ways in the prior art. In some cases register bank checkpoints are used and the register renaming or mapping tables are duplicated whenever an unresolved exception instruction is encountered. When the processed exception instruction is resolved as not creating an exception then the duplicated renaming table for that instruction (checkpoint) can be deleted. If the processed instruction creates an exception then the decoded instruction can be replayed as the appropriate renaming table has been stored. This duplication of the register renaming table is very expensive in storage space.
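The checkpointing approach described above may be sketched as follows. This is an illustrative model only: the class `CheckpointedRenamer`, the string tags used to identify exception instructions, and the decision to copy the free list along with the mapping are all assumptions, and the handling of younger checkpoints after a restore is omitted for brevity.

```python
class CheckpointedRenamer:
    """Renaming table that is duplicated at each unresolved exception point."""

    def __init__(self, num_arch=32, num_phys=56):
        self.mapping = list(range(num_arch))
        self.free = list(range(num_arch, num_phys))
        self.checkpoints = {}  # tag -> (copy of mapping, copy of free list)

    def rename_dest(self, dest):
        """Give architecture register `dest` a fresh physical register."""
        phys = self.free.pop(0)
        self.mapping[dest] = phys
        return phys

    def checkpoint(self, tag):
        """Duplicate the table when an unresolved exception instruction is met."""
        self.checkpoints[tag] = (list(self.mapping), list(self.free))

    def resolve_ok(self, tag):
        """No exception occurred, so the duplicate can be deleted."""
        del self.checkpoints[tag]

    def resolve_exception(self, tag):
        """An exception occurred, so restore the stored table for replay."""
        self.mapping, self.free = self.checkpoints.pop(tag)


renamer = CheckpointedRenamer()
renamer.checkpoint("branch0")          # unresolved branch encountered
renamer.rename_dest(1)                 # speculative renames past the branch
renamer.rename_dest(2)
entries_per_checkpoint = len(renamer.checkpoints["branch0"][0])
renamer.resolve_exception("branch0")   # misprediction: roll back
```

Each checkpoint in this sketch copies all 32 mapping entries, so the storage needed grows with the number of exception instructions that are in flight at once, which reflects the storage cost noted above.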
It would be desirable to be able to execute instructions in a region where errors may occur without having to increase area significantly to allow recovery from errors.