The invention relates generally to the handling of load errors in computer processors, more especially but not exclusively to the handling of load errors resulting from speculative loads.
For good performance of a processor in a data processing system it is desirable to overlap data loads with other operations, by moving the load instructions forward to earlier positions in the instruction stream. When a load instruction is moved ahead of conditional control structures in the program flow, then the address it reads from may not yet be validated by the rest of the program code and may therefore be wrong. Loading of this kind is referred to as speculative.
A speculative load is thus defined as a load operation that is issued by a processor before it is known whether the results of the load will be required in the flow of the program. Speculative loads can reduce the effects of load latency by improving instruction scheduling. Generally, speculative loads are generated by the compiler promoting loads to positions before test control instructions.
Speculative loads are often implemented as non-faulting loads. A non-faulting load is a load which always completes, even in the presence of faults. The semantics of a non-faulting load are the same as for any other load, except when faults occur. An example of a fault is an address-out-of-range error. When a fault occurs, it is ignored and the hardware and system software cooperate to make the load appear to complete normally, but in some way to return the result in a form which reflects that the loaded datum is invalid. Typically, the hardware will be configured to generate a fault indication for a failed normal load and to return a particular data value for a failed speculative load.
One known example of the handling of a failed load is the standard use of a poison bit or valid bit in the register into which the result of the speculative load is loaded. If the non-faulting load is successful then the poison bit remains unset or the valid bit is set. On the other hand, if the non-faulting load is unsuccessful then the poison bit is set or the valid bit remains unset. The software or hardware is then configured to ensure that any subsequent use of the data in the register generates a trap. With this approach, whenever there is an error in a non-faulting load, the program flow will enter into an error handling routine when an operation on the invalid data is attempted.
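The conventional poison-bit scheme described above can be illustrated with a minimal software model. This is only a sketch: the names `Register`, `PoisonTrap`, `speculative_load` and `read_operand` are hypothetical, memory is modelled as a dictionary, and a missing address stands in for any load fault.

```python
class PoisonTrap(Exception):
    """Models the trap generated when poisoned data is used (hypothetical name)."""


class Register:
    """A register with one poison bit alongside its data value."""

    def __init__(self):
        self.value = 0
        self.poison = False


def speculative_load(reg, memory, addr):
    # A non-faulting load always completes; a failed load sets the poison bit.
    if addr in memory:
        reg.value, reg.poison = memory[addr], False
    else:
        reg.value, reg.poison = 0, True


def read_operand(reg):
    # Any subsequent use of poisoned data generates a trap.
    if reg.poison:
        raise PoisonTrap("operand is the result of a failed load")
    return reg.value
```

In this model the error surfaces at the first use of the register, not at the load itself, which is the behaviour the paragraph above attributes to the known approach.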
Another example of the handling of a failed load is to be found in the Sun SPARC processors UltraSPARC I and II. Here a non-faulting load returns zero-valued data when an exception (i.e. an error) is encountered. Software code then uses a compare instruction to check the load result before use, not using the speculatively loaded data if it is zero. If the result is zero, then the memory address is read again later using a normal (non-speculative) load to which normal protection mechanisms apply. The normal load will be able to differentiate between correct zero-valued data and an exception condition. Only if the normal load shows an error will an exception be caused, i.e. a trap generated. With this approach, whenever a zero result is returned from a non-faulting, speculative load, the instruction stream is stalled until the normal load has completed.
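The zero-return scheme just described can be sketched in software as follows. The model is an assumption made for illustration only: `MEMORY`, `LoadFault`, `normal_load`, `non_faulting_load` and `checked_use` are hypothetical names, and an address absent from the dictionary represents an exception condition.

```python
# Hypothetical memory image: address -> stored value.
MEMORY = {0x10: 7, 0x18: 0}


class LoadFault(Exception):
    """Models the trap taken by a normal load from an invalid address."""


def normal_load(addr):
    # Normal protection mechanisms apply: an invalid address traps.
    if addr not in MEMORY:
        raise LoadFault(hex(addr))
    return MEMORY[addr]


def non_faulting_load(addr):
    # Never traps; returns zero-valued data when an exception is encountered.
    return MEMORY.get(addr, 0)


def checked_use(addr):
    value = non_faulting_load(addr)
    if value == 0:
        # Zero is ambiguous (genuine zero or failed load), so the address
        # is read again with a normal load before the value is used.
        value = normal_load(addr)
    return value
```

Note that a genuine zero at address `0x18` also takes the slow path through `normal_load`, which corresponds to the stall on every zero result mentioned above.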
It is an aim of the invention to provide a mechanism for handling non-faulting loads which can improve program flow in the cases that non-faulting loads return invalid results.
Particular and preferred aspects of the invention are set out in the accompanying independent and dependent claims. Features of the dependent claims may be combined with those of the independent claims as appropriate and in combinations other than those explicitly set out in the claims.
According to a first aspect of the invention there is provided a processor comprising a load/store unit, a register unit comprising a set of registers and an arithmetic logic unit, the processor being of the kind in which the load/store unit has an error flag for marking as invalid a datum loaded to the load/store unit following a load which has not reliably completed and which is thus to be treated as having failed. The processor is modified by the provision of a symbolic entity transmitter operatively arranged as an output stage of the load/store unit so that a symbolic entity is loaded into a destination one of the registers or directly into the arithmetic logic unit when the error flag is set in the load/store unit following a failed load. Moreover, the arithmetic logic unit is configured to propagate the symbolic entity, when present in an operand of an operation carried out by the arithmetic logic unit, to a result of the operation, the result with symbolic entity then being conveyed either to a destination register of the register unit or the load/store unit, depending on the processor design.
In the present document, it should be noted that the term arithmetic logic unit (ALU) is used as a generic term for both integer logic units (which in the art are usually referred to as arithmetic logic units) and floating point units (FPUs).
In the case of floating-point registers in a processor conforming to IEEE 754, the symbolic entity may be a Not-a-Number (NaN) value. The symbolic entity transmitter may then take the form of a bit pattern generator interposed between the load/store unit and the register unit, and/or between the load/store unit and the ALU. The bit pattern generator is then configured and arranged to load a bit pattern of a NaN value into the load destination register or the ALU in the case of a failed load. The NaN value may be one of the large number of defined NaN values which is not used as a NaN value by the remaining hardware. Alternatively, a NaN value used by the processor for other purposes may be used. No or minimal special hardware is required in the ALU, since the ALU will automatically propagate a NaN value through arithmetic and logical operations. Moreover, no additional internal bandwidth will be required for the communication links between the load/store unit, register unit and ALU, since the failed load information is conveyed with the normal data bits.
Thus, according to a floating-point aspect of the invention, there is provided a bit pattern generator operatively arranged in an output path from the load/store unit so that a Not-a-Number value for the invalid datum is loaded into a destination one of the floating-point registers in the register unit, or directly into the ALU.
The ALU is preferably configured to propagate the Not-a-Number value as a Quiet-Not-a-Number (QNaN) value through operations carried out in the ALU. Moreover, the QNaN value is preferably testable for in a datum by a system software command code provided for that purpose. The command code may include a conversion, conditional on the test result, of the QNaN value to a Signaling-Not-a-Number (SNaN) value, so as to cause generation of a trap on subsequent use of the datum concerned. This may be especially useful in a processor supporting multiple threads of control where much of the processor execution will involve computing alternative “ways” only one of which will ultimately lie on the execution path of the code. Alternatively, the command code may include a conditional branch, conditional on the test result, for immediately invoking an error handling routine for dealing with the invalid datum.
In the case of integer registers, the symbolic entity transmitter may take the form of hardware interposed between the output-side of the load/store unit and the input-side of the register unit and/or ALU so that, in the case of a failed load, the error flag set in the load/store unit is conveyed to set or unset a poison or valid bit, respectively, in the destination register or operand of the ALU. Moreover, the ALU and register unit, or the ALU, register unit and load/store unit, are interconnected so as to transmit and receive the poison or valid bit from each other during processor operation, and the ALU is internally configured to propagate the poison or valid bit through its operations.
Thus, according to an integer embodiment of the invention, there is provided a reduced instruction set computer (RISC) processor, the register unit of which includes a set of integer registers, the integer registers having one or more poison bits responsive to loads from the load/store unit of invalid data, or, alternatively, one or more valid bits responsive to loads from the load/store unit or ALU of valid data, the poison or valid bits thus serving to indicate the integrity of data held in the respective registers, wherein the ALU is configured to propagate poison or valid bits present in operands of the operations to the results of the operations and to return the results together with the propagated poison or valid bits to the registers or ALU. It is noted that poison or valid bits may also be used as the symbolic entities for floating point data instead of Not-a-Number values.
Similar functionality may be provided in another integer embodiment of the invention in a complex instruction set computer (CISC) processor by configuring the ALU to receive operands including one or more poison bits responsive to loads from the load/store unit or register unit of invalid data, or, alternatively, one or more valid bits responsive to loads from the load/store unit or register unit of valid data.
When performing an operation, the ALU of a processor according to the above-described integer aspect of the invention will return a poisoned or non-valid result if one or more of the operands of the operation are poisoned or non-valid respectively. On the other hand, the ALU will, in all cases, or all but a number of special cases, return a valid or non-poisoned result if the or each operand of the operation is valid or non-poisoned respectively. The special cases where a poisoned or non-valid result will be returned even when all operands are non-poisoned or valid will be those in which the result of an operation can be predicted as being invalid merely by virtue of, first, the operation type and, second, either the value of one operand or the combination of values of two or more operands.
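The propagation rule for poison bits may be sketched as follows. This is an illustrative model only: `Tagged`, `alu_add` and `alu_div` are hypothetical names, a single poison bit is assumed, and division by zero is taken as an assumed example of a special case in which the result can be predicted as invalid from the operation type and the value of one operand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagged:
    """An integer operand or result together with its poison bit."""
    value: int
    poison: bool = False


def alu_add(a, b):
    # The result is poisoned if one or more operands are poisoned.
    return Tagged(a.value + b.value, a.poison or b.poison)


def alu_div(a, b):
    if a.poison or b.poison:
        return Tagged(0, True)
    if b.value == 0:
        # Assumed special case: both operands are non-poisoned, yet the
        # result is known to be invalid, so it is returned poisoned.
        return Tagged(0, True)
    # Floor division is used here purely for simplicity of the model.
    return Tagged(a.value // b.value, False)
```

For example, `alu_add(Tagged(1), Tagged(0, True))` yields a poisoned result, and that result poisons any later operation in which it appears as an operand.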
A software command code may be provided for testing a datum for the presence of poison or valid bits. Moreover, a branch conditional on the result of the testing may form part of the software command code execution, whereby an invalid datum can be handled by branching to an error handling routine.
According to a further aspect of the invention there is provided a method of operation of a processor comprising an instruction unit, a load/store unit, a register unit comprising a set of registers, and an ALU, the method comprising the steps of:
a) the instruction unit issuing a load request for a datum to an external storage element;
b) the load being carried out, but returning an invalid datum to the load/store unit;
c) the load/store unit setting an error flag for the invalid datum;
d) the load completing by loading a symbolic entity as at least a part of the datum into one of the register unit and the arithmetic logic unit;
e) the arithmetic logic unit carrying out an operation having the datum as an operand such that the symbolic entity associated with the invalid datum is conveyed to a result of the operation; and
f) outputting from the arithmetic logic unit the result with symbolic entity into one of the register unit and the load/store unit.
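Steps a) to f) can be traced through a small software model. The sketch below makes several assumptions that are not part of the method itself: the class `LoadStoreUnit`, the function `run`, the register names `r1` and `r2`, the use of a poison bit as the symbolic entity, and the choice of an increment as the ALU operation are all hypothetical.

```python
class LoadStoreUnit:
    """Hypothetical load/store unit with an error flag."""

    def __init__(self, memory):
        self.memory = memory
        self.error_flag = False

    def load(self, addr):
        # Steps b) and c): the load is carried out and the error flag
        # is set if the datum returned is invalid.
        self.error_flag = addr not in self.memory
        return self.memory.get(addr, 0)


def run(memory, addr):
    lsu = LoadStoreUnit(memory)
    registers = {}
    # Step a): a load request is issued for the datum.
    datum = lsu.load(addr)
    # Step d): the load completes, with the symbolic entity (here a
    # poison bit) stored as part of the datum in the destination register.
    registers["r1"] = (datum, lsu.error_flag)
    # Step e): the ALU operates on the datum and conveys the symbolic
    # entity from the operand to the result.
    value, poison = registers["r1"]
    result = (value + 1, poison)
    # Step f): the result with symbolic entity is written back.
    registers["r2"] = result
    return registers["r2"]
```

With a memory image of `{4: 41}`, `run` returns a non-poisoned result for address 4 and a poisoned result for any other address, without a trap being taken at any step.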
The invention, especially in some of its embodiments for floating-point operations, may be better understood by analogy with the concept of Not-a-Number (NaN) as defined by IEEE 754 (1985) which is a well-known standard for floating-point arithmetic implemented in many recently designed processors.
NaN is described in IEEE 754 and in standard literature on the programming of any processor which implements NaN. A brief summary is however now given. NaN is a symbolic entity encoded in floating-point format. The IEEE floating-point single and double formats include a sign bit, a number of exponent bits, in the form of a biased exponent, and a number of mantissa bits conveying the fraction. The sign and mantissa collectively form the significand. Reserved values of the exponents are used to encode NaNs. The reserved values may be any values apart from those two reserved for +/− infinity. If the biased exponent is all ones (in its binary representation) and the fraction is not zero then the significand conveys a NaN.
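The encoding rule stated above (biased exponent all ones, fraction not zero) can be checked directly on the bit pattern of a double-format value, which has 1 sign bit, 11 exponent bits and 52 fraction bits. The helper names `double_to_bits` and `is_nan_bits` are hypothetical and serve only as a sketch.

```python
import struct


def double_to_bits(x):
    # Reinterpret an IEEE 754 double as a 64-bit unsigned integer.
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def is_nan_bits(bits):
    exponent = (bits >> 52) & 0x7FF        # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)      # 52-bit fraction
    # NaN: exponent all ones and fraction non-zero.
    # (Exponent all ones with a zero fraction encodes +/- infinity.)
    return exponent == 0x7FF and fraction != 0
```

The test distinguishes a NaN from an infinity only by the fraction field, which is why a large number of distinct NaN bit patterns is available.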
The NaN standard applies to arithmetic operations such as add, subtract, multiply, divide and square root, as well as to various other arithmetic and logical operations, such as conversions between number formats, remaindering and rounding, and optionally copying without change of format. If one or more signaling NaN (SNaN) values are input to an operation then an exception is signaled. If one or more quiet NaN (QNaN) values are input to an operation, and no SNaNs, then the operation signals no exception and delivers as its result a QNaN. With each exception there is typically an associated trap under software control. SNaNs thus signal the invalid operation exception whenever they appear as operands and will result in the setting of a status flag, taking a trap or both. On the other hand, QNaNs propagate through almost every arithmetic operation without signaling exceptions.
Now, by analogy with NaN, the invention may be thought of as the provision of a symbolic entity similar to a propagating QNaN which propagates when operations are carried out on an invalid datum resulting from a failed speculative load.
In contrast, with fore-knowledge of the present invention, the prior art use of a poison bit or valid bit in conjunction with trapping, as described above, may be thought of as analogous to the provision of a non-propagating SNaN. A QNaN-type functionality cannot be provided with the conventional design, since there is no means for generating a propagatable QNaN-like bit pattern in the load/store unit when the flag for a failed speculative load is set. Moreover, for non-floating-point operations, a conventional ALU has no means for propagating a QNaN-like entity through its operations, even if one could be generated.
An advantage achievable with some embodiments of the invention is that it becomes possible to delay testing the results of a non-faulting load, since the QNaN-like symbolic entity will propagate with the results of operations on an invalid datum, thereby keeping track of the integrity of the data. This has the benefit that program flow does not have to be slowed or otherwise disrupted by testing non-faulting load results as the loads occur, but can be deferred until some other time, for example when processor time is freely available, under the control of the program. The testing can be performed when convenient and, if the test reveals a QNaN-like entity, i.e. data corrupted by an earlier failed speculative load, then this can be dealt with, for example immediately by branching to a servicing routine, or by converting the QNaN-like entity to a SNaN-like entity so as to cause trapping on subsequent use.
Although the invention was conceived with the handling of failed speculative loads specifically in mind, it will be appreciated that the processor designs herein described are equally well suited to handling failed loads of any type. Delay of testing the results of loads of any kind may be advantageous for the same reason as described above in relation to non-faulting loads. One example of the utility of the propagatable, error indicating, symbolic entity for normal loads would be to provide hardware protection against programming errors of the kind which may result in illegal loads from external storage, e.g. a load from an address that does not exist. The invention may thus be embodied in processors that do not support non-faulting loads, and also in processors that do support non-faulting loads for handling both failed normal loads and failed speculative loads.