1. Field of the Invention
The present invention relates to a data processing apparatus and method for processing floating point instructions.
2. Description of the Prior Art
It is common for data processing apparatus to be required to perform various floating-point computations on data. It has been found that general purpose processors are not well suited to the performance of floating-point computations, and hence this has led to the development of specialised floating-point units (FPUs) to handle such computations.
If the operation to be performed by such an FPU can be broken down into a number of separate steps, it is common to provide a number of pipelined processing stages within the FPU. By such an approach, it is possible for a number of instructions to be executed simultaneously within the FPU. As an instruction is passed into the pipeline, the FPU will typically receive the operand data required for the instruction from a number of data registers, and on completion of the instruction, will store the data result to a predetermined destination register.
In some situations, an instruction may have one or more operands that are specified by a result of one or more previous instructions already in the pipelined processing stages of the FPU. Such an instruction will be referred to hereafter as a dependent instruction. In such cases, it is clear that the dependent instruction cannot be executed until the result of each instruction upon which it depends is available, and this can impact on the processing speed of the FPU.
To alleviate this impact, it is known to direct (or “forward”) the result to the first pipelined stage at the same time as it is written to the destination register. This avoids the need to wait until the result has been written to the destination register and to then read the data from that register again before being able to execute the dependent instruction.
More recently, an approach has been taken in IBM's S/390 computer family where an unrounded result is forwarded back to the first stage of the pipeline along with a signal indicating whether rounding is required. This enables a further time saving, since the result data is forwarded before the final result is actually determined (i.e. before rounding has taken place), and hence less delay is incurred before execution of the dependent instruction can begin.
When executing instructions, it is possible that an exception condition may be detected by the FPU, and in such a situation it is often necessary to invoke an exception handling routine to deal with the exception, in order to ensure that an appropriate data result is placed in the destination register. Often, the presence of an exception will only become apparent when the final result is computed, and so much of the exception determination logic is typically located in the final pipelined stages.
However, in such situations, the detection of an exception condition may occur later than required to stop the issue of subsequent instructions into the pipeline. Such instructions must be either completed or restarted. One known approach is to provide a significant area of memory, referred to as a state frame, in which all intermediate states in the pipeline can be stored as and when required. Thus, if an exception condition is detected, and hence the exception handling routine needs to be invoked to recover from the exception, then the state of all stages of the pipeline can be stored in the state frame, such that the pipeline can be restored to its previous state once the exception handling routine has completed the recovery process. Such an approach was employed in the Motorola MC68040 chip. This approach suffers from the drawback that an instruction, when exceptional, blocks completion of all subsequent instructions currently executing in the pipeline. Further, this technique requires significant time to process a store or load of a state frame to or from memory.
An alternative approach employed in prior art processing units involves the use of a history, or reorder, buffer. The reorder buffer typically stores the entire instruction stream of the data processing apparatus, and has logic arranged to send the instructions to the appropriate processing unit(s) of the data processing apparatus, and to subsequently place in the reorder buffer in association with the instruction the data result determined upon execution of that instruction. As each instruction reaches the bottom of the reorder buffer, it is “retired”, for example by storing the data result in the appropriate destination register. If, however, an exception condition is associated with the instruction being retired, then the exception handling routine is invoked to recover from the exception. Subsequent to the recovery process, the instruction stream is restarted from the instruction immediately following the retired instruction.
It will be appreciated by those skilled in the art that both of the above known approaches increase the complexity and the hardware/memory requirements of the FPU, and hence increase the cost of the FPU.
Accordingly, an alternative approach to handling exceptions involves detecting exceptions pessimistically during an early pipelined stage, i.e. determining whether an exception may (as opposed to will) occur, and referring all such detected exceptions to the exception handling routine. Whilst this will involve detecting more exceptions than actually would occur, and incurring the processing overhead involved in bouncing such exceptions to the exception handling routine, this often in practice does not have a significant impact on processing speed in applications where exceptions will only occur very rarely. Further, the use of such pessimistic determination of exceptions dramatically reduces the amount of state that needs storing prior to handling the exception, since the instruction giving rise to the exception will not proceed to the next stage of the pipeline, and accordingly no further instructions will be issued. This in turn enables reductions in the size and cost of the FPU.
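By way of illustration only, pessimistic determination for a multiplication might be sketched as follows. The sketch assumes single-precision unbiased exponent limits and a simple rule based only on the operand exponents; the names E_MAX, E_MIN and may_raise_exception are hypothetical and do not come from any particular FPU.

```python
# Hypothetical limits: unbiased exponent range of normal single-precision values.
E_MAX = 127
E_MIN = -126

def may_raise_exception(exp_a: int, exp_b: int) -> bool:
    """Pessimistically decide whether a multiply MAY overflow or underflow.

    Only the operand exponents are inspected. The product of two
    significands in [1, 2) lies in [1, 4), so the true result exponent
    is either initial_exp or initial_exp + 1.
    """
    initial_exp = exp_a + exp_b
    # Assume the worst case (exponent one higher) for overflow ...
    may_overflow = initial_exp + 1 > E_MAX
    # ... and the worst case (no increase in exponent) for underflow.
    may_underflow = initial_exp < E_MIN
    return may_overflow or may_underflow
```

Under these assumptions, operand exponents of 127 and 0 are flagged even though a product such as 1.0 x 2^127 multiplied by 1.0 would not actually overflow, which is the over-detection described above.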
Assuming an FPU is to adopt the above mentioned pessimistic determination of exceptions, it would seem necessary for the exception determination logic to have the correct operands available when determining the presence of the exception. However, this would appear to preclude the potential speed benefits available by adopting the earlier described technique of forwarding an unfinalised result to the beginning of the pipeline to enable a dependent instruction to begin to be executed.
It is an object of the present invention to provide a data processing apparatus and method which enables efficient processing of dependent instructions when employing pessimistic exception determination techniques.
Viewed from a first aspect, the present invention provides a data processing apparatus for processing floating point instructions, comprising: an execution unit comprising a plurality of pipelined stages, and being responsive to a floating point instruction to apply a floating point operation to a number of operands to produce a final result, result data being generated during a predetermined pipelined stage with further processing then being performed on the result data in one or more subsequent pipelined stages to generate the final result; exception determination logic for determining based on the operands whether an exception may occur during application of the floating point operation to the operands, and to prevent the execution unit applying the floating point operation to those operands if it is determined that an exception may occur; a forwarding path for forwarding the result data generated in the predetermined pipelined stage during processing of a first floating point instruction to a previous pipelined stage for use as an operand of a second floating point instruction; control logic for generating predetermined control data related to the forwarded result data, the exception determination logic being arranged to use at least some of the predetermined control data to compensate for differences between the forwarded result data and the final result relevant when determining whether an exception may occur when processing the second floating point instruction.
In accordance with the present invention, result data generated in a predetermined pipelined stage during processing of a first floating point instruction can be forwarded to a previous pipelined stage for use as an operand of a second floating point instruction. It should be noted that the result data that is forwarded will not necessarily be the same as the final result, since further processing of the result data is then performed in one or more subsequent pipeline stages in order to generate the final result. In accordance with the invention, control logic is provided for generating predetermined control data relating to the forwarded result data, and exception determination logic used to determine whether an exception may occur is arranged to use at least some of the predetermined control data to compensate for differences between the forwarded result data and the final result relevant when determining whether an exception may occur during processing of the second floating point instruction.
By this approach, it is possible to employ pessimistic determination of exceptions within the data processing apparatus, whilst benefiting from the potential speed benefits available by forwarding an unfinalised result to a previous pipelined stage for use as an operand of a dependent instruction.
In preferred embodiments, the exception determination logic is arranged to determine from an initial exponent value for a final result of the second floating point instruction whether an exception may occur (for example with reference to an exception detection table), and to use at least some of the predetermined control data to determine whether the conclusion reached dependent on the initial exponent value needs altering to compensate for relevant differences between the forwarded result data and the final result.
Typically in floating point operations, values are expressed by a mantissa and an exponent, and accordingly the result data will include a mantissa result value. In preferred embodiments, the control logic is arranged to receive an indication of the mantissa result value, and the predetermined control data generated by the control logic includes a mantissa signal indicating whether the mantissa is all ones.
Further, in preferred embodiments, the result data includes an exponent result value, the control logic is arranged to receive an indication of the exponent result value, and said predetermined control data generated by the control logic includes a number of threshold signals indicating whether the exponent result value is at a threshold value.
It will be appreciated that a number of threshold values may be specified. Preferably, a first threshold signal indicates whether the exponent result value is the largest value which is below the threshold for underflow detection. It should be noted that there may actually be a plurality of first threshold signals, one for each of a number of different operations and/or data types, since the largest value which is below the threshold for underflow detection may be dependent on the particular operation. Hence, for example, there may be a first threshold signal pertinent to an unlike-signed addition (USA) operation, and a further first threshold signal pertinent to a precision conversion operation between different data types.
Further, in preferred embodiments, a second threshold signal is provided indicating whether the exponent result value is the largest allowable value. Again, a number of second threshold signals may be provided, since the largest allowable value may be dependent on the operation being performed and/or the data type. For example, a second threshold signal may be provided pertinent to a like-signed addition (LSA) operation, and a further second threshold signal may be provided pertinent to a precision conversion operation between different data types.
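The generation of the mantissa signal and the threshold signals described above can be sketched as below. This is a minimal sketch only: the field width and the two threshold constants are assumed values chosen for a single-precision format, a single first and second threshold signal is shown rather than one per operation or data type, and all names are hypothetical.

```python
from dataclasses import dataclass

MANT_BITS = 23            # assumed width of the mantissa (fraction) field
UNDERFLOW_THRESHOLD = 1   # assumed: biased exponents below this count as underflow
EXP_MAX = 254             # assumed largest allowable biased exponent

@dataclass(frozen=True)
class ControlData:
    mantissa_all_ones: bool   # mantissa signal
    first_threshold: bool     # exponent is the largest value below the underflow threshold
    second_threshold: bool    # exponent is the largest allowable value

def generate_control_data(exp_result: int, mant_result: int) -> ControlData:
    """Derive control data from the forwarded exponent and mantissa result values."""
    all_ones = (1 << MANT_BITS) - 1
    return ControlData(
        mantissa_all_ones=(mant_result == all_ones),
        first_threshold=(exp_result == UNDERFLOW_THRESHOLD - 1),
        second_threshold=(exp_result == EXP_MAX),
    )
```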
In preferred embodiments, the predetermined control data includes a rounding signal indicating whether the result data was incremented during rounding to produce the final result, the exception determination logic is arranged to determine from an initial exponent value for a final result of the second floating point instruction whether an exception may occur, and the exception determination logic is arranged to be responsive to the mantissa signal, the first threshold signal and the rounding signal, such that if the initial exponent value for the final result of the second floating point instruction is based on the forwarded result data, the exception determination logic is arranged not to generate an underflow exception if the first threshold signal indicates that the exponent result value is the largest value which is below the threshold for underflow detection, the mantissa signal indicates that the mantissa result value is all ones, and the rounding signal indicates that the result data was rounded to produce the final result. Hence, by this approach, it can be seen that the exception determination logic can effectively compensate for the difference in the exponent result value forwarded over the forwarding path and the actual exponent of the final result, to ensure that an underflow exception is not generated in situations where the final result would not actually trigger an underflow exception.
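The underflow compensation just described reduces to a small piece of combinational logic, which might be sketched as follows. The function and parameter names are hypothetical, and the sketch assumes that a preliminary underflow indication has already been derived from the initial exponent value.

```python
def underflow_exception(initial_underflow: bool,
                        based_on_forwarded: bool,
                        first_threshold: bool,
                        mantissa_all_ones: bool,
                        rounded_up: bool) -> bool:
    """Return True if an underflow exception should be generated.

    initial_underflow is the conclusion drawn from the initial exponent
    value alone. It is overridden when the operand came over the
    forwarding path and the rounding increment would have carried the
    exponent up out of the underflow range.
    """
    compensate = (based_on_forwarded and first_threshold
                  and mantissa_all_ones and rounded_up)
    return initial_underflow and not compensate
```

For instance, `underflow_exception(True, True, True, True, True)` yields no exception, whereas the same inputs with `rounded_up=False` leave the preliminary underflow indication in place.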
Further, in preferred embodiments, the predetermined control data includes a rounding signal indicating whether the result data was incremented during rounding to produce the final result, the exception determination logic is arranged to determine from an initial exponent value for a final result of the second floating point instruction whether an exception may occur, and the exception determination logic is arranged to be responsive to the mantissa signal, the second threshold signal and the rounding signal, such that if the initial exponent value for the final result of the second floating point instruction is based on the forwarded result data, the exception determination logic is arranged to generate an overflow exception if the second threshold signal indicates that the exponent result value is the largest allowable value, the mantissa signal indicates that the mantissa result value is all ones, and the rounding signal indicates that the result data was rounded to produce the final result. Hence, by this approach, it can be seen that the exception determination logic can again compensate for differences between the exponent result value forwarded over the forwarding path, and the actual exponent value of the final result, to ensure that an overflow exception is triggered in situations where an overflow exception would have been triggered if the exception determination logic had received the final result.
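The reason an all-ones mantissa matters is that a rounding increment then carries into the exponent. The following sketch illustrates this carry and the resulting overflow decision, assuming a 23-bit mantissa field and a largest allowable biased exponent of 254; both constants and all names are assumptions made for illustration.

```python
MANT_BITS = 23   # assumed width of the mantissa (fraction) field
EXP_MAX = 254    # assumed largest allowable biased exponent

def apply_rounding_increment(exp: int, mant: int) -> tuple[int, int]:
    """Apply a rounding increment to an unrounded (exponent, mantissa) pair."""
    mant += 1
    if mant >> MANT_BITS:   # carry out of the mantissa field
        mant = 0
        exp += 1            # the final exponent is one larger than the forwarded one
    return exp, mant

def overflow_exception(initial_overflow: bool,
                       based_on_forwarded: bool,
                       second_threshold: bool,
                       mantissa_all_ones: bool,
                       rounded_up: bool) -> bool:
    """Return True if an overflow exception should be generated."""
    compensate = (based_on_forwarded and second_threshold
                  and mantissa_all_ones and rounded_up)
    return initial_overflow or compensate
```

Here a forwarded pair of exponent 254 and mantissa 0x7FFFFF becomes exponent 255 and mantissa 0 once the increment is applied, so the forwarded exponent alone would understate the final result.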
In preferred embodiments, the predetermined control data includes a flush signal indicating whether the final result was flushed to a zero value due to the result data being below a predetermined threshold, and if the initial exponent value for the final result of the second floating point instruction is based on the forwarded result data, the exception determination logic is arranged to disregard any underflow exception determined from the initial exponent value for the final result. Again, by this approach, the exception determination logic will compensate for differences between the forwarded result data and the final result in situations where the final result was flushed to zero. In preferred embodiments, flushing to zero will result in the data being flushed to a positive zero.
In addition to the predetermined control data being used by the exception determination logic, in preferred embodiments at least some of the predetermined control data is used to generate signals used within the execution unit when processing the second floating point instruction to alter the forwarded result data to compensate for differences between the forwarded result data and the final result.
For example, the predetermined control data preferably includes a rounding signal indicating whether the result data was incremented during rounding to produce the final result.
In preferred embodiments, the execution unit includes an incrementer to increment an operand input thereto, and a multiplexer associated therewith, the multiplexer being arranged to select the output of the incrementer if that operand is based on the forwarded result data and the rounding signal indicates that incrementing during rounding occurred.
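The incrementer and multiplexer arrangement can be modelled in a few lines. This is an illustrative sketch only, treating the operand as an integer significand; the function and parameter names are hypothetical.

```python
def select_operand(operand: int, based_on_forwarded: bool, rounded_up: bool) -> int:
    """Model an incrementer feeding a two-input multiplexer.

    The incremented value is always computed, as it would be in hardware;
    the multiplexer selects it only for a forwarded operand whose final
    result was incremented during rounding.
    """
    incremented = operand + 1
    return incremented if (based_on_forwarded and rounded_up) else operand
```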
Further, in preferred embodiments, it is possible that the forwarded result data may be used as the multiplier for the second floating point instruction, in which case if the rounding signal indicates that incrementing during rounding occurred, a multiplication unit in the execution unit is arranged to incorporate another instance of the multiplicand in the computation of the multiplication result. Further, it is possible that the forwarded result data may be used as a multiplicand for the second floating point instruction, in which case if the rounding signal indicates that incrementing during rounding occurred, a multiplication unit in the execution unit is arranged to employ a modified recoding table used within the multiplication unit during the computation of a multiplication result.
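The multiplier case relies on the identity (m + 1) x c = m x c + c, where m is the forwarded (unrounded) multiplier and c is the multiplicand. A minimal sketch on integer significands follows; the names are hypothetical, and the modified recoding table used for the multiplicand case is not modelled.

```python
def multiply_with_forwarded_multiplier(multiplicand: int,
                                       forwarded_multiplier: int,
                                       rounded_up: bool) -> int:
    """Multiply using an unrounded forwarded multiplier.

    If the final result was incremented during rounding, one further
    instance of the multiplicand is added, which gives the same product
    as multiplying by the rounded multiplier.
    """
    product = multiplicand * forwarded_multiplier
    if rounded_up:
        product += multiplicand
    return product
```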
As another example, the predetermined control data preferably includes a flush signal indicating whether the final result was flushed to a zero value due to the result data being below a predetermined threshold.
In preferred embodiments, the execution unit includes a multiplexer having a zero value as one input and one or more other inputs based on one or more operands, the multiplexer being arranged to select the zero value as its output if the one or more operands are based on the forwarded result data and the flush signal indicates that the final result was flushed to zero.
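A corresponding sketch of the flush multiplexer is given below, again with hypothetical names and with the operand treated as a plain integer value.

```python
def select_with_flush(operand: int, based_on_forwarded: bool, flushed_to_zero: bool) -> int:
    """Model a multiplexer with a hard-wired zero input.

    Zero is selected when the operand came over the forwarding path and
    the final result of the earlier instruction was flushed to zero.
    """
    return 0 if (based_on_forwarded and flushed_to_zero) else operand
```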
Viewed from a second aspect, the present invention provides a method of operating a data processing apparatus to process floating point instructions, the data processing apparatus comprising an execution unit having a plurality of pipelined stages, and being responsive to a floating point instruction to apply a floating point operation to a number of operands to produce a final result, result data being generated during a predetermined pipelined stage with further processing then being performed on the result data in one or more subsequent pipelined stages to generate the final result, the method comprising the steps of: (i) determining based on the operands whether an exception may occur during application of the floating point operation to the operands; (ii) preventing the execution unit applying the floating point operation to those operands if it is determined that an exception may occur; (iii) forwarding the result data generated in the predetermined pipelined stage during processing of a first floating point instruction to a previous pipelined stage for use as an operand of a second floating point instruction; (iv) generating predetermined control data related to the forwarded result data, at least some of the predetermined control data being used in said step (i) to compensate for differences between the forwarded result data and the final result relevant when determining whether an exception may occur when processing the second floating point instruction.