1. Field of the Invention
The present invention relates to a floating point operation method and apparatus for performing an addition/subtraction operation in parallel with rounding-off, supporting all four rounding-off methods defined by the IEEE standard.
2. Description of the Prior Art
Generally, a floating point arithmetic unit (FPU) is indispensable in graphic accelerators, digital signal processors (DSPs) and high-performance computer systems. As chip integration capability has increased due to advances in semiconductor technology, it has become possible to put the FPU on a single chip together with the CPU, so that the FPU has outgrown its original supplementary function and is now a principal element of the main arithmetic unit. When the FPU is built on a single chip with the CPU, only some primary arithmetic units such as an adder, a subtractor and a multiplier are built on the chip due to the limited space, and additional software is used for further operations. Therefore, the floating point addition/subtraction operation greatly influences the overall floating point performance.
Among the four steps of exponent alignment, addition/subtraction operation, normalization and rounding-off, the rounding-off can be executed by four methods according to IEEE Standard No. 754-1985: Round to Nearest, Round to Zero, Round to Positive Infinity and Round to Negative Infinity. See "Standard for Binary Floating-Point Arithmetic," ANSI/IEEE Std. 754-1985, The Institute of Electrical and Electronics Engineers, Inc., New York, N.Y., 1985, which is hereby incorporated herein by reference thereto.
A floating point number is expressed in one of two formats: 32-bit single precision, shown in FIG. 1, and 64-bit double precision, shown in FIG. 2.
The single precision type consists of a sign bit s of 1 bit, an exponent e of 8 bits and a fraction f of 23 bits. The double precision type consists of a sign bit s of 1 bit, an exponent e of 11 bits and a fraction f of 52 bits.
A floating point number A according to the IEEE standard is expressed as follows:

$$A = (-1)^{s} \times 1.f \times 2^{e - \mathrm{bias}}$$

where s denotes the sign bit for the fraction f, f denotes the fraction expressed as an absolute value, and e denotes the biased exponent.
Floating point operations use a biased exponent in order to simplify exponent processing.
The biased exponent is the sum of the exponent and a constant (bias) chosen to make the biased exponent's range nonnegative (hereinafter, the term "exponent" refers to the biased exponent).
The bias in single precision is 127. The bias in double precision is 1023. A normalized fraction is one whose MSB (Most Significant Bit) is 1; since this bit is always 1, it can be omitted from the stored floating point number.
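The single precision layout and the bias described above can be illustrated with a minimal Python sketch. The function names `decode_single` and `value_of` are hypothetical and chosen only for illustration, and the sketch assumes a normalized number (no zeros, denormals, infinities or NaNs).

```python
import struct

def decode_single(x: float):
    """Split a value stored as IEEE 754 single precision into (s, e, f)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    s = bits >> 31             # 1-bit sign
    e = (bits >> 23) & 0xFF    # 8-bit biased exponent
    f = bits & 0x7FFFFF        # 23-bit fraction; the leading 1 is not stored
    return s, e, f

def value_of(s: int, e: int, f: int, bias: int = 127) -> float:
    """Evaluate (-1)^s * 1.f * 2^(e - bias) for a normalized number."""
    return (-1) ** s * (1 + f / 2**23) * 2.0 ** (e - bias)
```

For example, -6.5 equals -1.625 x 2^2, so its biased exponent is 2 + 127 = 129 and its stored fraction encodes 0.625.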
For proper rounding-off under the IEEE standard, information about the fraction bits lost during the exponent alignment must be preserved.
For the sake of rounding-off, three types of bits are defined: the Guard bit, the Round bit and the Sticky bit.
The Guard bit G is the MSB of the information which will be lost, and has a weight just below that of the LSB. The Round bit R is the bit of lost information next to the Guard bit G. The Sticky bit Sy is the logical OR of all the lost information bits excluding the Guard bit G and the Round bit R.
The Guard bit G is for determining whether the lost information is less than, greater than or equal to a half. If the MSB of the result of a fraction subtraction is zero, a one-bit shift to the left is executed. At this time, the Guard bit G shifts to the left and becomes the LSB.
Accordingly, to perform rounding-off after the one-bit shift to the left, a further bit is needed to take over the role of the Guard bit G (i.e., the Round bit R is needed).
Thus, when a subtraction results in the MSB being zero, a one-bit shift is executed to the left; at this time, the Guard bit G becomes the LSB and the Round bit R becomes the Guard bit G, and the rounding-off is executed with these bits. Rounding up by adding 1 at the LSB is referred to as increment, and discarding the lost bits is referred to as truncation.
When the rounding-off cannot be decided with only the Guard bit G and the Round bit R, information whose weight is less than that of the Round bit R (i.e., to its right) is needed (i.e., the Sticky bit Sy is needed). For example, when the Guard bit G is 1 and the Round bit R is zero, increment and truncation appear to have an equal error unless the remaining lost bits are known.
When the Guard bit G is 1, the Round bit R is 0 and the Sticky bit Sy is 1, increment is selected rather than truncation because at least one bit is "1" among the remaining truncated bits of the fraction f, so that the lost information exceeds a half.
FIG. 3 shows the Guard bit G, the Round bit R and the Sticky bit Sy, as described above, for a normalized single precision operand of the IEEE standard that is shifted to the right by 8 bits.
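The way the three bits arise from a right shift can be sketched as follows. This is only an illustrative Python model; the name `shift_right_grs` is hypothetical, and the fraction is assumed to be held as a non-negative integer.

```python
def shift_right_grs(frac: int, shift: int):
    """Shift frac right by `shift` bits and return (kept, G, R, Sy)."""
    if shift == 0:
        return frac, 0, 0, 0
    kept = frac >> shift
    lost = frac & ((1 << shift) - 1)       # the bits shifted out
    g = (lost >> (shift - 1)) & 1          # MSB of the lost bits
    r = (lost >> (shift - 2)) & 1 if shift >= 2 else 0
    # Sticky: OR of every lost bit below the Round bit.
    sticky_mask = (1 << (shift - 2)) - 1 if shift >= 2 else 0
    sy = 1 if lost & sticky_mask else 0
    return kept, g, r, sy
```

With an 8-bit shift of the value 0b10100001, all eight bits are lost: the top lost bit gives G = 1, the next gives R = 0, and the remaining six bits contain a 1, so Sy = 1.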
Round to Nearest is the rounding-off method that minimizes the error: the result is rounded to the nearest representable value, and when the two nearest values are equally near, the one whose LSB is zero is selected. Among the four methods, it is generally the preferred one.
Round to Zero is the method of rounding off toward zero; truncation is performed regardless of the values of the Guard bit G, the Round bit R and the Sticky bit Sy.
Round to Positive Infinity is the method of rounding off toward positive infinity; an increment is made when (1) at least one bit is 1 among the Guard bit G, the Round bit R and the Sticky bit Sy and (2) the number to be rounded off is positive.
Round to Negative Infinity is the method of rounding off toward negative infinity; an increment is made when (1) at least one bit is 1 among the Guard bit G, the Round bit R and the Sticky bit Sy and (2) the number to be rounded off is negative.
As described above, after the exponent alignment, the addition/subtraction of the fraction, and the normalization in the floating point addition/subtraction operation, the rounding-off results of the four rounding-off methods for the LSB, G, R and Sy are as shown in the following tables.
______________________________________
1. Round to Nearest
LSB   Guard bit   Round bit   Sticky bit   Round-off Result
______________________________________
0     0           0           0            Truncation
0     0           0           1            Truncation
0     0           1           0            Truncation
0     0           1           1            Truncation
0     1           0           0            Truncation
0     1           0           1            Increment
0     1           1           0            Increment
0     1           1           1            Increment
1     0           0           0            Truncation
1     0           0           1            Truncation
1     0           1           0            Truncation
1     0           1           1            Truncation
1     1           0           0            Increment
1     1           0           1            Increment
1     1           1           0            Increment
1     1           1           1            Increment
______________________________________
______________________________________
2. Round to Zero
Guard bit   Round bit   Sticky bit   Round-off Result
______________________________________
0           0           0            Truncation
0           0           1            Truncation
0           1           0            Truncation
0           1           1            Truncation
1           0           0            Truncation
1           0           1            Truncation
1           1           0            Truncation
1           1           1            Truncation
______________________________________
______________________________________
3. Round to Positive Infinity
Sign   Guard bit   Round bit   Sticky bit   Round-off Result
______________________________________
0      0           0           0            Truncation
0      0           0           1            Increment
0      0           1           0            Increment
0      0           1           1            Increment
0      1           0           0            Increment
0      1           0           1            Increment
0      1           1           0            Increment
0      1           1           1            Increment
1      0           0           0            Truncation
1      0           0           1            Truncation
1      0           1           0            Truncation
1      0           1           1            Truncation
1      1           0           0            Truncation
1      1           0           1            Truncation
1      1           1           0            Truncation
1      1           1           1            Truncation
______________________________________
______________________________________
4. Round to Negative Infinity
Sign   Guard bit   Round bit   Sticky bit   Round-off Result
______________________________________
0      0           0           0            Truncation
0      0           0           1            Truncation
0      0           1           0            Truncation
0      0           1           1            Truncation
0      1           0           0            Truncation
0      1           0           1            Truncation
0      1           1           0            Truncation
0      1           1           1            Truncation
1      0           0           0            Truncation
1      0           0           1            Increment
1      0           1           0            Increment
1      0           1           1            Increment
1      1           0           0            Increment
1      1           0           1            Increment
1      1           1           0            Increment
1      1           1           1            Increment
______________________________________
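The four tables reduce to a small decision function. The following Python sketch is one way to express it; the function name `needs_increment` and the mode strings are hypothetical labels introduced here, not terms from the standard.

```python
def needs_increment(mode: str, sign: int, lsb: int, g: int, r: int, sy: int) -> bool:
    """Return True for "Increment" and False for "Truncation"."""
    inexact = bool(g | r | sy)          # at least one lost bit is 1
    if mode == "nearest":
        # More than half lost, or exactly half lost and the LSB is odd.
        return bool(g and (r or sy or lsb))
    if mode == "zero":
        return False                    # always truncate
    if mode == "pos_inf":
        return inexact and sign == 0    # only positive numbers move up
    if mode == "neg_inf":
        return inexact and sign == 1    # only negative numbers move down
    raise ValueError(f"unknown rounding mode: {mode}")
```

In the Round to Nearest case the sign is not consulted, and in the two directed modes the LSB is not consulted, which matches the column headings of the respective tables.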
In addition, the conventional floating point addition/subtraction arithmetic unit, as described above, performs the exponent alignment, addition/subtraction operation, normalization and rounding-off.
Here, in the floating point addition/subtraction operation, the fractions of the two operands are treated as positive values.
Accordingly, when the operand A is positive and the operand B is negative, the addition of the operands A and B is performed as a subtraction of the magnitude of B from the magnitude of A.
To begin with, the alignment steps are:
1. compare the exponents e of the two operands, compute their difference, and transfer the larger exponent to the next step, and
2. shift the fraction f of the operand having the lesser exponent e to the right by the computed difference.
Second, the addition/subtraction operation steps are:
3. perform the addition/subtraction operation on the fractions f, computed in 2's complement form, and
4. convert the result into a positive one in case the result of the operation is negative. That is, if a subtraction yields a negative value, the value is converted into its absolute value after the subtraction.
Third, the normalization steps are:
5. compute the number of leading zeros in the result, and
6. execute the normalization.
In the case of addition, when a carry (overflow) takes place, the fraction is shifted to the right by one bit and 1 is added to the exponent e. In the case of subtraction, the fraction f after the operation is shifted to the left by the number of leading zeros, and the number of leading zeros is subtracted from the exponent.
Fourth, the rounding-off steps are:
7. perform a rounding-off operation with the Guard bit G, the Round bit R and the Sticky bit Sy, and
8. re-normalize when an overflow occurs in the result of the rounding-off operation. That is, the fraction f is shifted to the right by 1 bit and the exponent is increased by 1.
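The eight steps can be traced in a compact software model. The following Python sketch is an illustration under stated assumptions, not a description of any particular circuit: operands are tuples (s, e, m) where m is a 24-bit significand with the hidden 1 made explicit, only Round to Nearest is modeled, special values (zero inputs, denormals, infinities, NaNs) and exponent range checks are ignored, and the name `fp_add` and the (0, 0, 0) encoding of an exactly cancelled result are hypothetical.

```python
def fp_add(a, b, width=24):
    """Add two normalized numbers given as (sign, biased exponent, significand)."""
    (sa, ea, ma), (sb, eb, mb) = a, b
    # Steps 1-2: keep the larger exponent, shift the other fraction right.
    if ea < eb:
        (sa, ea, ma), (sb, eb, mb) = (sb, eb, mb), (sa, ea, ma)
    d = ea - eb
    ma, mb = ma << 3, mb << 3                  # three extra bits: G, R, Sy
    sticky = 1 if mb & ((1 << d) - 1) else 0   # OR of the bits shifted out
    mb = (mb >> d) | sticky
    # Step 3: add or subtract the fractions depending on the signs.
    s, m = sa, (ma + mb if sa == sb else ma - mb)
    # Step 4: a negative difference is converted to its absolute value.
    if m < 0:
        s, m = sb, -m
    if m == 0:
        return (0, 0, 0)                       # exact cancellation (assumed encoding)
    e = ea
    # Steps 5-6: normalization.
    if m >> (width + 3):                       # carry out: shift right, exponent + 1
        m = (m >> 1) | (m & 1)                 # keep the sticky information
        e += 1
    while not m >> (width + 2):                # leading zeros: shift left
        m, e = m << 1, e - 1
    # Step 7: Round to Nearest with the LSB, G, R and Sy.
    lsb, g, r, sy = (m >> 3) & 1, (m >> 2) & 1, (m >> 1) & 1, m & 1
    m >>= 3
    if g and (r or sy or lsb):
        m += 1
    # Step 8: re-normalize if the increment overflowed the significand.
    if m >> width:
        m, e = m >> 1, e + 1
    return (s, e, m)
```

For instance, adding 2^-24 to the largest significand 0xFFFFFF at exponent 127 is a tie with an odd LSB, so step 7 increments, the significand overflows, and step 8 yields 0x800000 at exponent 128. This is the re-normalization case that makes rounding-off a separate serial stage.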
U.S. Pat. No. 4,896,286, as shown in FIG. 4, discloses a conventional floating point addition/subtraction arithmetic unit that is partially modified to exploit characteristics of the addition/subtraction of the fraction.
First, the characteristics of the addition/subtraction of the fraction will be explained, and then the floating point addition/subtraction arithmetic unit using these characteristics will be explained.
Referring to FIG. 4, the floating point addition/subtraction operation consists of six steps: an exponent alignment, an addition/subtraction operation in the fraction, an absolute value computation, a normalization, a rounding-off and an overflow process.
In an addition of the fractions when the two exponents are the same, the result of the addition operation is positive, so that the floating point addition operation can be executed through the exponent alignment, addition operation, normalization, rounding-off and overflow process.
In a subtraction of the fractions when the two exponents are the same, the Guard bit G, the Round bit R and the Sticky bit Sy which are needed in the rounding-off process are all zero, so that the floating point subtraction operation can be executed by only the exponent alignment, subtraction operation in the fraction, absolute value computation and normalization.
When the exponents differ, in a structure that subtracts the smaller number from the larger number, the result value of the subtraction is positive. Therefore, when the exponents of the two operands are different from each other, the result value of the addition/subtraction operation is positive, so that the floating point addition/subtraction operation can be executed by the exponent alignment, addition/subtraction operation in the fraction, normalization, rounding-off process and overflow process.
Referring to FIG. 4, there is shown a schematic configuration of a conventional floating point arithmetic unit. The floating point addition/subtraction operation is executed through four steps.
For performing the four steps, there are provided a data alignment circuit 10, an addition/subtraction operation circuit 11, a normalization circuit 12, and a rounding-off circuit 13 together with an overflow processing circuit 14.
Here, the data alignment circuit 10 compares the values of the biased exponents to compute their difference, and shifts the fraction of the operand having the smaller exponent to the right by that difference, thus aligning the operands.
The addition/subtraction operation circuit 11 adds or subtracts the two operands from the data alignment circuit 10 and converts a negative result into a positive one during the subtraction of the fraction.
The normalization circuit 12 includes a shifter for shifting the fraction. The shifter shifts the fraction to the right when the fraction overflows during an addition at the addition/subtraction operation circuit 11, or shifts the fraction to the left by the number of leading zeros after a subtraction, and the exponent is adjusted accordingly.
The rounding-off circuit 13 and the overflow processing circuit 14 perform the rounding-off with the LSB of the fraction, the Guard bit, the Round bit and the Sticky bit, which are obtained from the normalization circuit 12, and perform a 1-bit normalization in case an overflow occurs due to the rounding-off process.
The data alignment circuit 10 includes first and second registers 40 and 41, a first subtractor 42, a first multiplexer 43, a second multiplexer 44, a first shifter 45, a third multiplexer 46 and a fourth multiplexer 47.
The first and second registers 40 and 41 hold the two floating point numbers having sign bits s1 and s2, exponents e1 and e2, and fractions f1 and f2. The first subtractor 42 generates a difference signal ed, a borrow signal eb and an equality signal ex by subtracting the exponent e2 of the second register 41 from the exponent e1 of the first register 40. The first multiplexer 43 selects between the exponent e1 from the first register 40 and the exponent e2 from the second register 41 according to the borrow signal eb generated by the first subtractor 42. The second multiplexer 44 selects between the fraction f1 from the first register 40 and the fraction f2 from the second register 41 according to the difference signal ed generated by the first subtractor 42. The first shifter 45 shifts the fraction f1 or f2 selectively applied from the second multiplexer 44 by the difference signal ed. The third multiplexer 46 selects between the fraction f1 from the first register 40 and the output value of the first shifter 45 according to the borrow signal eb generated at the first subtractor 42. The fourth multiplexer 47 selects between the output value from the first shifter 45 and the fraction f2 from the second register 41 according to the borrow signal eb generated at the first subtractor 42.
The addition/subtraction operation circuit 11 includes first and second invertors 48 and 49 and a first adder 50. The first and second invertors 48 and 49 invert the outputs selectively inputted from the third and fourth multiplexers 46 and 47 according to control signals CPL1 and CPL2, which the main control circuit unit 61 generates according to the inputted arithmetic mode. The first adder 50 adds the values outputted from the first and second invertors 48 and 49.
The normalization circuit 12 includes a counter 51, a second shifter 52 and a second subtractor 53.
The counter 51 counts bits for normalizing a result of the first adder 50. The second shifter 52 generates the LSB, Guard bit G, Round bit R, and Sticky bit Sy, which are needed at a rounding-off process by shifting an output value of the first adder 50 by a value counted at the counter 51.
The rounding-off circuit 13 includes a rounding-off controller 54, a third invertor 55, a fifth multiplexer 56 and a second adder 57.
The rounding-off controller 54 determines the rounding-off according to the rounding-off mode with the LSB, Guard bit G, Round bit R and Sticky bit Sy, which are generated by the second shifter 52, under the control of the main control circuit unit 61. The fifth multiplexer 56 selects between the output value of the rounding-off controller 54 and the value 1 according to the control signal CPL3 generated at the main control circuit unit 61. The second adder 57 adds the output value of the third invertor 55 and the value selected at the fifth multiplexer 56.
The overflow process circuit 14 includes an overflow detector 58 for detecting whether an output value of the second adder 57 is an overflow or not, a third shifter 59 for shifting a result value outputted from the second adder 57 according to a result detected at the overflow detector 58, and an incrementor 60 for increasing an output value of the second subtractor 53 according to a result detected at the overflow detector 58.
In the conventional floating point arithmetic unit, the first and second registers 40 and 41 store the two floating point numbers having sign bits s1 and s2, exponents e1 and e2, and fractions f1 and f2. When the exponents e1 and e2 are outputted from the registers, the first subtractor 42 of the data alignment circuit 10 subtracts the exponent e2 of the second register 41 from the exponent e1 of the first register 40, and generates a difference signal ed, a borrow signal eb and an equality signal ex.
The borrow signal eb generated at the first subtractor 42 becomes zero when the exponent e1 is equal to or greater than the exponent e2, and becomes 1 when the exponent e1 is less than the exponent e2. It is inputted into the first to fourth multiplexers 43, 44, 46 and 47 and the main control circuit unit 61.
When the borrow signal eb generated at the first subtractor 42 is zero, the first multiplexer 43 selects the exponent e1 applied from the first register 40, and when the borrow signal eb is 1, it selects the exponent e2 applied from the second register 41; the selected exponent is applied to the second subtractor 53 of the normalization circuit 12.
Accordingly, the first multiplexer 43 selects the larger exponent.
When the borrow signal eb generated at the first subtractor 42 is 1, the second multiplexer 44 selectively outputs the fraction f1 of the first register 40, and when the borrow signal eb is zero, it selects the fraction f2 of the second register 41 and applies it to the first shifter 45.
Accordingly, the second multiplexer 44 selects the fraction of the smaller exponent.
The first shifter 45 shifts the fraction inputted from the second multiplexer 44 to the right by the difference signal ed generated at the first subtractor 42 and applies it to the third and fourth multiplexers 46 and 47.
When the borrow signal eb generated at the first subtractor 42 is zero, the third multiplexer 46 selects the fraction f1 of the first register 40 and inputs it into the first invertor 48 of the addition/subtraction operation circuit 11, and when the borrow signal is 1, it selects the output value shifted at the first shifter 45 and inputs it into the first invertor 48.
When the borrow signal eb generated at the first subtractor 42 is zero, the fourth multiplexer 47 selects an output value obtained by shifting at the first shifter 45 and applies it to the second invertor 49 of the addition/subtraction operation circuit 11, and when the borrow signal eb is 1, it selects the fraction applied from the second register 41 and inputs it into the second invertor 49.
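The selection logic of the data alignment circuit 10 can be modeled behaviorally. The Python sketch below is an assumption-laden illustration: the function name `align` is hypothetical, the difference signal ed is assumed to be the magnitude of e1 - e2, the shift is taken to be a right shift as in the alignment step described earlier, and the bits shifted out are simply dropped rather than collected into G, R and Sy.

```python
def align(e1: int, f1: int, e2: int, f2: int):
    """Return (larger exponent, adder input x, adder input y, eb, ex)."""
    diff = e1 - e2                          # first subtractor 42
    eb = 1 if diff < 0 else 0               # borrow: e1 < e2
    ex = 1 if diff == 0 else 0              # equality
    ed = abs(diff)                          # assumed: magnitude of the difference
    larger_exp = e2 if eb else e1           # first multiplexer 43
    small_frac = f1 if eb else f2           # second multiplexer 44
    shifted = small_frac >> ed              # first shifter 45
    x = shifted if eb else f1               # third multiplexer 46
    y = f2 if eb else shifted               # fourth multiplexer 47
    return larger_exp, x, y, eb, ex
```

Whichever operand has the smaller exponent, the two outputs stay in their original operand positions, so the following adder still sees the first operand on one input and the second operand on the other.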
The first and second invertors 48 and 49 of the addition/subtraction operation circuit 11 receive the control signals CPL1 and CPL2 generated at the main control circuit unit 61, invert the values outputted from the third and fourth multiplexers 46 and 47, respectively, and then apply them to the first adder 50. The first adder 50 adds the values from the first and second invertors 48 and 49, applies the sum to the second shifter 52 and the counter 51 of the normalization circuit 12, and outputs a control signal indicating whether the result is positive or negative.
The counter 51 of the normalization circuit 12 counts bits for normalization from a result of the first adder 50. At this time, when the number outputted from the first adder 50 is positive, the counter 51 counts the number to be shifted to the right, and when it is negative the counter 51 counts the number to be shifted to the left.
The second shifter 52, as the unit performing the normalization, shifts the result value of the first adder 50 by the count of the counter 51 and applies the LSB, Guard bit G, Round bit R and Sticky bit Sy to the rounding-off controller 54 and the fifth multiplexer 56 of the rounding-off circuit 13.
The second subtractor 53, as the unit adjusting the exponent, subtracts the count of the counter 51 from the exponent selected at the first multiplexer 43 and applies the result to the incrementor 60 of the overflow process circuit 14.
The rounding-off controller 54 of the rounding-off circuit 13 receives the LSB, Guard bit G, Round bit R and Sticky bit Sy needed for the rounding-off from the second shifter 52 of the normalization circuit 12, determines the rounding-off according to the rounding-off mode, and applies the result to the fifth multiplexer 56. The fifth multiplexer 56 selects between the result value of the rounding-off controller 54 and the value 1 according to the control signal CPL3 applied from the main control circuit unit 61, and inputs the selected value into the second adder 57.
In addition, the third invertor 55 of the rounding-off circuit 13 inverts the value generated from the second shifter 52 when that value is negative, according to the control signal CPL3 applied from the main control circuit unit 61, and inputs it into the second adder 57.
The second adder 57 adds the result value from the third invertor 55 and the result value from the fifth multiplexer 56 and applies the sum to the third shifter 59 and the overflow detector 58. The overflow detector 58 detects whether the output value of the second adder 57 has overflowed, and if an overflow has occurred, it controls the third shifter 59 accordingly.
When an overflow is detected at the overflow detector 58, the third shifter 59 shifts the result value from the second adder 57 by 1 bit to the right and outputs the final fraction f. At the same time, the incrementor 60 increases the result value from the second subtractor 53 by one and outputs the final exponent e.
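The interplay of the second adder 57, the overflow detector 58, the third shifter 59 and the incrementor 60 can be summarized in a few lines. This Python sketch is a behavioral illustration only; the name `round_and_fix` is hypothetical, the fraction is assumed to be a 24-bit positive significand, and the inversion path through the third invertor 55 is not modeled.

```python
def round_and_fix(frac: int, exp: int, increment: bool, width: int = 24):
    """Apply the rounding increment, then undo any overflow it caused."""
    if increment:
        frac += 1            # second adder 57 adds the rounding value
    if frac >> width:        # overflow detector 58: a carry out of the MSB
        frac >>= 1           # third shifter 59: one bit to the right
        exp += 1             # incrementor 60: exponent plus one
    return frac, exp
```

The overflow can only occur when the fraction is all ones before the increment, yet the detector, shifter and incrementor must sit in the path for every operation.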
The main control circuit unit 61 controls the first to third invertors 48, 49 and 55 and the fifth multiplexer 56 with the sign bits s1 and s2 of the first and second registers 40 and 41, the borrow signal eb of the first subtractor 42, the equality signal ex and the output value of the first adder 50.
However, the floating point addition/subtraction arithmetic unit of FIG. 4 performs the exponent alignment (data alignment), addition/subtraction operation, normalization and rounding-off in sequence. The rounding-off process needs an additional adder, and furthermore a re-normalization may be required due to the rounding-off.
U.S. Pat. No. 4,562,553, as shown in FIG. 5, discloses a conventional floating point arithmetic unit capable of rounding off without using an incrementor, by predicting the rounding-off during the floating point addition/subtraction.
As shown therein, there are provided a memory 100, an exponent comparator 101, a first mantissa selection memory 102, a second mantissa selection memory 103, a floating point adder circuit 104, an anticipating overflow and rounding-off circuit 105, an addition register 106, a leading zero counter 107, a mantissa normalization shift register 109, a mantissa register 111, an exponent correction circuit 108 and an exponent register 110.
In the arrangement described above, the memory 100 stores the two operands. The exponent comparator 101, as a data alignment circuit, compares the exponents A and B applied from the memory 100 and outputs the result. The first mantissa selection memory 102, as a data alignment circuit, shifts the fraction of the operand having the smaller exponent, applied from the memory 100, by the difference of the two exponents obtained from the exponent comparator 101, and generates the Guard bit G, Round bit R and Sticky bit Sy, which serve as the data for judging the rounding-off. The second mantissa selection memory 103 selects and stores the fraction of the operand having the larger exponent, applied from the memory 100, according to the result signal obtained from the exponent comparator 101. The floating point adder circuit 104 adds or subtracts the fraction values inputted from the first and second mantissa selection memories 102 and 103, and also adds or subtracts the rounding-off amount obtained from the anticipating overflow and rounding-off circuit 105. The anticipating overflow and rounding-off circuit 105 detects the overflow from the floating point adder circuit 104 and anticipates the rounding-off using partial signals generated therein. The addition register 106 stores the bits from the MSB to the Guard bit G among the output values computed at the floating point adder circuit 104. The leading zero counter 107 is a normalization circuit that counts the leading zeros of the value stored in the addition register 106, depending on the overflow detected at the anticipating overflow and rounding-off circuit 105. The mantissa normalization shift register 109 shifts the output value of the addition register 106 by the value counted at the leading zero counter 107. The mantissa register 111 stores the output value of the mantissa normalization shift register 109 and outputs the result value of the fraction.
The exponent correction circuit 108 corrects the exponent value by increasing or decreasing it by the counted value of the leading zero counter 107, according to whether an overflow is detected at the anticipating overflow and rounding-off circuit 105, and outputs the corrected value. The exponent register 110 stores the exponent value corrected at the exponent correction circuit 108 and outputs the final result of the exponent.
The floating point arithmetic unit of FIG. 5 compares the exponents A and B of the two operands, which are stored in the memory 100 at the exponent comparator 101 which is the data alignment circuit, applies the fraction of the smaller exponent to the first mantissa selection memory 102, applies the fraction of the larger exponent to the second mantissa selection memory 103, and applies the greater exponent to the exponent correction circuit 108.
The first mantissa selection memory 102 selects the fraction of the smaller exponent between the two fractions A and B inputted from the memory 100, shifts the selected number to the right by the difference of the two exponents inputted from the exponent comparator 101, inputs the values $S_0$ to $S_{31}$ into the floating point adder circuit 104, and generates the Guard bit G, the Round bit R and the Sticky bit Sy, which are the data for judging the rounding-off.
The second mantissa selection memory 103 inputs the fraction of the larger exponent among the two fractions A and B inputted from the memory 100 into the floating point adder circuit 104.
At this time, the anticipating overflow and rounding-off circuit 105 determines the rounding-off and its location according to the operation result of the floating point adder circuit 104 and applies them to the floating point adder circuit 104.
The floating point adder circuit 104 adds or subtracts the fraction value $S_0$ to $S_{31}$ of the smaller exponent inputted from the first mantissa selection memory 102 and the fraction value $L_0$ to $L_{31}$ of the larger exponent inputted from the second mantissa selection memory 103, also adds or subtracts the rounding-off amount supplied by the anticipating overflow and rounding-off circuit 105, and applies the result value $\Sigma_0$ to $\Sigma_{31}$ to the addition register 106, which is part of the normalization circuit.
The addition register 106 stores numbers from the MSB to Guard bit G among output values of the floating point adder circuit 104 and inputs them into the leading zero counter 107 and the mantissa normalization shift register 109.
At this time, the anticipating overflow and rounding-off circuit 105 detects the overflow from the floating point adder circuit 104, and enables the leading zero counter 107. The leading zero counter 107 counts the value outputted from the addition register 106 and inputs it into the exponent correction circuit 108 and the mantissa normalization shift register 109. The mantissa normalization shift register 109 shifts the value outputted from the addition register 106 by 1 bit according to the leading zero counter 107 and inputs it into the mantissa register 111. The exponent correction circuit 108 increases the value of the larger exponent inputted from the exponent comparator 101 by 1 and inputs it into the exponent register 110.
If no overflow occurs at the floating point adder circuit 104, the leading zero counter 107 counts the number of leading zeros of the addition register 106. The mantissa normalization shift register 109 shifts the outputted value of the floating point adder circuit 104 to the left by the counted number and inputs it into the mantissa register 111. The exponent correction circuit 108 subtracts the counted number from the value of the larger exponent supplied by the exponent comparator 101 and inputs the result value into the exponent register 110.
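The two normalization cases, overflow and leading zeros, can be put side by side in a short model. This Python sketch is illustrative only: the name `normalize` is hypothetical, the sum is assumed to be a non-negative integer at most one bit wider than `width`, the bit dropped by the right shift is ignored, and returning (0, 0) for a zero sum is an assumed convention.

```python
def normalize(sum_bits: int, exp: int, width: int = 24):
    """Return the normalized fraction and the corrected exponent."""
    if sum_bits == 0:
        return 0, 0                           # assumed encoding of a zero result
    if sum_bits >> width:                     # overflow: carry out of the MSB
        return sum_bits >> 1, exp + 1         # shift right by 1, exponent + 1
    lz = width - sum_bits.bit_length()        # leading zero count
    return sum_bits << lz, exp - lz           # shift left, exponent - count
```

A sum of 0x200000 in a 24-bit field has two leading zeros, so it is shifted left by two and the exponent drops by two.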
Accordingly, the value outputted from the exponent register 110 is a result value of the exponent. The value outputted from the mantissa register 111 is a result value of the mantissa.
The rounding-off process in the floating point arithmetic unit of FIG. 5 uses the signals generated at the floating point adder circuit 104. It does not need an additional incrementor for the rounding-off process, thus reducing the number of gates needed. However, because the addition/subtraction for rounding off is performed in the same floating point adder circuit that performs the fraction addition/subtraction, using partial signals generated in that circuit, a carry propagation delay may occur, so that there is not much difference from the conventional floating point addition/subtraction arithmetic unit in terms of structure and processing delay time. Also, it is developed only for the Round to Near/up method, which is not an IEEE standard rounding method.
U.S. Pat. No. 4,926,370, as shown in FIG. 6, discloses a floating point arithmetic unit that concurrently performs a normalization and a rounding-off after the operation of the fraction.
Referring to the figure, there are shown a normalization circuit 200, a control logic circuit 201, a selection circuit 202, a rounding-off circuit 203, and a selective output circuit 204.
The normalization circuit 200 obtains the number of leading zeros through the zero detector 200a from the result value of the fraction inputted from the outside register after the fraction operation, and shifts the result value by that number of leading zeros in the shifter 200b. The control logic circuit 201 detects the upper 2 bits of the result value of the fraction inputted from the outside register after the fraction operation, generates a normalization selection control signal NS and a rounding-off selection control signal RS, and outputs the detected values 1.X or 0.1X. The selection circuit 202 shifts the result value of the fraction inputted from the outside register by 1 bit according to the value detected at the control logic circuit 201. The selective output circuit 204 selects the value normalized at the normalization circuit 200 when the normalization selection control signal NS is generated by the control logic circuit 201, selects the value rounded off at the rounding-off circuit 203 when the rounding-off selection control signal RS is generated, and inputs the selected value into the outside register.
According to the above floating point arithmetic unit of FIG. 6, the upper 2 bits are detected after the fraction operation. If the result is 1.X the rounding-off process is performed. If the result is 0.1X, the rounding-off process is performed after 1 bit is shifted to the left. If the result is 0.0X, the normalization of the result value of the fraction directly inputted is performed since it does not need the rounding-off.
As shown in FIG. 6, when the result value of the fraction is inputted from the outside register and the upper 2 bits of the fraction are 0.0X, the zero detector 200a of the normalization circuit 200 obtains the number of leading zeros of the fraction, the shifter 200b shifts the fraction by that number, and the normalized value is inputted into the selective output circuit 204.
The control logic circuit 201 detects the upper 2 bits of the fraction inputted from the outside register after the fraction operation. If they are of the type 0.0X, it inputs the normalization selection control signal NS into the selective output circuit 204 so that the value normalized at the normalization circuit 200 is selected. If the upper 2 bits are 0.1X or 1.X, it generates the rounding-off selection control signal RS, inputs the control signal RS into the selective output circuit 204, and inputs 0.1X or 1.X into the selection circuit 202.
The selection circuit 202, when 0.1X is inputted from the control logic circuit 201, shifts the result value of the fraction inputted from the outside register by 1 bit and inputs it into the rounding-off circuit 203. When 1.X is inputted, the selection circuit 202 inputs the result value of the fraction into the rounding-off circuit 203 without shifting.
Accordingly, the rounding-off circuit 203 rounds off the result value of the fraction operation inputted from the selection circuit 202 and inputs it into the selective output circuit 204. The selective output circuit 204 selects the value rounded off at the rounding-off circuit 203 according to the rounding-off selection control signal RS, which occurs only in case the upper two bits checked at the control logic circuit 201 are 0.1X or 1.X, and selects the value from the normalization circuit 200 according to the normalization selection control signal NS, which occurs in case the upper two bits are 0.0X. The selective output circuit 204 outputs the selected value to the outside register.
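The three-way decision on the upper 2 bits can be written out as a small classifier. The Python sketch below is a hypothetical model: the name `select_path`, the string labels and the default width are illustrative, and the fraction is assumed to be a `width`-bit integer whose top bit is the integer bit and whose next bit is the first fraction bit.

```python
def select_path(frac: int, width: int = 25):
    """Classify the fraction result as 1.X, 0.1X or 0.0X and pick a path."""
    top2 = (frac >> (width - 2)) & 0b11
    if top2 & 0b10:                                  # 1.X: round as is
        return "round", frac
    if top2 == 0b01:                                 # 0.1X: shift left by 1, then round
        return "round", (frac << 1) & ((1 << width) - 1)
    return "normalize", frac                         # 0.0X: normalization only
```

Only three patterns are distinguished here; a result with a carry above the integer bit would need a wider field and a fourth case.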
However, even in the floating point arithmetic unit of FIG. 6, an overflow may occur due to the rounding-off process, thus causing a re-normalization. In addition, problems can arise because the unit does not consider results of the type 1X.X, which may come from the fraction operation.