1. Field of the Invention
The present invention relates to floating point arithmetic units having a rounding processing function, and particularly to a structure of a floating point arithmetic unit for reducing the amount of hardware, and circuitry required therefor.
2. Description of the Background Art
A floating point arithmetic unit having a rounding processing function must handle two kinds of processing after an arithmetic operation: rounding, and processing of the lost-significant bit caused by subtraction. The two are handled separately for the following reason. In an operation result from which significant bits are lost, at least three zeroes are arranged in sequence from the most significant bit, so that zeroes are arranged in the lower bits when the operation result is normalized (shifted to the left). Accordingly, in the case of lost-significant bit, rounding-down is performed irrespective of the rounding processing mode, whereby the rounding processing and the lost-significant bit processing do not conflict with each other. A conventional floating point arithmetic unit therefore arranges the respective processing systems independently.
FIG. 6 is a schematic diagram showing an operational processing block for a mantissa part in such a floating point arithmetic unit.
Here, an operation of a double format of the IEEE standard for floating point arithmetic, which is a representative standard of a floating point arithmetic processor, will be described as one example. In the IEEE standard, the significant bit width of a mantissa part is determined to be 54 bits.
The floating point arithmetic unit shown in FIG. 6 includes an arithmetic operation portion 13, a rounding processing portion A, and a lost-significant bit processing portion B. Arithmetic operation portion 13 performs arithmetic operation for a mantissa part of floating point data. The arithmetic operation result is transmitted through a bus 21 to rounding processing portion A and lost-significant bit processing portion B.
Rounding processing portion A includes a 54×3 normalization shifter 16, an incrementer 17, a 54×2 normalization shifter 18, rounded signal/lost-significant bit determination logic 19 determining which of a rounding processing mode and a lost-significant bit processing mode is designated, and a selector 20. 54×3 normalization shifter 16 shifts an operation result of arithmetic operation portion 13 to normalize the same. Specifically, 54×3 normalization shifter 16 shifts the operation result so that "1" is located at the most significant bit of its mantissa part, to generate a round-down signal. Incrementer 17 adds "1" to the least significant bit of the operation result normalized by 54×3 normalization shifter 16 to generate a round-up signal. 54×2 normalization shifter 18 normalizes the round-up signal from incrementer 17, either passing it without modification or shifting it by one bit. The operation result normalized by 54×2 normalization shifter 18 is applied to selector 20 through a data bus 24. Rounded signal/lost-significant bit determination logic 19 provides to selector 20 a control signal for selecting the round-up signal or the round-down signal applied through a bus 23 or bus 24 in the rounding processing mode. Selector 20, having its input portion connected to buses 22-24, selects one of the buses in response to the control signal from rounded signal/lost-significant bit determination logic 19.
Lost-significant bit processing portion B includes a priority encoder 14 operating on a single bit basis and a 54×54 bit normalization shifter 15. Priority encoder 14 counts the number of zeroes of the operation result from arithmetic operation portion 13, in sequence from the most significant bit. 54×54 bit normalization shifter 15 shifts the operation result of arithmetic operation portion 13 to the left to normalize the same in response to the counted result of priority encoder 14.
The conventional operation processing block shown in FIG. 6 implements in hardware the algorithm described by David A. Patterson and John L. Hennessy in Computer Architecture: A Quantitative Approach, pp. A18-19, and is shown mainly with respect to the rounding operation processing of a mantissa part.
The operation of the operation processing block shown in FIG. 6 will now be described with respect to rounding processing and lost-significant bit processing. In rounding processing, an operation result of arithmetic operation portion 13, in which overflow or 1-bit underflow might occur, is initially normalized by 54×3 normalization shifter 16, and thereafter is subjected to rounding processing. The data after the normalization is taken as a round-down signal, and a round-up signal is calculated by incrementer 17. Incrementer 17, built from half adders, adds "1" to the least significant bit of the input data and provides the result. 54×2 normalization shifter 18 carries out normalization again for the case of overflow which might be caused by the round-up operation, to obtain a round-up signal. Finally, either the output signal of 54×3 normalization shifter 16, corresponding to the round-down signal, or the round-up signal provided from 54×2 normalization shifter 18 is selected, so that the rounding processing is implemented.
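The rounding path described above can be sketched in software as follows. This is a simplified behavioral model, not the circuit of FIG. 6: the function names, the `round_up` flag standing in for the decision of determination logic 19, and the returned exponent adjustment are all assumptions made for illustration, and bits shifted out on overflow are simply dropped (no guard or sticky bits are modeled).

```python
WIDTH = 54  # mantissa width used in the text


def normalize_3way(raw: int, width: int = WIDTH) -> tuple[int, int]:
    """Model of the 54x3 shifter: pick one of three shift positions.

    `raw` may carry one overflow bit (bit `width`) or have a single
    leading zero (1-bit underflow). Returns (mantissa, exponent_adjust).
    """
    if raw >> width:                      # overflow: shift right by one
        return raw >> 1, +1
    if (raw >> (width - 1)) & 1:          # already normalized: no shift
        return raw, 0
    return (raw << 1) & ((1 << width) - 1), -1  # 1-bit underflow: shift left


def round_mantissa(raw: int, round_up: bool, width: int = WIDTH) -> tuple[int, int]:
    """Produce round-down and round-up candidates, then select one."""
    down, exp_adj = normalize_3way(raw, width)   # round-down signal
    incremented = down + 1                       # incrementer adds 1 at the LSB
    if incremented >> width:                     # carry out of the MSB:
        up, up_adj = incremented >> 1, exp_adj + 1   # 54x2 shifter shifts by one
    else:
        up, up_adj = incremented, exp_adj            # 54x2 shifter passes through
    return (up, up_adj) if round_up else (down, exp_adj)
```

With a 4-bit width for readability, `round_mantissa(0b1111, True, width=4)` increments to `0b10000`, which the second normalization step shifts back to `0b1000` with an exponent adjustment of 1.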
In lost-significant bit processing, a path for lost-significant bit processing is completely independent of that for the aforementioned rounding processing. Selector 20 in the final stage selects a signal by discriminating between the lost-significant bit processing and the rounding processing. In this processing, priority encoder 14 counts the number of zeroes in the upper bits of the operation result of arithmetic operation portion 13, and 54×54 normalization shifter 15 in the next stage shifts the operation result of arithmetic operation portion 13 by the counted number of zeroes. As a result, "1" is positioned at the most significant bit of the mantissa part in the operation result of arithmetic operation portion 13.
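The count-then-shift behavior of priority encoder 14 and normalization shifter 15 can be illustrated with a minimal sketch. The function names and the handling of an all-zero input are assumptions for illustration; the text does not say how the hardware treats that case.

```python
WIDTH = 54  # mantissa width used in the text


def count_leading_zeros(value: int, width: int = WIDTH) -> int:
    """Count zeroes from the MSB downward, one bit at a time."""
    count = 0
    for bit in range(width - 1, -1, -1):
        if (value >> bit) & 1:
            break
        count += 1
    return count


def normalize(value: int, width: int = WIDTH) -> tuple[int, int]:
    """Shift left by the leading-zero count so that the MSB becomes 1.

    Returns (normalized_value, shift_amount). An all-zero input is
    returned unchanged (assumed behavior, not stated in the text).
    """
    shift = count_leading_zeros(value, width)
    if shift == width:
        return 0, shift
    return (value << shift) & ((1 << width) - 1), shift
```

For example, with an 8-bit width, `normalize(0b00000101, width=8)` counts five leading zeroes and returns `(0b10100000, 5)`.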
As described above, three normalization shifters 15, 16 and 18 for 54×59 (=54×54+54×3+54×2) bits, and priority encoder 14 for counting zeroes on a single bit basis are required in the conventional structure.
FIG. 7 is a schematic diagram showing in detail the circuit of priority encoder 14 shown in FIG. 6. A circuit having the same structure as priority encoder 14 shown in FIG. 6 is described, for example, in U.S. Pat. No. 4,785,421, Nov. 15, 1988, Sheet 1, FIG. 2. In FIG. 7, priority encoder 14 includes encoder circuits C0-C53 corresponding to the 54 bits of the mantissa part and connected in series. Encoder circuit C53 includes an inverter 30 receiving input data I53 of the most significant bit (MSB), NMOS transistors 28 and 29, and an AND gate 27 providing the encoded result of the MSB. Inverter 30 has its output node connected to the gate electrode of NMOS transistor 28. NMOS transistor 28 has one electrode connected to a power supply terminal 31, and the other electrode connected to one electrode of NMOS transistor 28 of encoder circuit C52 in the next stage. NMOS transistor 29 has its gate electrode connected so as to receive the input data I53, one electrode connected to a ground potential, and the other electrode connected to one electrode of NMOS transistor 28 of encoder circuit C52 in the next stage. AND gate 27 has one input node connected to power supply terminal 31, and the other input node connected so as to receive the input data I53. Encoder circuit C0 includes an AND gate 33 receiving input data I0 of the least significant bit (LSB) and input data I1 from encoder circuit C1. Respective encoder circuits C1-C52 have the same structure as that of encoder circuit C53, and are connected so as to receive a signal from the preceding stage instead of a signal from power supply terminal 31. That is, encoder circuits C1-C53 have respective NMOS transistors 28 connected in series.
In operation, when the input data I53 of the MSB is 0, NMOS transistor 28 is turned on, and NMOS transistor 29 is turned off. Power supply potential 31 is applied to an input node of AND gate 27, so that output node O53 attains a high level. When the input data I53 is 1, NMOS transistor 28 is turned off, and NMOS transistor 29 is turned on. Output node O53 attains a low level, and output node O52 of the adjacent encoder circuit C52 also attains a low level. A signal at a high level is propagated through NMOS transistor 28 to the encoder circuit in the next stage.
Priority encoder circuit 14 is thus structured and operated, thereby sequentially providing outputs of a high level while "0"s are input in sequence from the most significant bit, and providing outputs of a low level from the lower encoder circuits once a "1" is input.
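The input/output behavior just described can be modeled with a short sketch. This is a behavioral model only, not a gate-level description of FIG. 7: the function name `priority_encode` and the `chain` variable (standing in for the level propagated along the series-connected NMOS transistors 28) are assumptions, and every stage, including C0, is treated uniformly for simplicity.

```python
def priority_encode(bits: list[int]) -> list[int]:
    """Return one output per input bit, both listed MSB first.

    An output is high (1) only while every input from the MSB down to
    that position is 0; it is low (0) from the first 1 onward.
    """
    outputs = []
    chain = 1  # high level entering the MSB stage from the power supply
    for bit in bits:
        chain = chain & (1 - bit)  # the chain stays high only while inputs are 0
        outputs.append(chain)
    return outputs
```

For the input `[0, 0, 1, 0, 1]` the model yields `[1, 1, 0, 0, 0]`; the number of high outputs, two, equals the number of leading zeroes by which the normalization shifter would shift.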
As described above, in the conventional operation block for a mantissa part, the overflow and underflow that accompany operation processing complicate the hardware structure of the rounding processing portion and increase the amount of hardware in the normalization shifters. To solve this problem, one could reduce the number of priority encoders and normalization shifters by dividing the operation result of the arithmetic operation portion into a plurality of groups of bits and processing the operation result group by group. In this case, however, the number of operations increases, which decreases the operation processing speed.