1. Field of the Invention
The present invention relates to a floating point multiplier, and more particularly, to an apparatus and a method for performing rounding and addition in parallel for the four rounding modes of the IEEE standard in a floating point multiplier.
2. Discussion of the Related Art
Generally, a floating point arithmetic unit is required in graphic accelerators, digital signal processors, and high performance computer systems. As chip integration capability increases due to advances in semiconductor technology, it has become possible to put the floating point arithmetic unit on a single chip together with a central processing unit (CPU). As a result, the floating point arithmetic unit has gone beyond its original supplementary function and is now a principal element of the main arithmetic unit. When the floating point arithmetic unit is built on a single chip, only some primary arithmetic units such as an adder, a subtractor and a multiplier are built on the chip due to the limited space of the CPU, and additional software is used for further operations. Therefore, the floating point multiplication operation greatly influences overall floating point performance.
Meanwhile, processing of the fraction portion in the floating point multiplication operation includes four steps: multiplication, addition of the carry and sum produced by the multiplication, normalization, and rounding. Alternatively, the four steps are performed in the order of multiplication, addition, rounding, and normalization.
The IEEE standard defines two formats for expressing the floating point numbers on which the above steps are performed: 32-bit single precision and 64-bit double precision. The single precision format consists of a 1-bit sign, an 8-bit exponent and a 23-bit fraction. The double precision format consists of a 1-bit sign, an 11-bit exponent and a 52-bit fraction.
A floating point number according to the IEEE standard is expressed as follows.
<Equation 1>
$$A = (-1)^{s} \times 1.f \times 2^{e-\mathrm{bias}}$$
Here, s denotes the sign bit, f denotes the fraction expressed as an absolute value, and e denotes the exponent expressed in biased form. In a normalized fraction the most significant bit (MSB) is 1, so the MSB can be omitted from the representation of the floating point number and is called the hidden bit.
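As an illustration of Equation 1, the following minimal Python sketch splits a single precision value into its fields and evaluates the equation again. The function names are hypothetical and are not part of the described apparatus. The bias of 127 is the value used by IEEE single precision, and only normalized numbers are handled.

```python
import struct

def decode_single(x: float):
    """Split a value into IEEE single precision fields (s, e, f)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31             # 1-bit sign
    e = (bits >> 23) & 0xFF    # 8-bit biased exponent
    f = bits & 0x7FFFFF        # 23-bit fraction; the hidden 1 is not stored
    return s, e, f

def value_from_fields(s: int, e: int, f: int, bias: int = 127) -> float:
    """Evaluate Equation 1 for a normalized number (0 < e < 255)."""
    return (-1) ** s * (1 + f / 2**23) * 2.0 ** (e - bias)

s, e, f = decode_single(-6.5)
print(s, e, hex(f), value_from_fields(s, e, f))
```

Here -6.5 equals -1.625 × 2^2, so the stored exponent is 2 + 127 = 129 and the fraction field holds 0.625 × 2^23.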
For rounding according to the IEEE standard, a round bit and a sticky bit are generated as follows.
When the fractions A and B of two floating point numbers are multiplied together, a 2n-bit sum $S = s_{2n-1} s_{2n-2} \ldots s_0$ and a 2n-bit carry $C = c_{2n-1} c_{2n-2} \ldots c_0$ are generated. Adding S and C produces a 2n-bit result F. The result F consists of the high n+1 bits, which hold the fraction of the floating point number, and the low n-1 bits, which are discarded. The rounding is based on the low n-1 bits of the result F.
This information can be expressed by a round bit and a sticky bit, and the rounding modes designated by the IEEE standard can be performed using these bits. The round bit R is the MSB of the low n-1 bits of the result F, and the sticky bit Sy is the logical OR of the remaining low n-2 bits. Therefore, the result F can be expressed as follows.
<Equation 2>
$$\begin{aligned}
F &= C + S \\
  &= (c_{2n-1} c_{2n-2} \ldots c_{n-1} c_{n-2} \ldots) + (s_{2n-1} s_{2n-2} \ldots s_{n-1} s_{n-2} \ldots) \\
  &= f_{2n-1} f_{2n-2} \ldots f_{n-1} f_{n-2} \ldots f_0 \\
  &= f_{2n-1} f_{2n-2} \ldots f_{n-1} \, R \, Sy
\end{aligned}$$
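The decomposition in Equation 2 can be sketched in Python as follows. This is only an illustrative model that treats F as an integer. The function name split_sum is a hypothetical helper, not an element of the multiplier.

```python
def split_sum(F: int, n: int):
    """Split a 2n-bit sum F into (high n+1 bits, R, Sy) as in Equation 2."""
    high = F >> (n - 1)                          # f_{2n-1} ... f_{n-1}
    R = (F >> (n - 2)) & 1                       # round bit f_{n-2}
    Sy = int((F & ((1 << (n - 2)) - 1)) != 0)    # OR of f_{n-3} ... f_0
    return high, R, Sy

# n = 4, F = 1011 0110: high bits 10110, R = 1, Sy = OR(1, 0) = 1
print(split_sum(0b10110110, 4))
```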
The IEEE standard defines four rounding modes, i.e., round-to-nearest, round-to-zero, round-to-positive-infinity, and round-to-negative-infinity.
The four rounding modes are shown in the following Tables 1, 2, 3 and 4.
Table 1 shows how the IEEE rounding modes are applied according to the sign of the number.
TABLE 1
IEEE rounding mode            positive number      negative number
round-to-nearest              round-to-nearest     round-to-nearest
round-to-zero                 round-to-zero        round-to-zero
round-to-positive-infinity    round-to-infinity    round-to-zero
round-to-negative-infinity    round-to-zero        round-to-infinity
The rounding results of round-to-nearest for the LSB, R, and Sy are shown in Table 2.
TABLE 2
LSB    Round bit    Sticky bit    Round-off result
0      0            0             truncation
0      0            1             truncation
0      1            0             truncation
0      1            1             increment
1      0            0             truncation
1      0            1             truncation
1      1            0             increment
1      1            1             increment
The rounding results of round-to-zero for R and Sy are shown in Table 3.
TABLE 3
Round bit    Sticky bit    Round-off result
0            0             truncation
0            1             truncation
1            0             truncation
1            1             truncation
The rounding results of round-to-infinity for R and Sy are shown in Table 4.
TABLE 4
Round bit    Sticky bit    Round-off result
0            0             truncation
0            1             increment
1            0             increment
1            1             increment
Tables 2 to 4 show the rounding results of round-to-nearest, round-to-zero, and round-to-infinity, respectively, for the LSB, R, and Sy of the fraction generated after the multiplication, addition, and normalization steps of the floating point multiplication operation, that is, after all steps except rounding.
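The four tables can be condensed into two small functions, sketched below in Python. The mode strings and function names are assumptions chosen for illustration. The Boolean expressions are one possible encoding that reproduces the tables row by row, not necessarily the logic used in any particular circuit.

```python
def effective_mode(mode: str, negative: bool) -> str:
    """Table 1: reduce the four IEEE modes to three according to the sign."""
    if mode in ("nearest", "zero"):
        return mode
    # Rounding moves away from zero only when the sign matches the direction.
    away_from_zero = (mode == "positive-infinity") != negative
    return "infinity" if away_from_zero else "zero"

def round_increment(mode: str, lsb: int, R: int, Sy: int) -> int:
    """Tables 2 to 4: return 1 for increment and 0 for truncation."""
    if mode == "nearest":
        return R & (lsb | Sy)   # Table 2
    if mode == "zero":
        return 0                # Table 3
    if mode == "infinity":
        return R | Sy           # Table 4
    raise ValueError(mode)

print(effective_mode("positive-infinity", negative=True))   # zero
print(round_increment("nearest", lsb=1, R=1, Sy=0))          # 1
```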
FIG. 1 is a block diagram illustrating the processing of a fraction portion in a conventional floating point multiplier. The processing of the fraction portion includes multiplication, addition, rounding, and normalization.
The conventional floating point multiplier includes a modified Booth encoder (not shown), a Wallace tree/array 10, a sticky bit generator 20, a carry select adder 30, and a $C_{in}$ generator 40. The modified Booth encoder generates partial products from two n-bit binary operands. The Wallace tree/array 10 generates n+2 MSB carry/sum bits and n-2 LSB carry/sum bits from the partial products. The sticky bit generator 20 generates, from the n-bit binary operands, the sticky bit Sy as compensation information for the data lost from the fraction portion. The carry select adder 30 adds the n+2 MSB carry/sum bits of the Wallace tree/array 10. The $C_{in}$ generator 40 generates only the carry value from the n-2 LSB carry/sum bits. An n-bit result is output after the result of the carry select adder 30 passes through the rounding step and the normalization step.
The steps of multiplication, addition, rounding, and normalization will now be described in detail.
First, in the multiplication step, the partial products generated by the modified Booth encoder are reduced to a 2n-bit sum and a 2n-bit carry using the Wallace tree/array 10.
Since the addition step needs only the high n+2 bits, the full result of adding the low n-2 bits is not required. Therefore, only the carry and sum of the high n+2 bits generated in the multiplication step need to be added, and the low n-2 bits influence this addition only through the carry generated by adding their carry and sum.
If adding the carry and sum of the low n-2 bits produces a carry of 1, 1 is added to the result of adding the carry and sum of the high n+2 bits. If the carry is 0, 0 is added. The addition of the carry and sum of the high n+2 bits can be realized by the carry select adder 30. The addition of the carry and sum of the low n-2 bits can be realized by the $C_{in}$ generator 40, a logic circuit which generates only the carry of that addition. Therefore, the low-order portion of a 2n-bit adder can be replaced with the $C_{in}$ generator 40. The result of the addition can be expressed as follows.
<Equation 3>
$$f_{2n-1} \ldots f_{n-1} = (c_{2n-1} \ldots c_{n-2}) + (s_{2n-1} \ldots s_{n-2}) + c^{in}_{n-2}$$
Here, $c^{in}_{n-2}$ is the overflow value of $c_{n-3} \ldots c_0$ plus $s_{n-3} \ldots s_0$. In general, $c^{in}_{k}$ denotes the carry into bit k from bit k-1.
If D is defined as $D = (c_{2n-1} \ldots c_{n-1}) + (s_{2n-1} \ldots s_{n-1})$, then $f_{2n-1} \ldots f_{n-1}$ can be expressed as follows.
<Equation 4>
$$f_{2n-1} \ldots f_{n-1} = (c_{2n-1} \ldots c_{n-2}) + (s_{2n-1} \ldots s_{n-2}) + c^{in}_{n-2} = D + c^{in}_{n-1}$$
Here, the carry $c^{in}_{n-1} = \mathrm{overflow}(c_{n-2} + s_{n-2} + c^{in}_{n-2})$.
The function overflow(Z) returns 1 if the operation Z produces an overflow, and returns 0 otherwise.
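The split of the addition described by Equations 3 and 4 can be checked with a short Python model. The sketch below treats C and S as plain integers. The function names cin_generator and high_sum are hypothetical stand-ins for the $C_{in}$ generator 40 and the carry select adder 30. It illustrates the arithmetic identity only and is not a description of the circuits themselves.

```python
def cin_generator(C: int, S: int, k: int) -> int:
    """Carry into bit k: carry out of adding the low k bits of C and S."""
    mask = (1 << k) - 1
    return ((C & mask) + (S & mask)) >> k

def high_sum(C: int, S: int, n: int) -> int:
    """Equation 3: add the high n+2 bits plus the carry from the low n-2 bits."""
    k = n - 2
    return (C >> k) + (S >> k) + cin_generator(C, S, k)

def d_plus_carry(C: int, S: int, n: int) -> int:
    """Equation 4: D plus the carry into bit n-1."""
    D = (C >> (n - 1)) + (S >> (n - 1))
    return D + cin_generator(C, S, n - 1)

# Both forms agree with the corresponding high bits of the full 2n-bit addition.
n, C, S = 4, 0b01101011, 0b00110110
print(high_sum(C, S, n) == (C + S) >> (n - 2))       # True
print(d_plus_carry(C, S, n) == (C + S) >> (n - 1))   # True
```

The point of the split is that the low n-2 bits never need a full adder: only their carry out is consumed by the high-order addition.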
In the rounding step, if the MSB after the addition step is $f_{2n-1} = 1$, the result of the rounding is added to $f_{2n-1} \ldots f_{n}$. If the MSB after the addition step is $f_{2n-1} = 0$, the result of the rounding is added to $f_{2n-2} \ldots f_{n-1}$. If overflow occurs, a shift to the right by 1 bit and an increment of the exponent are required in the normalization step. If overflow does not occur, no shift is required. The former case is denoted right shift (RS) and the latter case is denoted no shift (NS).
The sticky bit Sy, which is used to determine the result of the rounding, is 0 if the sum of the numbers of trailing zeros of the two fraction portions input to the floating point multiplier is greater than n-2, and is 1 if that sum is smaller than n-2. The sticky bit Sy is obtained in parallel while the carry and sum are generated by multiplying the two fraction portions. Let the result value after the rounding step be $Q^{NS}$ in the NS case and $Q^{RS}$ in the RS case. The rounding position in the NS case is $f_{n-1}$ and the rounding position in the RS case is $f_{n}$. Therefore, the rounding position in the RS case is higher by 1 bit than that in the NS case. The result values $Q^{NS}$ and $Q^{RS}$ can be expressed as follows.
<Equation 5>
$$Q^{NS} = (f_{2n-1} \ldots f_{n-1}) + \mathrm{rounding}_{mode}(f_{n-1}, R, Sy)$$
$$Q^{RS} = (f_{2n-1} \ldots f_{n-1}) + 2 \times \mathrm{rounding}_{mode}(f_{n}, f_{n-1}, R \lor Sy)$$
Here, $\mathrm{rounding}_{mode}(f_{n-1}, R, Sy)$ denotes the result of rounding for the corresponding rounding mode. It is 1 if the rounding produces a carry, that is, an increment, and 0 otherwise. The input parameters of $\mathrm{rounding}_{mode}(f_{n}, f_{n-1}, R \lor Sy)$ in the RS case are the input parameters of the NS case shifted by 1 bit, corresponding to the 1-bit right shift.
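One of the inputs to the rounding function is the sticky bit Sy, which the description above derives from the trailing zeros of the two fraction portions rather than from the product. The following Python sketch compares that rule with a direct computation from the product. The function names are hypothetical. The text does not state the case in which the sum of trailing zeros equals n-2; the sketch assumes Sy = 0 in that case, which is what the direct computation gives.

```python
def trailing_zeros(x: int) -> int:
    """Number of trailing zero bits of a nonzero integer."""
    return (x & -x).bit_length() - 1

def sticky_from_operands(a: int, b: int, n: int) -> int:
    """Sticky bit predicted from the two n-bit fractions alone.

    Assumption: a sum of trailing zeros equal to n-2 is treated as Sy = 0.
    """
    return 0 if trailing_zeros(a) + trailing_zeros(b) >= n - 2 else 1

def sticky_from_product(a: int, b: int, n: int) -> int:
    """Reference: OR of the low n-2 bits of the 2n-bit product."""
    return int(((a * b) & ((1 << (n - 2)) - 1)) != 0)

# Compare both computations for all normalized 6-bit fractions.
n = 6
print(all(sticky_from_operands(a, b, n) == sticky_from_product(a, b, n)
          for a in range(1 << (n - 1), 1 << n)
          for b in range(1 << (n - 1), 1 << n)))
```

Because the prediction depends only on the operands, it can be evaluated while the partial products are still being reduced.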
Finally, in the normalization step, a 1-bit shift to the right is performed if the MSB of the result of the rounding is 1, while the high n bits are output without a shift if the MSB is 0. In the expressions herein, "∧" denotes the AND operation, "∨" denotes the OR operation, "⊕" denotes the exclusive OR operation, and "⊙" denotes the exclusive NOR operation.
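The rounding of Equation 5 followed by the normalization step can be modeled in Python as shown below. This is a sketch under stated assumptions: F is the 2n-bit integer sum of carry and sum, round-to-nearest is used as the default rounding function, and the names round_nearest and round_and_normalize are hypothetical. The returned exponent adjustment is 1 when a right shift was performed and 0 otherwise.

```python
def round_nearest(lsb: int, r: int, sy: int) -> int:
    """Round-to-nearest decision: 1 for increment, 0 for truncation."""
    return r & (lsb | sy)

def round_and_normalize(F: int, n: int, rounding_mode=round_nearest):
    """Return (n-bit fraction, exponent adjustment) for a 2n-bit sum F."""
    X = F >> (n - 1)                             # f_{2n-1} ... f_{n-1}
    f_n = (F >> n) & 1
    f_n1 = (F >> (n - 1)) & 1
    R = (F >> (n - 2)) & 1
    Sy = int((F & ((1 << (n - 2)) - 1)) != 0)
    q_ns = X + rounding_mode(f_n1, R, Sy)             # Equation 5, NS case
    q_rs = X + 2 * rounding_mode(f_n, f_n1, R | Sy)   # Equation 5, RS case
    Q = q_rs if (F >> (2 * n - 1)) & 1 else q_ns      # select by f_{2n-1}
    if (Q >> n) & 1:       # MSB of the rounded result is 1
        return Q >> 1, 1   # shift right by 1 bit, exponent incremented
    return Q, 0            # no shift

# n = 4: fractions 1011 and 1101 give 11 * 13 = 143 = 1000 1111
print(round_and_normalize(11 * 13, 4))   # (9, 1)
```

In the example, 143 / 16 = 8.9375 rounds to 9, that is, the fraction 1001 with the exponent increased by one.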
In the conventional floating point multiplication operation, whether it is performed in the order of multiplication, addition of carry and sum, normalization, and rounding, or in the order of multiplication, addition, rounding, and normalization, a separate high speed incrementer or adder is used for the rounding. In addition, when the order is multiplication, addition, normalization, and rounding, separate hardware is required for renormalization due to overflow during rounding. When the order is multiplication, addition, rounding, and normalization, separate hardware is required for performing rounding prior to normalization. For these reasons, the area of the arithmetic unit becomes large and the processing time becomes longer.