The squaring of values expressed as floating-point numbers in scientific and engineering calculations is frequently performed using a computer, and the capability of a computer to perform the required calculations is greatly affected by the processing speed of the multiplier provided for squaring the floating-point numbers. For this reason, various techniques have been devised to improve the processing speeds of multipliers used for squaring floating-point numbers.
An explanation will now be given for the square calculation of a floating-point number using an electronic circuit and a conventional method employed for improving calculation speed.
The multiplication of floating-point numbers requires two processes: the multiplication of the numerical values, and the rounding off of the resulting product. Conventional techniques designed to speed up the squaring of floating-point numbers usually target the multiplication of the numerical values.
First, an explanation will be given for the multiplication of eight-bit numbers represented by a (=a7, a6, a5, a4, a3, a2, a1 and a0) and b (=b7, b6, b5, b4, b3, b2, b1 and b0).
FIG. 6 is a diagram for explaining the multiplication of the numbers a and b. As is shown in FIG. 6, when a and b are multiplied, first, 64 (=8×8) product terms, a0b0 to a7b7, are generated from the individual bits of these numbers, and are sequentially added together. As an established circuit technique, a multiplier for performing this calculation is constructed using a Wallace tree and a binary adder.
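The generation and summation of the 64 product terms can be illustrated in software. The following is only a minimal behavioral sketch, not a model of the Wallace tree or of the circuit in FIG. 6; the function name `multiply_bitwise` and its parameters are hypothetical and chosen for illustration.

```python
def multiply_bitwise(a: int, b: int, width: int = 8) -> tuple[int, int]:
    """Multiply two unsigned width-bit numbers from their individual bits.

    Returns (product, number_of_product_terms). Hypothetical helper used
    only to illustrate the product-term array; real hardware would sum the
    terms with a Wallace tree and a binary adder rather than a loop.
    """
    product = 0
    terms = 0
    for i in range(width):
        for j in range(width):
            # One AND gate per product term a_i b_j.
            bit = ((a >> i) & 1) & ((b >> j) & 1)
            # The term a_i b_j has weight 2^(i+j), i.e. position s(i+j).
            product += bit << (i + j)
            terms += 1
    return product, terms
```

For eight-bit inputs the loop visits 8×8 = 64 product terms, matching the count given above.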
For a squaring calculation, a number is multiplied by itself, and in a normalized floating-point number the most significant bit (MSB) of the mantissa is always “1”. Therefore, the squaring of the eight-bit number a in FIG. 6 is performed as is shown in FIG. 7, where b=a and a7=1. For the product terms in FIG. 7, the relations
aiai=ai  (a)
aiaj=ajai  (b)
are established.
In equation (a), since like terms are multiplied, an AND gate is not required.
In equation (b), since the product term aiaj equals the product term ajai and the two are added at the same position, their sum is twice aiaj. The two product terms therefore need only be collated to form a single product term, which is entered at the position one level higher.
Conventionally, there is a well-known method whereby a Wallace tree can be simplified by using the symmetry of the product terms in a squaring multiplier. FIG. 8 is a diagram showing the state wherein the Wallace tree is simplified by using the symmetry of the product terms used for the squaring calculation in FIG. 7 to reduce the number of product terms.
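The effect of equations (a) and (b) on the whole square can be written out as a short derivation. This is only an illustrative restatement for the eight-bit case, using the bit weights 2^i as an assumed notation that does not appear in the figures.

```latex
\[
\begin{aligned}
a^2 &= \Bigl(\sum_{i=0}^{7} a_i\,2^{i}\Bigr)^{2}
     = \sum_{i=0}^{7} a_i a_i\,2^{2i} \;+\; \sum_{i \ne j} a_i a_j\,2^{i+j} \\
    &= \sum_{i=0}^{7} a_i\,2^{2i} \;+\; \sum_{0 \le j < i \le 7} a_i a_j\,2^{i+j+1}
\end{aligned}
\]
```

The first sum uses equation (a), and the second uses equation (b): each symmetric pair contributes one term whose weight is doubled, which corresponds to moving it one position higher.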
In FIG. 8, for example, since the product term at position s0 is only a0a0, equation (a) can be applied for this product term, and therefore, a0 is entered unchanged at position s0.
Then, since the product terms at position s1 are a1a0 and a0a1, equation (b) can be applied for these product terms, and therefore, the product term a1a0, obtained by collating the above product terms, is carried over and entered at one higher position, s2.
At position s2, there are three product terms, a2a0, a1a1 and a0a2. For these product terms, equation (a) can be applied for product term a1a1, and equation (b) can be applied for product terms a2a0 and a0a2. Therefore, at position s2, by applying equation (a) for a1a1, a1 is entered, and a2a0, obtained by applying equation (b) for the terms a2a0 and a0a2, is carried over and entered at position s3.
As a result, the 64 product terms in FIG. 7 are reduced to 36. Since the number of product terms is reduced, the number of arithmetic units constituting the squaring multiplier, and hence the circuit size, is also reduced. Thus, the accumulated processing delay is decreased and the processing speed of the squaring multiplier is increased.
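The reduction from 64 to 36 product terms can be checked with a small behavioral sketch. The function name `square_folded` is hypothetical, and the code only mirrors the placement of terms described for FIG. 8; it does not model the Wallace tree itself.

```python
def square_folded(a: int, width: int = 8) -> tuple[int, int]:
    """Square an unsigned width-bit number using the symmetry of product terms.

    Returns (square, number_of_product_terms). Hypothetical illustration only.
    """
    bits = [(a >> i) & 1 for i in range(width)]
    total = 0
    terms = 0
    for i in range(width):
        # Equation (a): a_i a_i = a_i, so no AND gate is needed; the bit is
        # entered unchanged at position s(2i).
        total += bits[i] << (2 * i)
        terms += 1
        for j in range(i):
            # Equation (b): a_i a_j and a_j a_i are collated into one term
            # and carried over to the next higher position, s(i+j+1).
            total += (bits[i] & bits[j]) << (i + j + 1)
            terms += 1
    return total, terms
```

For eight bits, the count is 8 diagonal terms plus 28 collated pairs, which gives the 36 terms stated above.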
For the binary adder that adds the above product terms, a circuit technique called Carry Look Ahead (CLA) is available, which uses a combinational circuit to generate a higher-order carry from a lower-order carry. This Carry Look Ahead circuit technique can reduce the delay resulting from the addition process performed by the adder.
Furthermore, as is described above, because in the multiplication of floating-point numbers the number of effective output bits must equal the number of effective input bits, a rounding off process is performed for the addition results obtained for the numerical values.
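The idea behind Carry Look Ahead can be sketched with the usual generate and propagate signals. The names `lookahead_carry` and `cla_add`, and the default 16-bit width, are assumptions made for illustration; the sketch evaluates each carry as a flat expression of the input bits and the initial carry, which is what a look-ahead circuit computes in parallel, whereas the loops here run sequentially.

```python
def lookahead_carry(g: list[int], p: list[int], c0: int, i: int) -> int:
    """Carry into bit i, computed directly from generate/propagate signals.

    Implements c_i = g_(i-1) | p_(i-1) g_(i-2) | ... | p_(i-1)...p_0 c0
    without using any intermediate carry.
    """
    result = 0
    prefix = 1  # running AND of the propagate signals above bit k
    for k in range(i - 1, -1, -1):
        result |= prefix & g[k]
        prefix &= p[k]
    return result | (prefix & c0)


def cla_add(x: int, y: int, width: int = 16, c0: int = 0) -> tuple[int, int]:
    """Add two unsigned width-bit numbers; returns (sum, carry_out)."""
    xb = [(x >> i) & 1 for i in range(width)]
    yb = [(y >> i) & 1 for i in range(width)]
    g = [a & b for a, b in zip(xb, yb)]  # generate: both input bits are 1
    p = [a ^ b for a, b in zip(xb, yb)]  # propagate: exactly one bit is 1
    total = 0
    for i in range(width):
        # Sum bit i is the propagate signal XOR the carry into bit i.
        total |= (p[i] ^ lookahead_carry(g, p, c0, i)) << i
    return total, lookahead_carry(g, p, c0, width)
```

Because no carry depends on a previously computed carry, the hardware delay does not grow bit by bit as it does in a ripple-carry adder; practical designs usually apply the look-ahead in small groups to limit gate fan-in.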
FIG. 9 is a flowchart for explaining the multiplication processing, including the rounding off process.
In FIG. 9, during the multiplication of floating-point numbers, first, addition is performed using the above method (step 901), and based on the results, the location of the MSB of the mantissa is established (step 902). Then, based on the location of the MSB, the location of a guard bit is established (step 903), and a round-off bit, a target for the rounding off process, is established (step 904). Thereafter, the rounding off process is actually performed for the round-off bit that is obtained as a result of the addition performed at step 901 (step 905). When a carry is generated as a result of the rounding off process, a “1” is added to the value of the exponent portion (step 906).
The above described calculation method and rounding off method used for floating-point numbers conform to the IEEE (Institute of Electrical and Electronics Engineers) 754 standard.
As is described above, various techniques have been provided for increasing the processing speed of squaring multipliers for floating-point numbers. But even so, currently, in line with requests that the processing capabilities of computers be improved, even greater increases in processing speed are being sought for squaring multipliers for floating-point numbers.
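The sequence of steps 901 to 906 can be sketched for an eight-bit mantissa as follows. This is a hedged illustration rather than the flow of FIG. 9 itself: the function name `square_and_round`, the use of a sticky bit, and the choice of round-to-nearest-even (the default IEEE 754 rounding mode) are assumptions, and the mantissa is assumed to be normalized with its MSB weighted as 1.

```python
def square_and_round(mantissa: int, exponent: int, width: int = 8) -> tuple[int, int]:
    """Square mantissa * 2^exponent and round back to a width-bit mantissa.

    Hypothetical sketch: the mantissa is an integer in [2^(width-1), 2^width)
    that represents the value mantissa / 2^(width-1), i.e. a number in [1, 2).
    """
    assert mantissa >> (width - 1) == 1, "mantissa must be normalized (MSB = 1)"
    product = mantissa * mantissa  # step 901: full 2*width-bit result
    exp = 2 * exponent
    # Step 902: the MSB of the product is at bit 2*width-1 or 2*width-2.
    if product >> (2 * width - 1):
        shift = width
        exp += 1  # product is in [2, 4), so renormalize
    else:
        shift = width - 1
    kept = product >> shift
    # Steps 903-904 (assumed mapping): the guard bit is the first discarded
    # bit; the sticky flag records whether any lower discarded bit is set.
    guard = (product >> (shift - 1)) & 1
    sticky = (product & ((1 << (shift - 1)) - 1)) != 0
    # Step 905: round to nearest, ties to even (assumed rounding mode).
    if guard and (sticky or (kept & 1)):
        kept += 1
    # Step 906: a carry out of the mantissa increments the exponent.
    if kept >> width:
        kept >>= 1
        exp += 1
    return kept, exp
```

In this sequential form the rounding cannot begin until the addition has finished and the MSB position is known, so each step adds to the overall delay.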
It is, therefore, one object of the present invention to provide a squaring multiplier for floating-point numbers for which the number of constituent arithmetic units is reduced by locally compressing the addition of the floating-point numbers (the addition of mantissas), and to provide increased processing speeds.
It is another object of the present invention to increase the processing speeds of squaring multipliers for floating-point numbers by performing in parallel the addition of floating-point numbers and the rounding off process performed for the addition results.