1. Field of the Invention
The present invention relates to a square root extraction algorithm and a square root extraction circuit used for three-dimensional graphics processing which requires numerical calculations, particularly vector normalization.
2. Description of the Background Art
Graphics processing employing vector normalization, principally light source calculations, uses the result of vector normalization (X/SQRT(X) where X is a vector and SQRT(X) is the square root of X) for processing. Thus, increasing the operation speed of the normalization is important for increasing the light source calculating speed. Attempts have been made to implement a square root extraction operation via software or special-purpose hardware. The software for the square root extraction operation requires no special hardware structure and hence necessitates no consideration for a circuit size (costs) when the LSI technique is applied thereto, but requires a large number of repetitive operations using an approximation algorithm. For this reason, the special-purpose hardware is used when a higher priority is given to a processing speed.
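The role of the square root in normalization can be sketched as follows. This is a minimal illustration only, assuming the usual Euclidean length (the square root of the sum of squared components) as the divisor; the function name `normalize` is hypothetical and does not appear in the source.

```python
import math

def normalize(v):
    """Scale a vector to unit length (assumed Euclidean norm)."""
    # The square root below is the operation the circuit is meant to speed up.
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return tuple(c / length for c in v)

unit = normalize((3.0, 4.0, 0.0))
```

One square root is needed per normalized vector, which is why its latency bounds the light source calculating speed.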
However, a conventional square root extraction circuit employing the square root extraction algorithm which determines conventional non-recovery type square roots has a hardware structure as disclosed in "Computer High-speed Operation System," Kindai Kagaku Sha Co., Ltd. Thus, to determine an N-digit square root, the conventional square root extraction circuit is subject to the following restrictions:
(1) N·(N+1)/2 adders are required.
(2) CAS cells (controllable add/subtract cells) must be used which have a more complicated internal structure as one-unit adders than do full adders.
(3) The operation of a digit of a given significance is not permitted to start until a carry output from the highest-order adder for the digit of the next higher significance (an extracted square root output for that digit) is determined. This decreases the operation speed.
The drawback (2) is described in detail hereinafter.
The CAS cell is a 4-input 4-output controllable add/subtract cell which receives data inputs A, B, a carry input CI, and a control input P to provide an addition (subtraction) output S and a carry output CO which satisfy the conditions described below, a data output B (equal to the data input B), and a control output P (equal to the control input P).
S = A ^ (B ^ P) ^ CI
CO = (A + CI)·(B ^ P) + A·CI
The symbol "^" means an exclusive-OR operation, and the symbols "+" and "·" in the expression for CO denote logical OR and logical AND, respectively. The control input (output) P indicates an addition when it is "0", and indicates a subtraction when it is "1". In this manner, the CAS cell is a circuit which functions to perform a 1-bit addition/subtraction.
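The behavior of the CAS cell can be checked with a short bit-level model. The sketch below is illustrative only; it takes the carry term in the CO expression to be the carry input CI, and the function name `cas_cell` is hypothetical.

```python
def cas_cell(a, b, ci, p):
    """Model one controllable add/subtract cell; every argument is a bit (0 or 1)."""
    bp = b ^ p                       # B passes unchanged when P = 0, inverted when P = 1
    s = a ^ bp ^ ci                  # addition (subtraction) output S
    co = ((a | ci) & bp) | (a & ci)  # carry output CO
    return s, co, b, p               # B and P are passed through to the outputs

# With P = 0 the cell is a plain full adder; with P = 1 it adds the inverted B.
s, co, b_out, p_out = cas_cell(1, 1, 0, 0)
```

The extra exclusive-OR on the B input is what makes the cell larger than a full adder.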
To determine the binary square root Q={0.q1 q2 q3 q4}2 of a binary number A={0.a1 a2 a3 a4 a5 a6 a7 a8}2, the conventional square root extraction algorithm determines whether the calculation for a digit of a given significance q(i+1) employs an addition or a subtraction, depending upon whether the value of the output digit of the next higher significance q(i) is "1" or "0". Thus, the conventional square root extraction circuit constructed such that the value of the square root extraction output digit of a given significance q(i) selectively determines the operation contents (addition or subtraction) in the CAS cells for the digit of the next lower significance q(i+1) is slow in operation speed and requires the CAS cells having the 1-bit addition/subtraction function.
FIG. 22 is a diagram of a square root extraction circuit employing the conventional algorithm.
As illustrated, two CAS cells are used for the output q1, four CAS cells for the output q2, six CAS cells for the output q3, and eight CAS cells for the output q4. In FIG. 22, an input shown as given to the middle of the top side of the block of each CAS cell corresponds to the data input A, an input shown as given obliquely to the upper-left corner of the block corresponds to the data input B, an input shown as given across the block corresponds to the control input P, an input shown as given to the right side of the block corresponds to the carry input CI, an output shown as provided from the left side of the block corresponds to the carry output CO, and an output shown as provided from the middle of the bottom side of the block corresponds to the addition (subtraction) output S. The CAS cell has a greater circuit size than that of a full adder and a half adder which are simple in construction, resulting in a complicated circuit structure of the conventional square root extraction circuit.
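The dependence of each step on the previously extracted digit can be illustrated with a textbook non-restoring square root on integers. This is a sketch of the general technique, not necessarily the exact arrangement of FIG. 22; the function name `nonrestoring_sqrt` is hypothetical, and the 8-bit fraction A and 4-bit root Q are treated here as the integers formed by their digits.

```python
def nonrestoring_sqrt(d, n):
    """Return the n-bit integer square root of the 2n-bit radicand d."""
    q = 0  # root digits extracted so far
    r = 0  # partial remainder, allowed to go negative (non-restoring)
    for i in range(n - 1, -1, -1):
        pair = (d >> (2 * i)) & 3          # next two radicand bits
        if r >= 0:
            # previous digit was 1 (or first step): subtract
            r = 4 * r + pair - (4 * q + 1)
        else:
            # previous digit was 0: add instead of restoring
            r = 4 * r + pair + (4 * q + 3)
        # the new digit is 1 exactly when the remainder is non-negative
        q = 2 * q + (1 if r >= 0 else 0)
    return q

root = nonrestoring_sqrt(0b10010000, 4)  # radicand 144
```

Because the choice between the two branches is known only after the preceding digit is settled, the steps cannot overlap, which is the speed restriction described in item (3).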
A first aspect of the present invention is intended for a square root extraction circuit for calculating binary input data (0.a(1) a(2) a(3) . . . a(n)) using a square root extraction algorithm to output binary square root data (0.q(1) q(2) q(3) . . . q(m)), the square root extraction algorithm including an algorithm for determining the square root data on the basis of the input data by only additions of square root partial data q(1) to q(m) in q(1) to q(m) order. According to the present invention, the square root extraction circuit comprises: first to mth digit calculating portions each including a plurality of adders connected in series so that carries are propagated therethrough, wherein respective ones of the adders which are connected in the last position in the first to mth digit calculating portions provide carry outputs serving as the square root partial data q(1) to q(m), respectively, in accordance with the square root extraction algorithm.
A second aspect of the present invention is intended for a square root extraction circuit for calculating binary input data (0.a(1) a(2) a(3) . . . a(n)) using a square root extraction algorithm to output binary square root data (0.q(1) q(2) q(3) . . . q(m)), the square root extraction algorithm including an algorithm for determining the square root data on the basis of the input data by only additions of square root partial data q(1) to q(m) in q(1) to q(m) order, the algorithm having preceding digit based operation portions for performing operations to output the square root partial data q(2) to q(m) by using the square root partial data q(1) to q(m−1) provided in their preceding digit positions as operation parameters. According to the present invention, the square root extraction circuit comprises: first to mth digit calculating portions including at least first to mth adder groups, respectively, each of the first to mth adder groups including a plurality of adders connected in series so that carries are propagated therethrough, wherein respective ones of the adders which are connected in the last position in the first to (p−1)th digit calculating portions (2≤p≤m) provide carry outputs serving as the square root partial data q(1) to q(p−1), respectively, in accordance with the square root extraction algorithm, and wherein the preceding digit based operation portions of the pth to mth digit calculating portions include carry output prediction circuits for performing logic operations based on the carry outputs from respective ones of the adders which are connected in the last position in the adder groups thereof and the square root partial data q(p−1) to q(m−1) provided in their preceding digit positions to output the square root partial data q(p) to q(m), respectively.
Preferably, according to a third aspect of the present invention, the square root extraction circuit of the second aspect further comprises: a rounding circuit for rounding square root data (0.q(1) q(2) q(3) . . . q(k−1)) (p≤k≤m) based on the square root partial data q(k) to q(m) outputted from the carry output prediction circuits of the kth to mth digit calculating portions to output rounded square root data (0.r(1) r(2) r(3) . . . r(k−1)).
Preferably, according to a fourth aspect of the present invention, in the square root extraction circuit of the second aspect, each of the second to mth adder groups comprises at least a pair of adders receiving respective external data, and at least a pair of adders each having a first input receiving an addition result from an adder included in an adder group provided in its preceding digit position, the two pairs of adders being connected in series so that carries are propagated therethrough; the carry output prediction circuit of the pth digit calculating portion performs a logic operation based on addition result information containing information associated with at least an addition result from the adder connected in the last position in the (p−1)th adder group in addition to the carry output from the adder connected in the last position in the pth adder group and the square root partial data q(p−1) provided in its preceding digit position, thereby to output the square root partial data q(p) and addition result information of the pth digit calculating portion; and the carry output prediction circuit of the ith digit calculating portion ((p+1)≤i≤m) performs a logic operation based on an addition result from the adder connected in the last position in the (i−1)th adder group and the addition result information of the (i−1)th digit calculating portion in addition to the carry output from the adder connected in the last position in the ith adder group and the square root partial data q(i−1) provided in its preceding digit position, thereby to output the square root partial data q(i) and addition result information of the ith digit calculating portion.
Preferably, according to a fifth aspect of the present invention, in the square root extraction circuit of the second aspect, each of the second to mth adder groups comprises at least a pair of adders receiving respective external data, and at least a pair of adders each having a first input receiving an addition result from an adder included in an adder group provided in its preceding digit position, the two pairs of adders being connected in series so that carries are propagated therethrough; the carry output prediction circuit of the pth digit calculating portion performs a logic operation based on addition result information containing information associated with at least an addition result from the adder connected in the last position in the (p−1)th adder group in addition to the carry output from the adder connected in the last position in the pth adder group and the square root partial data q(p−1) provided in its preceding digit position, thereby to output the square root partial data q(p) and addition result information of the pth digit calculating portion; the carry output prediction circuit of the ith digit calculating portion ((p+1)≤i≤(m−1)) performs a logic operation based on an addition result from the adder connected in the last position in the (i−1)th adder group and the addition result information of the (i−1)th digit calculating portion in addition to the carry output from the adder connected in the last position in the ith adder group and the square root partial data q(i−1) provided in its preceding digit position, thereby to output the square root partial data q(i) and addition result information of the ith digit calculating portion; and the carry output prediction circuit of the mth digit calculating portion performs a logic operation based on an addition result from the adder connected in the last position in the mth adder group and the addition result information of the (m−1)th digit calculating portion in addition to the carry output from the adder connected in the last position in the (m−1)th adder group and the square root partial data q(m−1) provided in its preceding digit position, thereby to output only the square root partial data q(m).
Preferably, according to a sixth aspect of the present invention, in the square root extraction circuit of the fourth aspect, the carry output prediction circuit of the ith digit calculating portion ((p+1)≤i≤m) comprises: logic operation means for performing the logic operation based on the addition result from the adder connected in the last position in the (i−1)th adder group and the addition result information of the (i−1)th digit calculating portion to output a plurality of logic results; and selection means for selectively outputting one of the logic results as the square root partial data q(i) and another one of the logic results as the addition result information of the ith digit calculating portion on the basis of the carry output from the adder connected in the last position in the ith adder group and the square root partial data q(i−1) provided in its preceding digit position.
Preferably, according to a seventh aspect of the present invention, in the square root extraction circuit of the sixth aspect, the selection means receives the carry output having a negative logic from the adder connected in the last position in the ith adder group.
Preferably, according to an eighth aspect of the present invention, in the square root extraction circuit of the second aspect, the square root extraction algorithm includes a step for adding fixed values to be added; and a fixed addition result is directly applied to an adder in each of the first to mth digit calculating portions without using an adder for adding the fixed values.
A ninth aspect of the present invention is intended for a floating-point square root extraction device for performing a square root extraction operation on floating-point input data including a mantissa and an exponent to output floating-point output data. According to the present invention, the floating-point square root extraction device comprises: exponent square root extraction means receiving exponent input data for performing the square root extraction operation on the exponent input data to output exponent square root data; a square root extraction circuit for calculating binary input data associated with mantissa input data (0.a(1) a(2) a(3) . . . a(n)) using a square root extraction algorithm to output mantissa square root data (0.q(1) q(2) q(3) . . . q(m)), the square root extraction algorithm including an algorithm for determining the mantissa square root data on the basis of the input data by only additions of square root partial data q(1) to q(m) in q(1) to q(m) order, the algorithm having preceding digit based operation portions for performing operations to output the square root partial data q(2) to q(m) by using the square root partial data q(1) to q(m−1) provided in their preceding digit positions as operation parameters, the square root extraction circuit comprising first to mth digit calculating portions including at least first to mth adder groups, respectively, each of the first to mth adder groups including a plurality of adders connected in series so that carries are propagated therethrough, wherein respective ones of the adders which are connected in the last position in the first to (p−1)th digit calculating portions (2≤p≤m) provide carry outputs serving as the square root partial data q(1) to q(p−1), respectively, in accordance with the square root extraction algorithm, and wherein the preceding digit based operation portions of the pth to mth digit calculating portions include carry output prediction circuits for performing logic operations based on the carry outputs from respective ones of the adders which are connected in the last position in the adder groups thereof and the square root partial data q(p−1) to q(m−1) provided in their preceding digit positions to output the square root partial data q(p) to q(m), respectively, the floating-point square root extraction device further comprising floating-point data output means for outputting the floating-point output data including exponent output data and mantissa output data on the basis of the exponent square root data and the mantissa square root data.
Preferably, according to a tenth aspect of the present invention, in the floating-point square root extraction device of the ninth aspect, the floating-point data output means includes output selection means receiving input data information indicating whether the floating-point input data is a normalized number or an unnormalized number, the output selection means for forcing the exponent output data to be "0" to output only the mantissa output data as the floating-point output data when the input data information indicates the unnormalized number.
Preferably, according to an eleventh aspect of the present invention, the floating-point square root extraction device of the ninth aspect further comprises: data shift means for performing a predetermined data shift processing on the mantissa input data to apply the resultant data as the binary input data to the square root extraction circuit when the exponent input data is an odd number, wherein the exponent square root extraction means includes: preliminary exponent square root extraction portion for performing a predetermined change-to-even-number processing on the exponent input data to provide an even number when the exponent input data is an odd number, the preliminary exponent square root extraction portion thereafter dividing the even number by 2 to output preliminary exponent square root data, the change-to-even-number processing and the predetermined data shift processing being performed so that the value of the floating-point input data is not changed, and an exponent square root data output portion for modifying the preliminary exponent square root data on the basis of rounding-based carry information to output the exponent square root data, and wherein the floating-point data output means includes mantissa data rounding means for rounding more significant digits of the mantissa square root data on the basis of a less significant digit of the mantissa square root data to output the mantissa output data and to output the rounding-based carry information indicating whether or not the mantissa square root data has a carry during rounding.
Preferably, according to a twelfth aspect of the present invention, in the floating-point square root extraction device of the eleventh aspect, the preliminary exponent square root extraction portion and the exponent square root data output portion are formed integrally.
As above described, the square root extraction circuit in accordance with the first aspect of the present invention uses the carry outputs from the adders connected in the last position in the first to mth digit calculating portions as the square root partial data q(1) to q(m), respectively, in accordance with the square root extraction algorithm for determining the square root data based on the input data only by the additions of the square root partial data q(1) to q(m) in q(1) to q(m) order. The square root extraction circuit is implemented using only the existing half adders and full adders to achieve a simple circuit structure.
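Why a carry output can serve directly as a result digit can be seen from a general property of two's complement arithmetic: when a − b is computed as a plus the bitwise inverse of b plus 1 on a chain of full adders, the final carry is 1 exactly when a ≥ b. The sketch below demonstrates only this general property and does not reproduce the adder arrangement of the invention; the names `full_adder` and `subtract_by_addition` are hypothetical.

```python
def full_adder(a, b, ci):
    """Ordinary 1-bit full adder returning (sum, carry_out)."""
    s = a ^ b ^ ci
    co = (a & b) | (a & ci) | (b & ci)
    return s, co

def subtract_by_addition(a, b, width):
    """Compute a - b as a + ~b + 1 with a ripple chain of full adders.

    Returns (difference modulo 2**width, carry out of the last adder).
    """
    carry = 1                                # the "+1" of the two's complement
    diff = 0
    for i in range(width):                   # least significant bit first
        a_bit = (a >> i) & 1
        nb_bit = ((b >> i) & 1) ^ 1          # inverted bit of b, a fixed wiring choice
        s, carry = full_adder(a_bit, nb_bit, carry)
        diff |= s << i
    return diff, carry

diff, carry = subtract_by_addition(9, 5, 4)  # carry is 1 because 9 >= 5
```

No add/subtract control input is needed in this arrangement, so plain full adders suffice and the comparison result appears as the carry of the last adder in the chain.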
The square root extraction circuit in accordance with the second aspect of the present invention uses the carry outputs from the adders connected in the last position in the first to (p−1)th digit calculating portions as the square root partial data q(1) to q(p−1), respectively, in accordance with the square root extraction algorithm for determining the square root data based on the input data only by the additions of the square root partial data q(1) to q(m) in q(1) to q(m) order. The pth to mth digit calculating portions include the carry output prediction circuits for performing the logic operations based on the carry outputs from the adders connected in the last position in the adder groups thereof and the square root partial data q(p−1) to q(m−1) provided in their preceding digit positions to output the square root partial data q(p) to q(m), respectively.
The square root extraction circuit of the second aspect, similar to that of the first aspect, is implemented using only the existing half adders and full adders to achieve a simple circuit structure.
Additionally, when the preceding digit based operation portion requires a plurality of additions using the square root partial data provided in the preceding digit position as the operation parameter, the preceding digit based operation portion may be comprised of only the single carry output prediction circuit. This allows the single carry output prediction circuit to perform the function of a conventional in-series connection of a plurality of adders for implementing the plurality of additions, accomplishing a more simplified circuit structure.
Although the plurality of adders connected in series must propagate carries therethrough, the single carry output prediction circuit may perform the logic operation without the carry propagation, improving the operation speed.
The square root extraction circuit in accordance with the third aspect of the present invention further comprises the rounding circuit for rounding the square root data based on the square root partial data q(k) to q(m) outputted from the carry output prediction circuits of the kth to mth digit calculating portions. This provides the output of the square root data with the rounding function.
In the square root extraction circuit in accordance with the fourth aspect of the present invention, the carry output prediction circuit of the ith digit calculating portion ((p+1)≤i≤m) performs the logic operation based on the addition result from the adder connected in the last position in the (i−1)th adder group and the addition result information of the (i−1)th digit calculating portion in addition to the carry output from the adder connected in the last position in the ith adder group and the square root partial data q(i−1), thereby to output the square root partial data q(i) and the addition result information of the ith digit calculating portion. Thus, the carry output prediction circuits of the (p+1)th to mth digit calculating portions may be implemented by the circuits which perform the same logic operation. The circuit size of each carry output prediction circuit does not increase even if the number of digits of the square root data increases.
In the square root extraction circuit in accordance with the fifth aspect of the present invention, the carry output prediction circuit of the mth digit calculating portion performs the logic operation based on the addition result from the adder connected in the last position in the mth adder group and the addition result information of the (m−1)th digit calculating portion in addition to the carry output from the adder connected in the last position in the (m−1)th adder group and the square root partial data q(m−1), thereby to output only the square root partial data q(m).
Thus, the carry output prediction circuit of the mth digit calculating portion should perform the logic operation which outputs only the square root partial data q(m), thereby to be of a more simplified circuit construction than other carry output prediction circuits.
In the square root extraction circuit in accordance with the sixth aspect of the present invention, the selection means selectively outputs one of the logic results as the square root partial data q(i) and another one of the logic results as the addition result information of the ith digit calculating portion on the basis of the carry output from the adder connected in the last position in the ith adder group and the square root partial data q(i−1).
The carry output from the adder connected in the last position in the ith adder group and the square root partial data q(i−1), which require a relatively long time to be determined, are used as selection control signals after the logic operation means provides the plurality of logic results. This increases the efficiency of the processing to improve the operation speed.
The logic operation means of the square root extraction circuit in accordance with the seventh aspect of the present invention receives the carry output having the negative logic from the adder connected in the last position in the ith adder group, requiring only one inverter to buffer the carry output.
In the square root extraction circuit in accordance with the eighth aspect of the present invention, the fixed addition result is directly applied to the adder in each of the first to mth digit calculating portions without using an adder for adding the fixed values. This provides for a more simplified circuit structure.
The floating-point square root extraction device in accordance with the ninth aspect of the present invention comprises the square root extraction circuit of the first or second aspect to simplify the circuit structure of the square root extraction circuit. The use of the square root extraction circuit of the second aspect improves the speed at which the mantissa output data is obtained.
In the floating-point square root extraction device in accordance with the tenth aspect of the present invention, the output selection means forces the exponent output data to be "0" to output only the mantissa output data as the floating-point output data when the input data information indicates the unnormalized number. This enables the square root extraction operation of the floating-point input data which is the unnormalized number.
The floating-point square root extraction device in accordance with the eleventh aspect of the present invention further comprises the data shift means for performing the predetermined data shift processing on the mantissa input data to apply the resultant data as the binary input data to the square root extraction circuit when the exponent input data is an odd number. The exponent square root extraction means includes the preliminary exponent square root extraction portion for performing the predetermined change-to-even-number processing on the exponent input data to provide an even number when the exponent input data is an odd number, the preliminary exponent square root extraction portion thereafter dividing the even number by 2 to output the preliminary exponent square root data. The change-to-even-number processing and the predetermined data shift processing are performed so that the value of the floating-point input data is not changed. This provides the efficient execution of the square root extraction operation by the preliminary exponent square root extraction portion without impairing the operation accuracy.
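The exponent handling described above can be sketched in software. The example below is an illustration under stated assumptions, not the circuit itself: it assumes the mantissa is shifted right by one bit while the odd exponent is incremented (the source only says the shift and the change to an even number are "predetermined"), it omits the rounding-based carry adjustment, and the function name `sqrt_via_even_exponent` is hypothetical. It relies on the standard library functions `math.frexp` and `math.ldexp`.

```python
import math

def sqrt_via_even_exponent(x):
    """Square root of a positive float by separate mantissa/exponent handling."""
    if x <= 0.0:
        raise ValueError("sketch handles positive inputs only")
    m, e = math.frexp(x)        # x == m * 2**e with 0.5 <= m < 1
    if e % 2 != 0:
        # Odd exponent: shift the mantissa and bump the exponent so that
        # m * 2**e keeps the same value while e becomes even (assumed direction).
        m /= 2.0
        e += 1
    # The exponent square root is a halving; the mantissa square root stands in
    # for the fixed-point square root extraction circuit.
    return math.ldexp(math.sqrt(m), e // 2)

result = sqrt_via_even_exponent(8.0)
```

Since the pair (mantissa, exponent) represents the same value before and after the adjustment, halving the even exponent introduces no loss of accuracy.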
In the floating-point square root extraction device in accordance with the twelfth aspect of the present invention, the preliminary exponent square root extraction portion and the exponent square root data output portion are formed integrally. This accordingly simplifies the circuit structure.
It is therefore an object of the present invention to provide a square root extraction circuit which achieves a simplified circuit structure and a higher operation speed.
These and other objects, features, aspects and advantages of the present invention will become more apparent from the following detailed description of the present invention when taken in conjunction with the accompanying drawings.