1. Field of the Invention
The present invention relates to a floating-point arithmetic system for performing various arithmetic operations, including addition and subtraction calculations, on a lot of data, i.e., various kinds of numbers, which are represented in a floating-point format, where each of the numbers is represented as an exponent and a mantissa.
More specifically, the present invention relates to a floating-point arithmetic system which enables a large amount of data to be processed at high speed by utilizing a floating-point format arithmetic means, which easily covers a wide range of numbers and which can be easily used by various kinds of measuring instruments, various kinds of digital circuits, and the like.
2. Description of the Related Art
With the recent progress of computer systems, the range of values of data (numbers), which can be handled in various kinds of measuring instruments, various kinds of digital circuits, and the like, tends to become wider and wider. To meet such a tendency, the arithmetic operations on numbers in the floating-point format (also referred to as floating-point data) which covers the wide range are likely to be generally used, rather than operations on data in the fixed-point representation.
Further, as the technical field to which these arithmetic operations on data in the floating-point representation can be applied is extended even toward the design of digital filters for processing digital signals or the design of spatial filters for image processing, the amount of data for which various arithmetic operations must be executed is likely to increase. Furthermore, it is also necessary for such arithmetic operations to be performed at relatively high speed, since these digital signals must be processed in real time by means of digital filters, etc. In other words, it has become necessary to provide a floating-point arithmetic system which is capable of rapidly processing large amounts of data (multiple-input data) in the floating-point representation.
In a floating-point arithmetic system according to the prior art, a plurality of two-input-type floating-point adder-subtractors, in which only two kinds of floating-point data are allowed to be input and addition and subtraction calculations thereof are executed, are connected in cascade. Further, by adequately combining these two-input-type floating-point adder-subtractors and a plurality of multipliers and a plurality of dividers with each other, a floating-point arithmetic system in which multiple-input data can be sequentially processed at every adder-subtractor is finally constructed.
In such a construction, to realize data processing at relatively high speed, it is necessary for the operating speed of each arithmetic unit, such as two-input-type floating-point adder-subtractors, multipliers and dividers, to be sufficiently high. Until now, with the progress of technology for fabricating an LSI (Large Scale Integrated Circuit), the improvement of operating speed of such two-input-type floating-point adder-subtractors, etc., has been performed relatively smoothly.
However, as described above, since arithmetic operations on data in the floating-point format have tended to be also applied to the field concerning the design of digital filters or the design of spatial filters for image processing, it is urgently required that larger amounts of data be processed at very high speed. To satisfy such a requirement, the amount of hardware must increase due to an increase in the number of the two-input-type floating-point adder-subtractors, and therefore the delay time caused by these floating-point adder-subtractors is not negligible.
Here, to clarify some problems regarding the prior art, the concrete construction of typical conventional floating-point arithmetic systems will be described with reference to the related drawings of FIGS. 1 to 9.
FIG. 1 is a diagram showing some examples in which data in the normalized floating-point representation are usually indicated; FIGS. 2(A) and 2(B) are flowcharts for each explaining the adding process of floating-point data according to the prior art; and FIG. 3 is a diagram showing some examples in which addition and subtraction calculations of floating-point data are executed by the process of FIGS. 2(A) and 2(B).
Typically, a value of floating-point data is indicated in a format as shown in FIG. 1. In FIG. 1, E denotes an exponent which is indicated in an offset representation. Namely, when a value of any exponent is indicated by means of six binary bits, "011111" denotes a zero ("0") that is a middle value, "000000" denotes "-31" that is a minimum value, and "111111" denotes "31" that is a maximum value. Further, S denotes a sign bit. If S=0, the sign bit represents a positive number, and if S=1, the sign bit represents a negative number. F denotes a mantissa which is indicated in two's complement representation.
The relationship between the respective actual values of X, F and E is represented as follows, by utilizing a normalization process.
A positive number X in the case where S=0 means that X=01.F*2.sup.E, while a negative number X in the case where S=1 means that X=10.F*2.sup.E. Further, if X is actually "0", the value of X is indicated as E=-31, S=0 and F=0.
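The format described above can be sketched as a small decoder. This is only an illustration under stated assumptions: the width of F (17 bits here) is not given in the text and is inferred from terms such as 2.sup.-17 in the examples of FIG. 3, the names `decode`, `EXP_BIAS` and `FRAC_BITS` are hypothetical, and the patterns 01.F and 10.F are read as two's complement values with two integer bits.

```python
from fractions import Fraction

EXP_BIAS = 31    # "011111" encodes exponent 0, per the offset representation
FRAC_BITS = 17   # assumption: width of F is not stated in the text


def decode(e_field, s, f_field):
    """Return the exact value of (E, S, F) as a Fraction."""
    exponent = e_field - EXP_BIAS
    # Zero is the special pattern E=-31, S=0, F=0.
    if exponent == -31 and s == 0 and f_field == 0:
        return Fraction(0)
    frac = Fraction(f_field, 1 << FRAC_BITS)
    # Two's complement with two integer bits:
    #   01.F -> 1 + F   (positive, in [1, 2))
    #   10.F -> -2 + F  (negative, in [-2, -1))
    mantissa = (1 + frac) if s == 0 else (-2 + frac)
    return mantissa * Fraction(2) ** exponent
```

For instance, `decode(0b011111, 1, 0)` gives -2, which is the operand -2.00.times.2.sup.0 under this reading.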
In performing arithmetic operations for such floating-point data, according to the prior art, at least one arithmetic unit of the two-input-type, such as a two-input-type floating-point adder-subtractor, is utilized. For example, an addition calculation by means of a two-input-type arithmetic unit is executed in accordance with the process of steps S1a to S9 (the following procedures 1 to 10) shown in FIGS. 2(A) and 2(B). Here, it should be noted that the subtracting process is also executed by a process similar to the adding process of steps S1a to S9. Further, it should be noted that the description will be made with reference to FIGS. 2(A) and 2(B) only for the case where the adding process of only two floating-point numbers is executed, in order to simplify the explanation of arithmetic operations.
Further, the adding process should be illustrated in one drawing of FIG. 2. However, in this case, since it is difficult for FIG. 2 to be contained in one sheet, FIG. 2 is divided into two drawings of FIGS. 2(A) and 2(B). FIG. 2(A) includes steps S1a to S4, while FIG. 2(B) includes steps S5 to S9.
1 Steps S1a and S1b . . . two floating-point numbers on which an addition is to be executed are assumed to be A, B. Further, exponents of these values A, B are assumed to be A(exp), B(exp), respectively, while mantissas of these values A, B are assumed to be A(man), B(man), respectively.
2 Step S2 . . . an exponent of the number B is subtracted from an exponent of the number A. Further, a comparison of the magnitude of the exponents is made between the numbers A and B, and it is determined which exponent has the larger value by discriminating whether a result of the above-mentioned subtraction E.sub.cmp becomes positive or negative. Consequently, the exponent having the larger value is defined as C(exp).
3 Step S3 . . . in accordance with the result of comparison in Step S2, digits of the mantissa corresponding to the exponent having a smaller value are shifted so that the digit positions of the above-mentioned mantissa can be adjusted to those of another mantissa corresponding to the exponent having a larger value.
4 Step S4 . . . after the adjustment process of the two mantissas in Step S3, the respective mantissas of the numbers A, B are added together. Further, a result of such an addition of these mantissas is defined as C(man) {=A(man)+B(man)}.
5 Step S5 . . . it is determined whether or not exception processing regarding the result of the addition of these mantissas C(man) in Step S4 is necessary. If the result of the addition of these mantissas C(man) becomes "0", or if it is unnecessary to execute the shift with respect to the result C(man) for normalization process, the addition process advances from S5 to Step S7, not via Steps S6a and S6b.
6 Steps S6a, S6b . . . if it is necessary to execute the right shift by 1 bit with respect to the result C(man) for normalization process, "1" is added to the exponent C(exp) defined in Step S2. On the contrary, if it is necessary to execute the left shift by n bits with respect to the result C(man) for normalization process, "n" is subtracted from the exponent C(exp) defined in Step S2.
7 Step S7 . . . it is determined whether or not exception processing regarding the exponent C(exp) calculated in Steps S2, S6a and S6b is necessary.
8 Step S8a . . . in the case where an underflow occurs in the exponent C(exp), or where a value of the exponent C(exp) becomes "0", all parts {C(man) and C(exp)} of the result of the addition of the two numbers A, B are set at "0".
9 Step S8b . . . in the case where an overflow occurs in the exponent C(exp), if a value of the exponent C(exp) is a negative number, the result of the addition of the two numbers A, B is set at the most negative value. On the other hand, if a value of the exponent C(exp) is a positive number, the result of the addition thereof is set at the most positive value.
10 Step S9 . . . in the case where the result of the addition of the two numbers A and B has a normal value, this result is indicated as C (=A+B) represented by C(man) and C(exp).
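The procedure of steps S1a to S9 can be sketched in software as follows. This is a minimal sketch and not the circuit of FIGS. 2(A) and 2(B). The assumptions are: a number is a pair (man, exp) with man a signed integer scaled by 2.sup.17 (a 17-bit F, which the text does not state); two guard bits are kept during alignment (`GUARD_BITS`, hypothetical) so that low-order bits survive the shift of Step S3; an exponent that reaches the minimum value -31 is treated as underflow; and the saturation direction in Step S8b is taken from the sign of the mantissa sum. The name `fp_add` is hypothetical.

```python
FRAC_BITS = 17                        # assumption: width of F
GUARD_BITS = 2                        # assumption: extra bits kept while aligning
EXP_MIN, EXP_MAX = -31, 31
ONE = 1 << (FRAC_BITS + GUARD_BITS)   # mantissa value 1.0 in the working scale
ZERO = (0, EXP_MIN)                   # E=-31, S=0, F=0


def fp_add(a, b):
    """Add two normalized (man, exp) pairs following steps S1a to S9."""
    (a_man, a_exp), (b_man, b_exp) = a, b       # S1a, S1b
    a_man <<= GUARD_BITS
    b_man <<= GUARD_BITS
    e_cmp = a_exp - b_exp                        # S2: compare exponents
    c_exp = a_exp if e_cmp >= 0 else b_exp
    if e_cmp >= 0:                               # S3: align the smaller operand
        b_man >>= e_cmp                          # arithmetic shift keeps the sign
    else:
        a_man >>= -e_cmp
    c_man = a_man + b_man                        # S4: add mantissas
    if c_man == 0:                               # S5: exact zero
        return ZERO
    while c_man >= 2 * ONE or c_man < -2 * ONE:  # S6a: right shift, exponent + 1
        c_man >>= 1
        c_exp += 1
    while -ONE <= c_man < ONE:                   # S6b: left shift, exponent - 1
        c_man <<= 1
        c_exp -= 1
    if c_exp <= EXP_MIN:                         # S7, S8a: underflow -> zero
        return ZERO
    if c_exp > EXP_MAX:                          # S7, S8b: overflow -> saturate
        if c_man > 0:
            return ((2 << FRAC_BITS) - 1, EXP_MAX)
        return (-(2 << FRAC_BITS), EXP_MAX)
    return (c_man >> GUARD_BITS, c_exp)          # S9: normal result
```

With these assumptions, `fp_add((196608, 15), (131072, 14))` returns `(131072, 16)`, i.e. 1.50.times.2.sup.15 +1.00.times.2.sup.14 = 1.00.times.2.sup.16.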
(I), (II) and (III) of FIG. 3 show examples in which adding and subtracting calculations of floating-point numbers are executed by the process as described above. In FIG. 3, the left portion denotes several values in which each mantissa is represented in the binary notation. On the other hand, the right portion denotes the respectively corresponding values in which the above-mentioned mantissa is represented in the decimal notation.
In a first example of (I) of FIG. 3, in order to calculate 1.50.times.2.sup.15 +1.00.times.2.sup.14, after the adjustment process of two mantissas has been performed based on 2.sup.15 where the exponent thereof has a larger value, the respective mantissas are added together. Further, a result of the addition 2.00.times.2.sup.15 is normalized and an answer is finally obtained as 2.sup.16.
In a second example of (II) of FIG. 3, in order to calculate 1.00.times.2.sup.19 +(-2.00.times.2.sup.0), after the adjustment process of two mantissas has been performed based on 2.sup.19 where the exponent thereof has a larger value, the respective mantissas are added together. Further, a result of the addition (1-2.sup.-18).times.2.sup.19 is normalized and an answer is finally obtained as (2-2.sup.-17).times.2.sup.18.
In a third example of (III) of FIG. 3, in order to calculate (2-2.sup.-17).times.2.sup.-20 -(2-2.sup.-16).times.2.sup.-20, the respective mantissas are added together, and a result becomes 1.00.times.2.sup.-37. In other words, this value 1.00.times.2.sup.-37 means that an underflow occurs, and therefore all parts of the result are set at "0".
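The three results of FIG. 3 can be checked with exact rational arithmetic. The following is only a numerical check of the stated values, not a model of the hardware; the helper `p2` and the names `ex1` to `ex3` are hypothetical.

```python
from fractions import Fraction


def p2(k):
    """Exact power of two, 2**k, for any integer k."""
    return Fraction(2) ** k


# (I)   1.50 x 2^15 + 1.00 x 2^14
ex1 = Fraction(3, 2) * p2(15) + p2(14)
# (II)  1.00 x 2^19 + (-2.00 x 2^0)
ex2 = p2(19) + (-2 * p2(0))
# (III) (2 - 2^-17) x 2^-20 - (2 - 2^-16) x 2^-20
ex3 = (2 - p2(-17)) * p2(-20) - (2 - p2(-16)) * p2(-20)

# The smallest exponent of the six-bit offset format is -31, so a result
# of magnitude 2^-37 cannot be represented and is replaced by zero.
ex3_is_below_range = ex3 < p2(-31)
```

Here `ex1` equals 2.sup.16, `ex2` equals (2-2.sup.-17).times.2.sup.18, and `ex3` equals 2.sup.-37, which lies below the representable range.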
Further, some examples of the concrete construction of arithmetic systems using floating-point representation according to the prior art will be described, with reference to FIGS. 4 to 7.
FIG. 4 is a block diagram showing the construction of a first example of two-input-type floating-point adder-subtractors constituting the main part of a floating-point arithmetic system according to the prior art.
In FIG. 4, the construction of two-input-type floating-point adder-subtractors is illustrated in the case where addition and subtraction calculations of a large number of floating-point data (n kinds of floating-point data; n denotes an integer) are executed by utilizing a plurality of two-input-type floating-point adder-subtractors of the prior art. In this case, typically, a large number of two-input-type floating-point adder-subtractors 81-1, 81-2, . . . , 81-n are connected in cascade, for example, in a binary tree form (the number of whole stages is log.sub.2 n).
To be more specific, an addition and subtraction calculation of data D0 and D1 is executed by an adder-subtractor 81-1. Further, an addition and subtraction calculation of data D2 and D3 is executed by an adder-subtractor 81-5. Further, an addition and subtraction calculation of an output from the adder-subtractor 81-1 and an output from the adder-subtractor 81-5 is executed by an adder-subtractor 81-2, and then an addition and subtraction calculation of an output from the adder-subtractor 81-2 and an output from another adder-subtractor is executed by an adder-subtractor 81-3. Further, addition and subtraction calculations are executed repeatedly in a similar manner, so that accumulative addition and subtraction calculations can be performed and a result of the calculations can be finally output from an adder-subtractor 81-4.
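The hardware cost of such a binary tree can be illustrated with a short sketch that reduces n inputs pairwise and counts the two-input additions and the stages. The function name `tree_sum` and the use of ordinary Python addition in place of a floating-point adder-subtractor are assumptions made only for illustration.

```python
def tree_sum(values, add=lambda a, b: a + b):
    """Reduce values pairwise, as in a binary tree of two-input adders.

    Returns (result, number_of_stages, number_of_two_input_adders).
    """
    level = list(values)
    if not level:
        raise ValueError("at least one input is required")
    stages = 0
    adders = 0
    while len(level) > 1:
        nxt = []
        # Each pair of neighbours feeds one two-input adder of this stage.
        for i in range(0, len(level) - 1, 2):
            nxt.append(add(level[i], level[i + 1]))
            adders += 1
        if len(level) % 2:          # an odd input passes through unchanged
            nxt.append(level[-1])
        level = nxt
        stages += 1
    return level[0], stages, adders
```

For eight inputs the sketch uses seven adders in three stages (log.sub.2 8), which shows how the number of two-input units grows with the number of inputs and how each stage repeats its own alignment and normalization.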
FIG. 5 is a block diagram showing the construction of a second example (pipe-line type) of two-input-type floating-point adder-subtractors constituting the main part of a floating-point arithmetic system according to the prior art.
In FIG. 5, as often utilized in supercomputers, an arithmetic system of a pipe-line type is provided, in which the arithmetic operations are subdivided into many processes and each process is executed independently and sequentially, as if each process were an assembly line, in order to perform arithmetic operations at extremely high speed. To be more concrete, a two-input-type floating-point adder-subtractor 82, which is composed of a plurality of stages (k stages) divided in advance, is utilized. For example, in the case where accumulative addition and subtraction calculations regarding multiple-input data x.sub.1, x.sub.2, . . . , and x.sub.n are performed, the intermediate results, which are output from the two-input-type floating-point adder-subtractor 82, are returned to an input portion thereof, so that arithmetic operations of multiple-input data x.sub.1, x.sub.2, . . . , and x.sub.n are performed.
FIGS. 6 and 7 are block diagrams showing the constructions of first and second examples of multiplying and adding calculation (MAC) systems according to the prior art, including at least one floating-point adder-subtractor of the two-input-type.
In FIG. 6, a plurality of two-input-type floating-point adder-subtractors 91-6, 91-7 are connected in cascade. Further, by adequately combining these two-input-type floating-point adder-subtractors 91-6, 91-7, a plurality of multipliers 91-1, 91-3 and 91-5, and a plurality of delay units 91-2, 91-4, a multiplying and adding calculation system for executing an equation Y(z)=(W0+W1*z.sup.-1 +W2*z.sup.-2) D(z) can be constituted. This construction is a so-called multiplying and adding calculation system of an array type, in which an arithmetic calculation device is adapted to execute arithmetic operations of multiple-input data (three input data in FIG. 6) without control by a computer program, by sequentially operating these two-input-type floating-point adder-subtractors as the main constituents.
On the other hand, in FIG. 7, a pipe-line register 92-4 is connected to an output terminal of a two-input-type floating-point adder-subtractor 92-3. Further, another pipe-line register 92-2 is connected to an output terminal of a multiplier 92-1, so that a multiplying and adding calculation system for executing the same equation Y(z)=(W0+W1*z.sup.-1 +W2*z.sup.-2) D(z) as that shown in FIG. 6 can be constructed. This construction is a so-called multiplying and adding calculation system of a pipe-line type. Also, in such a calculation system, arithmetic operations on multiple-input data are executed under the control of a computer program, by means of such a two-input-type floating-point adder-subtractor 92-3. Further, in FIG. 7, the given input data Di is input via the corresponding delay unit, where the data Di is represented as Di=z.sup.-i *D(z).
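In the time domain, the equation Y(z)=(W0+W1*z.sup.-1 +W2*z.sup.-2) D(z) corresponds to y[n] = W0*d[n] + W1*d[n-1] + W2*d[n-2]. The following sketch mirrors the array-type arrangement with three multiplications, two delayed samples and two chained two-input additions per output. The function name `fir3` and the zero initial state of the delay units are assumptions of this sketch.

```python
def fir3(weights, samples):
    """Three-tap multiply-and-add: y[n] = W0*d[n] + W1*d[n-1] + W2*d[n-2]."""
    w0, w1, w2 = weights
    d1 = d2 = 0                      # delay units, assumed to start at zero
    out = []
    for d0 in samples:
        # Three multipliers operate on the current and delayed inputs.
        p0, p1, p2 = w0 * d0, w1 * d1, w2 * d2
        partial = p0 + p1            # first two-input adder
        y = partial + p2             # second two-input adder, in cascade
        out.append(y)
        d2, d1 = d1, d0              # advance the delay line by one sample
    return out
```

Feeding a unit impulse, `fir3((1, 2, 3), [1, 0, 0, 0])`, returns the weights followed by zero, `[1, 2, 3, 0]`.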
As described above, in a first example of a multiplying and adding calculation system according to the prior art (FIGS. 4 and 6) of an array type, a large number of two-input-type floating-point adder-subtractors have to be connected in cascade in order to perform arithmetic operations on multiple-input data. Therefore, a problem occurs that the necessary amount of hardware is likely to increase with an increase of the number of the input data.
Moreover, in such a multiplying and adding calculation device of this type, the arithmetic operations in the floating-point representation are executed by utilizing the construction in which a plurality of two-input-type adder-subtractors are arranged in a tree or array form. In such a construction, the process for adjusting the respective digit positions of mantissas of multiple-input data based on a result of comparison of exponents of the input data, the normalization process, and the like, which are essential for the arithmetic operations in the floating-point representation, are likely to be executed in such a manner that each process is often repeated among a plurality of two-input-type arithmetic units and is adequately dispersed among them. Therefore, another problem occurs that it becomes difficult for the arithmetic operations to be performed at very high speed.
On the other hand, in a second example of a multiplying and adding calculation system according to the prior art of a pipe-line type (FIGS. 5 and 7), a troublesome correcting process regarding the intermediate results must be executed in accordance with the number of stages of the pipe-line in an adder-subtractor. Therefore, still another problem occurs that a relatively long time is required for completing the arithmetic operations, especially, the accumulative addition and subtraction calculations.
In general, in such a pipe-line type arithmetic system, as seen in supercomputers, the process at each stage of a pipe-line composed of plural stages is made as simple as possible. By means of such a simplified process at each stage, a method of arithmetic operations, in which the time required for passing each stage is shortened and the number of the whole stages necessary for pipe-line type arithmetic operations (the length of a vector) increases, is usually adopted. When the accumulative adding and subtracting calculations are executed at the speed based on a pipe-line pitch determined by the above-mentioned length of a vector, a partial sum, which is obtained as a result of jump addition executed in the boundary between adjoining stages, is stored in each stage in a dispersed condition. Consequently, still another problem occurs that it becomes necessary for a troublesome supplementary process in the last procedure of arithmetic operations to be executed, in order to treat such a partial sum.
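The dispersed partial sums can be illustrated with a simplified model of a k-stage pipe-line adder whose output is fed back to its input. One input is accepted per pipe-line pitch, and each addition takes k pitches to emerge. The function name `pipelined_accumulate` and the use of integer addition are assumptions of this sketch, which ignores all floating-point details.

```python
from collections import deque


def pipelined_accumulate(xs, k):
    """Accumulate xs through a k-stage adder with output-to-input feedback.

    Returns (partial_sums_left_in_the_stages, final_total).
    """
    pipe = deque([0] * k)            # k stages, initially holding zeros
    for x in xs:
        fed_back = pipe.popleft()    # result leaving the last stage
        pipe.append(fed_back + x)    # new addition enters the first stage
    partials = list(pipe)
    # Supplementary process: the k dispersed partial sums still have to
    # be added together, which costs k - 1 further additions.
    return partials, sum(partials)
```

For the inputs 1 to 8 and k=3, the stages end up holding the partial sums 9, 12 and 15 (every third input), and only the supplementary additions produce the total 36.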
Further, more detailed description of the problem, concerning the operational speed in the first and second examples of the prior art, will be made with reference to FIGS. 8 and 9.
In FIG. 8, two processors 110, 111 have arithmetic units including two-input-type floating-point adder-subtractors in a cascade connection; and accumulators 120, 121 each temporarily storing results of arithmetic operations. Further, registers (abbreviated to "REG" in FIG. 8) 100, 101 for storing input data such as Y(k) for a short time are provided corresponding to the processors 110, 111, respectively. In this case, it is assumed that the input data is continuously transferred from the left portion. Further, in this case, it is assumed that the arithmetic operations of the equation Z(k)=W2*X2+W1*X1+Y(k) are executed by utilizing the above-mentioned two processors. The detailed process of the arithmetic operations is as follows.
First, at the time T0, a result of W1*X1 calculated by a first processor 110 is stored in an accumulator 120, and simultaneously data Y(k) is input to a first register 100.
Next, at the time T1, the data Y(k) is input from a first register 100 to a first processor 110, and this processor 110 executes arithmetic operations of W1*X1+Y(k) by adding Y(k) to the content of the accumulator.
Also, at the time T1, a result NA of the arithmetic operations of W1*X1+Y(k) is returned to the first register 100, and is transferred from the first register 100 to a second register 101.
At this time, a second processor 111 executes the arithmetic operations of W2*X2. A result calculated by the second processor 111 is stored in an accumulator 121. Further, at the time T2, the result NA is received from the second register 101 by the second processor 111, and this processor 111 executes arithmetic operations of W2*X2+NA by adding NA to the content of the accumulator.
Further, a result NB of the arithmetic operations of Z(k)=W2*X2+NA is returned to the second register 101, and all the arithmetic operations of Z(k)=W2*X2+W1*X1+Y(k) are completed.
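The sequence from T0 to T2 can be traced with a few lines of code. The variable names follow the reference numerals of FIG. 8, while the function name `two_processor_chain` and the use of plain numbers instead of floating-point data are assumptions of this sketch.

```python
def two_processor_chain(w1, x1, w2, x2, y):
    """Trace Z(k) = W2*X2 + W1*X1 + Y(k) through processors 110 and 111."""
    # T0: processor 110 stores W1*X1 in accumulator 120;
    #     Y(k) is input to register 100.
    acc120 = w1 * x1
    reg100 = y
    # T1: processor 110 adds Y(k) to accumulator 120, giving NA, which
    #     returns to register 100 and is transferred to register 101.
    #     Meanwhile processor 111 stores W2*X2 in accumulator 121.
    na = acc120 + reg100
    reg100 = na
    reg101 = reg100
    acc121 = w2 * x2
    # T2: processor 111 adds NA to accumulator 121, giving NB = Z(k),
    #     which returns to register 101.
    nb = acc121 + reg101
    reg101 = nb
    return na, nb
```

Each processor starts its addition only after the preceding processor has finished, so the time to obtain Z(k) grows with the number of terms.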
In such a process as described above with reference to FIG. 8, the accumulative addition and subtraction calculations of multiple-input data are executed by adequately dispersing them between two processors, in order to realize the method in which, after the content of a process in one arithmetic unit has been completed, the content of next process is started. Due to such a method, the time, that it takes to complete the arithmetic operations after the input of data is started, is increased as the number of input data is increased and the amount of necessary hardware is also increased.
To treat this disadvantage, it is deemed reasonable that the time interval for the input of data be shortened and that the overall operational time be reduced. In view of this, the case where the same arithmetic operations as those in FIG. 8 are executed by utilizing an arithmetic system of a pipe-line type will be explained with reference to FIG. 9.
In FIG. 9, two processors 140, 141, including two stages of two-input-type floating-point adder-subtractors of the pipe-line type (the same type as in FIG. 5), and pipe-line registers 150, 151 within the arithmetic system, are illustrated. Further, registers (abbreviated to "REG" in FIG. 9) 130, 131 for storing input data such as Y(k) for a short time are provided corresponding to the processors 140, 141, respectively.
In this case, a process of the arithmetic operations of Z(k)=W2*X2+W1*X1+Y(k) is executed only by a processor 140, while another process of the arithmetic operations of Z(k)=W4*X4+W3*X3+Y(k) is executed only by another processor 141. Accordingly, it seems that the arithmetic operations in FIG. 9 can be performed at higher speed than those in FIG. 8.
However, in FIG. 9, when the arithmetic operations are executed by the two processors 140, 141, it is necessary that, after the given data Y(k) is added to the data stored in the respective registers 130, 131, a result of the addition should be returned to the respective registers 130, 131. Therefore, even though each processor receives input data from the corresponding register and executes the required arithmetic operations, the processor cannot return the result of the arithmetic operations to the register until the time corresponding to two stages (T0-T2) has elapsed.
In other words, since data transfer speed of the registers is determined by the time required for completion of arithmetic operations of these arithmetic units of two stages (the latency time), the overall operational speed cannot be made higher than a value based on the above-mentioned time of these arithmetic units. Therefore, in the construction such that the data are input to the register or the data are output therefrom, it cannot be expected to increase operational speed based on the effect of a pipe-line, even when the processor of a pipe-line type is utilized.