1. Field of the Invention
The present invention relates generally to a hardware arrangement for performing floating-point multiplication and a method of operating the same. More specifically, the present invention relates to a hardware arrangement and operation method which enable an effective reduction of the execution time of floating-point multiplication with only a small increase in relatively simple hardware.
2. Description of the Prior Art
Programs considered "typical" by the user of high performance computers are floating-point oriented. However, the time required for execution of floating-point instructions is quite long compared with the time required for the issuance of the execution instructions by an instruction unit. There is therefore a demand to reduce the execution time for floating-point calculations without an undesirable increase in complicated hardware.
Before describing the present invention, a known technique for executing floating-point multiplication will be discussed with reference to FIGS. 1A, 1B and 2.
FIGS. 1A and 1B illustrate two kinds of floating-point data formats.
The format of FIG. 1A starts with a sign bit (SF) for the fraction, 0 being positive and 1 being negative. The second bit is a sign bit (Se) for the exponent, 0 being positive and 1 being negative. The subsequent six bit positions are occupied by exponent E. Fraction F consists of six hexadecimal digits (24 bits) in this case. The radix point of the fraction F is assumed to be immediately to the left of the high-order fraction digit. To provide the proper magnitude for the floating-point number, the fraction is considered to be multiplied by a power of the base 16. The exponent is treated as an excess 64 number with a range from -64 through +63 corresponding to the binary expression of the values 0-127. Therefore, the exponent value X is expressed by (-64·Se + E). A negative quantity of the fraction is indicated by 2's complement in the FIG. 1A format.
Similarly, in FIG. 1B, the first bit is the sign bit for the number as a whole and assumes 0 or 1 to indicate that the number is positive or negative, respectively. The subsequent seven bit positions are occupied by exponent E. Fraction F is indicated by a true number. To provide the proper magnitude for the floating-point number, the fraction is multiplied by a power of the base 16.
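The FIG. 1B layout described above can be illustrated with a minimal Python sketch. The function name decode_fig_1b is hypothetical, and the sketch assumes a 32-bit word with the sign in the most significant bit, a seven-bit excess 64 exponent, and a 24-bit fraction (the six-digit fraction width and the excess 64 treatment are stated above only for FIG. 1A and are carried over here as assumptions).

```python
def decode_fig_1b(word: int) -> float:
    """Decode a 32-bit word assumed to follow the FIG. 1B layout."""
    sign = (word >> 31) & 0x1        # 0 = positive, 1 = negative
    exponent = (word >> 24) & 0x7F   # seven-bit exponent, assumed excess 64
    fraction = word & 0xFFFFFF       # 24-bit fraction, held as a true magnitude
    # The radix point sits to the left of the high-order fraction digit,
    # so the fraction is scaled by 16**-6 before applying the power of 16.
    magnitude = (fraction / 16**6) * 16.0 ** (exponent - 64)
    return -magnitude if sign else magnitude
```

For instance, under these assumptions the word 0x41100000 carries exponent 65 and fraction 1/16, and so decodes to 1.0.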
If two normalized floating-point numbers having the format shown in FIG. 1A or 1B are multiplied, the result can be normalized by shifting it one digit to the left or right.
In more detail, if each of two normalized floating-point numbers has the format shown in FIG. 1A, the result of the multiplication is normalized by one digit shifting to the right as follows. That is to say, if ##EQU1## is multiplied by ##EQU2## then the result is ##EQU3## It should be noted that the leading bit "1" in the fraction of the result does not have to be stored and is assumed to be present. The result can be normalized by one digit shifting to right as follows: ##EQU4##
Further, if two normalized data are multiplied, the minimum value of the result occurs in the following cases (a) to (c): (a) ##EQU5## is multiplied by ##EQU6## in both of the formats shown in FIGS. 1A and 1B; (b) ##EQU7## is multiplied by ##EQU8## in the format shown in FIG. 1A; and (c) ##EQU9## is multiplied by ##EQU10## in the format shown in FIG. 1A.
The result obtained in the above-mentioned case (a) is ##EQU11## and is normalized by one digit shifting to the left as follows: ##EQU12##
Similarly, the result of each of the cases (b) and (c) is normalized by one digit shifting to the left although not shown for brevity.
In summary, in the event that two normalized operands are multiplied, shifting by a single digit to the left or the right is sufficient for normalizing the result. However, in order to handle the multiplication of unnormalized operands, although it rarely occurs, a shifter which can perform a large amount of shifting is required for normalizing the result. In other words, according to the above-mentioned known technique, a shifter having the capacity to perform a large amount of shifting must be provided. Consequently, the conventional technique has encountered drawbacks in that execution time is wasted by shifting when it occurs and in that the configuration of the shifter itself is highly complicated.
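The difference between normalized and unnormalized operands can be illustrated with a short Python sketch. It is not the arrangement of the prior art itself: the function multiply, the constants FRACTION_DIGITS and EXCESS, and the representation of an operand as a (sign, exponent, fraction) tuple are all hypothetical, and the sketch assumes true-magnitude fractions of six hexadecimal digits as in FIG. 1B, so only left shifts arise. Exponent overflow and underflow are ignored.

```python
FRACTION_DIGITS = 6   # hexadecimal digits per fraction (assumption)
EXCESS = 64           # exponent bias (assumption for the FIG. 1B format)

def multiply(a, b):
    """Multiply two (sign, biased_exponent, fraction) operands.

    Returns the normalized (sign, biased_exponent, fraction) result and the
    number of one-digit left shifts that normalization required.
    """
    sign_a, exp_a, frac_a = a
    sign_b, exp_b, frac_b = b
    sign = sign_a ^ sign_b
    exponent = exp_a + exp_b - EXCESS      # remove the doubled bias
    product = frac_a * frac_b              # up to twelve hexadecimal digits
    if product == 0:
        return (0, 0, 0), 0
    total_bits = 4 * 2 * FRACTION_DIGITS
    shifts = 0
    # Shift left one hexadecimal digit at a time until the top digit is non-zero.
    while (product >> (total_bits - 4)) == 0:
        product <<= 4
        shifts += 1
    exponent -= shifts                     # each digit shift lowers the exponent by one
    fraction = product >> (4 * FRACTION_DIGITS)   # truncate to six digits
    return (sign, exponent, fraction), shifts
```

With the normalized operands (0, 65, 0x100000) and (0, 65, 0x200000), a single shift suffices. Replacing the first operand with the unnormalized (0, 66, 0x010000), which denotes the same value, requires two shifts, and the count grows with the number of leading zero digits in the operands.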
The above-mentioned prior art will further be discussed with reference to FIG. 2.
FIG. 2 is a block diagram of data flows for the execution of floating-point multiplication. The execution is performed under the control of a microprogram and is functionally grouped into five stages. In the first stage, microinstructions, which are stored in the micromemory 2, are read out into an instruction register 3. In the second stage, a multiplier and a multiplicand are derived from a suitable memory (not shown). These input operands are stored in the registers 6, 8 and then multiplied in the third stage. Following this, the product is normalized in the fourth stage. The normalized result is then written into a general register 23 in the fifth and final stage.
Further description will be given in connection with the block diagram shown in FIG. 2. In the illustrated arrangement, microinstructions are sequentially derived from the micromemory 2 in response to instruction addresses applied thereto from an instruction address register 1. The microinstructions thus read out are stored in the instruction register 3. Microinstruction registers 4, 5 are assigned to the third and fourth execution stages, respectively.
The multiplier and the multiplicand are respectively stored in registers 6, 8 in the third stage. It is assumed, in this particular embodiment, that each of the input operands has the format shown in FIG. 1B. Therefore, if the two operands are normalized before being applied to the registers 6 and 8, shifting by one digit to the left is sufficient for normalization and, in the event of the occurrence of a carry, no shifting is required for normalization. On the other hand, in the event that at least one of the two input operands is unnormalized, shifting of the product to the left by more than two digits is necessary for normalization. The input operands stored in the registers 6 and 8 are applied to a multiple number generator 10. A multiplier 12 receives the outputs of the multiple number generator 10 and applies partial results to registers 13 and 14. These in turn apply the partial results stored therein to an adder 15 and also back to the multiplier 12.
The adder 15 applies the output thereof to a shifter 20 via a register 16 and also to a detector 17 wherein the amount of shifting for normalization is determined. The amount of shifting, detected by the detector 17, is stored in a register 18. The shifter 20 normalizes the result applied thereto from the adder 15 by shifting it to the left according to the value stored in the register 18.
Since each fraction of the input operands consists of four hexadecimal digits in this particular embodiment, the amount of shifting to the left for normalization ranges from one to a maximum of ten digits. This means the shifter 20 must be configured to meet a wide range of shifting requirements. Accordingly, it becomes highly complex in configuration, and requires slower control clocks. The latter requirement of course degrades performance characteristics.
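The roles of the detector 17 and the shifter 20 can be modeled in software by the following minimal Python sketch. The function names detect_shift and shift_left and the parameter width_digits are hypothetical, the field width is left as a parameter rather than taken from the embodiment, and the sketch says nothing about how the hardware is actually constructed.

```python
def detect_shift(value: int, width_digits: int) -> int:
    """Model of the detector: count leading zero hexadecimal digits
    of value within a field of width_digits digits."""
    for count in range(width_digits):
        top_pos = width_digits - 1 - count
        if (value >> (4 * top_pos)) & 0xF:
            return count
    return width_digits  # value is zero: every digit is a leading zero

def shift_left(value: int, amount: int, width_digits: int) -> int:
    """Model of the shifter: shift value left by amount hexadecimal
    digits, discarding anything that leaves the field."""
    mask = (1 << (4 * width_digits)) - 1
    return (value << (4 * amount)) & mask
```

In this model the shifter must accept any amount the detector can report, from zero up to the full field width, which mirrors the wide shifting range that the shifter 20 has to support.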
Moreover, in order to effectively store the result into the general register 23, a buffer register 30 has to be provided between the shifter 20 and the general register 23. This further complicates the hardware configuration.
Accordingly, since most of the input operands are normalized as mentioned above, it is undesirable to have to place such a large shifter in the main execution path when it is only infrequently used.