U.S. patent application Ser. No. 10/930,129 (Schwarz et al.) “Decimal Rounding Mode which Preserves Data Information For Further Rounding to Less Precision” filed Aug. 31, 2004 and incorporated herein by reference describes 1) using a new rounding mode called “round for reround” on the original arithmetic instruction in the hardware precision, and then 2) invoking an instruction, which we have called the ReRound instruction, that specifies a variable rounding precision and possibly explicitly sets the rounding mode. The precise result of the arithmetic operation is first truncated to the hardware format precision “p”, forming an intermediate result. If only zeros are dropped during truncation, then the intermediate result is equal to the precise result, and this result is said to be “exact”; otherwise, it is “inexact”. When the intermediate result is inexact and its least significant digit is either zero or five, then that digit is incremented to one or six respectively, forming the rounded result. Thus, when the least significant digit of a rounded result is zero or five, the result could be construed to be exact or exactly halfway between two machine representations if it were later rounded to one less digit of precision. For all other values, it is obvious that the result is inexact and not halfway between two machine representations for later roundings to fewer than “p” digits of precision. A nice mathematical property of this rounding mode is that results stay ordered, and in a hardware implementation it is guaranteed that the incrementation of the least significant digit does not cause a carry into the next digit of the result.
In the Schwarz application a first requirement is to create an instruction which rounds to a variable, user specified precision, which we call the “ReRound” instruction. The second requirement is that the original arithmetic operation in the higher precision somehow maintains information about the infinitely precise intermediate result. This information is used to prevent incorrect double rounding and enables the hardware to construct an equivalent operand which, when rounded to a smaller precision using the ReRound instruction, produces the same result as rounding the original infinitely precise operand. Prior methods for maintaining this information about the infinitely precise result have included recording in a status word whether the rounded target is inexact, and in a few cases some architectures have also provided a bit indicating whether it was rounded up. This allows rounding of a “p” digit result to “p−1” or fewer digits of precision. One other method previously mentioned is to round to only “((p/2)−1)” digits, where “p” is the precision of the target of an arithmetic operation (i.e. 7, 16 or 34 digits depending on the hardware format chosen). Limiting the rounding capabilities to less than half the machine precision is severely restrictive, and using the status word to maintain the additional information creates a bottleneck for performance.
The Schwarz application eliminates the performance bottleneck of updating and reading the floating-point status word of prior applications and provides the capability of secondary roundings up to “p−1” digits of precision where the first rounding was to “p” digits of precision. The mechanism for providing this information is a new rounding mode which maintains the information within the result of the first rounding, which was to the hardware format precision. This rounding mode creates a result which, when rounded to “p−1” or fewer digits of precision, rounds to the same value as the original infinitely precise result. By doing this, the extra information is contained completely within the operand and there is no bottleneck in using the floating-point status word. And given that the information is contained within the operand, multiple independent operations can be placed in between these two instructions (the original arithmetic instruction to hardware precision and the subsequent rerounding to lesser precision).
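The equivalence claimed above can be checked empirically with a short Python sketch. It is not part of the Schwarz application and rests on several assumptions: Python's `decimal.ROUND_05UP` is used as a stand-in for “round for reround”, the precision `P = 16`, the list `FINAL_MODES`, and the helpers `to_precision`, `random_precise_value` and `count_mismatches` are all hypothetical names chosen for illustration.

```python
import random
from decimal import (Context, Decimal, ROUND_05UP, ROUND_CEILING, ROUND_DOWN,
                     ROUND_FLOOR, ROUND_HALF_DOWN, ROUND_HALF_EVEN,
                     ROUND_HALF_UP, ROUND_UP)

P = 16  # assumed hardware format precision "p"
FINAL_MODES = [ROUND_HALF_UP, ROUND_HALF_DOWN, ROUND_HALF_EVEN,
               ROUND_UP, ROUND_DOWN, ROUND_CEILING, ROUND_FLOOR]


def to_precision(value, digits, mode):
    """Round `value` to `digits` significant digits in the given mode."""
    return Context(prec=digits, rounding=mode).plus(value)


def random_precise_value(rng):
    """Build a signed value with 1 to 24 digits, biased toward 0s and 5s."""
    n = rng.randint(1, 24)
    digits = [rng.randint(1, 9)]
    digits += [int(rng.choice("0000555519")) for _ in range(n - 1)]
    return Decimal((rng.randint(0, 1), tuple(digits), -(n - 1)))


def count_mismatches(first_mode, trials=500, seed=0):
    """Count rerounds that differ from rounding the precise value directly."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(trials):
        precise = random_precise_value(rng)
        stored = to_precision(precise, P, first_mode)  # first rounding, to p digits
        for q in range(1, P):  # second rounding, to p-1 digits or fewer
            for mode in FINAL_MODES:
                if to_precision(stored, q, mode) != to_precision(precise, q, mode):
                    mismatches += 1
    return mismatches


if __name__ == "__main__":
    print("round for reround first:", count_mismatches(ROUND_05UP))
    print("round toward zero first:", count_mismatches(ROUND_DOWN))
```

With the assumed stand-in mode as the first rounding, no mismatches are expected for any second precision from 1 to p−1 digits; with plain truncation as the first rounding, mismatches appear. Nothing in the check reads a status word, since all the needed information travels in the stored operand.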
The Schwarz application provides a new rounding mode called “round for reround”. The precise result of the arithmetic operation is first truncated to the hardware format precision “p”, forming an intermediate result. If only zeros are dropped during truncation, then the intermediate result is equal to the precise result, and this result is said to be “exact”, otherwise, it is “inexact”. When the intermediate result is inexact and its least significant digit is either zero or five, then that digit is incremented to one or six respectively forming the rounded result. Thus, when the least significant digit of a rounded result is zero or five the result could be construed to be exact or exactly halfway between two machine representations if it were later rounded to one less digit of precision. For all other values, it is obvious that the result is inexact and not halfway between two machine representations for later roundings to fewer than “p” digits of precision. A nice mathematical property of this rounding mode is that results stay ordered and in a hardware implementation it is guaranteed that the incrementation of the least significant digit does not cause a carry into the next digit of the result.
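The truncate-then-adjust rule described above can be sketched in Python with the standard `decimal` module. This is an illustrative sketch under stated assumptions, not the hardware implementation of the Schwarz application: the function name `round_for_reround` and its parameters are hypothetical, and operands are assumed to be finite.

```python
from decimal import Context, Decimal, ROUND_DOWN


def round_for_reround(precise: Decimal, p: int) -> Decimal:
    """Round a finite `precise` value to p digits using the rule described above.

    Hypothetical helper for illustration only.
    """
    # Step 1: truncate to p significant digits (round toward zero).
    intermediate = Context(prec=p, rounding=ROUND_DOWN).plus(precise)

    # Step 2: if only zeros were dropped, the intermediate result is exact.
    if intermediate == precise:
        return intermediate

    # Step 3: inexact. A least significant digit of 0 or 5 becomes 1 or 6.
    # Neither increment reaches 10, so no carry can enter the next digit.
    sign, digits, exponent = intermediate.as_tuple()
    if digits[-1] in (0, 5):
        digits = digits[:-1] + (digits[-1] + 1,)
    return Decimal((sign, digits, exponent))


if __name__ == "__main__":
    for text in ("1.2000001", "1.2500001", "1.2300001", "1.2500000"):
        print(text, "->", round_for_reround(Decimal(text), 3))
```

Python's `decimal` module also offers a built-in mode, `ROUND_05UP`, documented as rounding away from zero when the last digit after rounding toward zero would have been 0 or 5. That behavior appears to correspond to the rule sketched here, although the module does not refer to the Schwarz application.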
An example of the problem is shown when one wishes to multiply two operands in a 16 digit hardware format but later round the answer to 15 digits in a rounding mode where the operand is rounded to the nearest representable number in the target format and in case of a tie is rounded to the lower magnitude. (One could also call this rounding mode “round half down”.)
In the example, suppose the decimal multiply produces an intermediate product of, say, 1.23456789012344500111.
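If the decimal multiply were rounded toward zero, the 16 digit result would be 1.234567890123445, and then applying an instruction to reround to 15 digits would yield 1.23456789012344, which is a wrong result. The precise product lies slightly above the halfway point between the two 15 digit neighbors, so the correct 15 digit result is 1.23456789012345.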
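The double rounding in this example can be reproduced with Python's standard `decimal` module. The sketch below is illustrative only: the helper name `two_step` is hypothetical, and `ROUND_05UP` is used as an assumed stand-in for the “round for reround” mode of the Schwarz application.

```python
from decimal import Context, Decimal, ROUND_05UP, ROUND_DOWN, ROUND_HALF_DOWN

# Precise intermediate product taken from the example above.
PRODUCT = Decimal("1.23456789012344500111")


def two_step(first_mode):
    """Round to 16 digits in `first_mode`, then reround to 15 digits half down."""
    hardware = Context(prec=16, rounding=first_mode).plus(PRODUCT)
    rerounded = Context(prec=15, rounding=ROUND_HALF_DOWN).plus(hardware)
    return hardware, rerounded


# Reference: a single rounding of the precise product to 15 digits.
direct = Context(prec=15, rounding=ROUND_HALF_DOWN).plus(PRODUCT)

if __name__ == "__main__":
    print("direct rounding      :", direct)                # 1.23456789012345
    print("toward zero, reround :", two_step(ROUND_DOWN))  # ends in ...344 (wrong)
    print("05UP, then reround   :", two_step(ROUND_05UP))  # ends in ...345 (matches)
```

Truncation turns the product into an apparent exact tie, which round half down then resolves downward. Under the assumed stand-in mode the stored 16 digit value ends in 6 rather than 5, so the second rounding still sees a value above the halfway point.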
U.S. Pat. No. 4,823,260 (to Imel et al.) “MIXED-PRECISION FLOATING POINT OPERATIONS FROM A SINGLE INSTRUCTION OPCODE” filed Nov. 12, 1987 and incorporated herein by reference provides for performing mixed precision calculations in the floating point unit of a microprocessor from a single instruction opcode. 80-bit floating-point registers may be specified as the source or destination address of a floating-point instruction. When the address range of the destination indicates that a floating point register is addressed, the result of that operation is not rounded to the precision specified by the instruction, but is rounded to extended 80-bit precision and loaded into the floating point register. When the address range of the source indicates that an FP register is addressed, the data is loaded from the FP register in extended precision, regardless of the precision specified by the instruction. In this way, real and long-real operations can be made to use extended precision numbers without explicitly specifying that in the opcode.
The Intel iAPX 286/20 Numeric Data Processor (NDP) has a floating point instruction set that supports the IEEE Microprocessor Floating Point Standard P754. The NDP has eight 80-bit floating point registers which provide a capacity equivalent to forty 16-bit registers. Two 16-bit registers control and report the results of numeric instructions. A control word register defines the rounding, infinity, precision, and error-mask controls required by the IEEE standard. In order to accommodate extended-precision floating point calculations, the NDP supports 32-bit, 64-bit, and 80-bit real values. The 80-bit real values are used internally by the eight 80-bit floating point registers for extremely high precision calculations. Implementing this arithmetic capability requires a separate opcode for each instruction which specifies a floating-point data type. This results in a number of separate opcodes in order to achieve all possible combinations of floating-point data types. Extra conversion instructions are necessary to convert and round the extended real result to the desired destination format, with double rounding. It is desirable to reduce the number of floating point operations in order to simplify the programming and increase the performance of floating-point operations.
The Imel patent provides an apparatus for performing a number of kinds of mixed precision calculations in the floating point unit of a microprocessor utilizing a single instruction opcode.
U.S. Pat. No. 6,108,772 “METHOD AND APPARATUS FOR SUPPORTING MULTIPLE FLOATING POINT PROCESSING MODELS” filed Jun. 28, 1996 and incorporated herein by reference discloses a numerical processing method on a computer system in which an instruction having at least one operand and a type control is retrieved, and the operand is converted to a precision specified by the type control. The instruction is executed in the precision specified by the type control to obtain a result, and when the destination precision differs from the precision specified by the type control, the result is converted to the destination precision using a second instruction.
A method is needed that permits rounding decimal floating point numbers to a variable precision and yields a precise result.