1. Field of the Invention
This invention relates to the field of microprocessors and, more particularly, to the handling of rounding modes within floating point units of microprocessors.
2. Description of the Related Art
Superscalar microprocessors achieve high performance by executing multiple instructions per clock cycle and by choosing the shortest possible clock cycle consistent with the design. As used herein, the term "clock cycle" refers to an interval of time accorded to various stages of an instruction processing pipeline within the microprocessor. Storage devices (e.g. registers and arrays) capture their values according to the clock cycle. For example, a storage device may capture a value according to a rising or falling edge of a clock signal defining the clock cycle. The storage device then stores the value until the subsequent rising or falling edge of the clock signal, respectively. The term "instruction processing pipeline" is used herein to refer to the logic circuits employed to process instructions in a pipelined fashion. Generally speaking, a pipeline comprises a number of stages at which portions of a particular task are performed. Different stages may simultaneously operate upon different items, thereby increasing overall throughput. Although the instruction processing pipeline may be divided into any number of stages at which portions of instruction processing are performed, instruction processing generally comprises fetching the instruction, decoding the instruction, executing the instruction, and storing the execution results in the destination identified by the instruction.
Microprocessors are configured to operate upon various data types in response to various instructions. For example, certain instructions are defined to operate upon an integer data type. The bits representing an integer form the digits of the number. The decimal point is assumed to be to the right of the digits (i.e. integers are whole numbers). Another data type often employed in microprocessors is the floating point data type. Floating point numbers are represented by a significand and an exponent. The base for the floating point number is raised to the power of the exponent and multiplied by the significand to arrive at the number represented. While any base may be used, base 2 is common in many microprocessors. The significand comprises a number of bits used to represent the most significant digits of the number. Typically, the significand comprises one bit to the left of the decimal point, and the remaining bits to the right of the decimal point. The bit to the left of the decimal point is not explicitly stored; instead, it is implied in the format of the number. Generally, the exponent and the significand of the floating point number are stored. Additional information regarding floating point numbers and operations performed thereon may be obtained in the Institute of Electrical and Electronics Engineers (IEEE) standard 754.
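The stored fields described above can be illustrated by unpacking the bits of an IEEE 754 double precision value. The sketch below (an illustration, not part of the invention) extracts the sign, the biased exponent, and the stored significand bits; note that the implied leading bit is absent from the stored significand:

```python
import struct

def decompose_double(x):
    """Split an IEEE 754 double into its sign bit, biased exponent,
    and stored significand bits (the implied leading 1 is not stored)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    biased_exponent = (bits >> 52) & 0x7FF
    significand = bits & ((1 << 52) - 1)
    return sign, biased_exponent, significand

# 1.5 is 1.1 (binary) x 2^0; the double precision exponent bias is 1023,
# so the stored biased exponent is 1023 and the stored significand is
# the fraction bits ".1" (top fraction bit set).
sign, e, frac = decompose_double(1.5)
```

The implied bit is what the text calls the bit "to the left of the decimal point": for any normalized number it is always 1, so storing it would be redundant.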
Floating point numbers can represent numbers within a much larger range than can integer numbers. For example, a 32 bit signed integer can represent the integers between -2^31 and 2^31-1, when two's complement format is used. A single precision floating point number as defined by IEEE 754 comprises 32 bits (a one bit sign, an 8 bit biased exponent, and 23 stored significand bits, yielding 24 significand bits when the implied bit is counted) and has a range from 2^-126 to 2^127 in both positive and negative numbers. A double precision (64 bit) floating point value has a range from 2^-1022 to 2^1023 in both positive and negative numbers. Finally, an extended precision (80 bit) floating point number has a range from 2^-16382 to 2^16383 in both positive and negative numbers.
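The integer and double precision extremes cited above can be checked against a runtime that implements IEEE 754 (as Python's `float` does on common platforms). The largest finite double is slightly less than 2^1024, i.e. (2 - 2^-52) x 2^1023, consistent with a maximum exponent of 1023:

```python
import sys

# 32 bit two's complement integer range: -2^31 .. 2^31 - 1.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

# Double precision extremes: the smallest positive normal number is
# 2^-1022, and the largest finite value is (2 - 2^-52) * 2^1023.
SMALLEST_NORMAL = sys.float_info.min
LARGEST_FINITE = sys.float_info.max
```

This is only a spot check of the ranges discussed; the extended precision (80 bit) format is not directly accessible from Python.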
The expanded range available using the floating point data type is advantageous for many types of calculations in which large variations in the magnitude of numbers can be expected, as well as in computationally intensive tasks in which intermediate results may vary widely in magnitude from the input values and output values. Still further, greater precision may be available in floating point data types than is available in integer data types.
Floating point data types produce challenges for the microprocessor designer. For example, an arithmetic operation between two floating point numbers may produce a value which is within the floating point numerical range, but cannot be exactly represented within the floating point data type format. Therefore, the result must be rounded to a representable number. Generally speaking, rounding refers to selecting a number which is representable in the target data type format (e.g. single, double, or extended precision) to be the result of a calculation in place of the exact result when the exact result is not representable in the target data format. The rounding may be accomplished in a number of ways. For example, the result can be truncated to fit in the target data format. Alternatively, the nearest representable number to the actual result may be chosen (whether that number is numerically higher or lower than the actual result). Additional alternative rounding modes include rounding up to a numerically larger number, or rounding down to a numerically smaller number. Many other types of rounding modes may be used, including rounding to a lesser precision (i.e. fewer bits of significand), etc.
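The effect of the rounding modes listed above can be demonstrated with Python's `decimal` module, which (unlike the hardware unit under discussion) lets a program select the rounding mode per context. This is a sketch for illustration only; the quotient 1/3 is not exactly representable in four digits, so each mode selects a different representable result:

```python
from decimal import (Decimal, localcontext,
                     ROUND_FLOOR, ROUND_CEILING, ROUND_DOWN, ROUND_HALF_EVEN)

def divide(a, b, rounding, digits=4):
    """Round the exact quotient a/b to `digits` significant digits
    under the requested rounding mode."""
    with localcontext() as ctx:
        ctx.prec = digits
        ctx.rounding = rounding
        return Decimal(a) / Decimal(b)

divide(1, 3, ROUND_FLOOR)      # round down (toward minus infinity)
divide(1, 3, ROUND_CEILING)    # round up (toward plus infinity)
divide(1, 3, ROUND_DOWN)       # truncate (toward zero)
divide(1, 3, ROUND_HALF_EVEN)  # round to nearest
```

For a positive quotient, truncation and rounding toward minus infinity agree; for a negative quotient they differ, which is why they are distinct modes.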
Instead of choosing only one rounding mode, which may not serve the needs of all users, microprocessors typically allow the user to select the rounding mode. A control word is defined for the microprocessor, and a field within the control word comprises the rounding mode. A special control word update instruction is provided to allow the user to update the control word, including the rounding mode. Generally speaking, the control word stores a number of fields regarding the operating state of the floating point unit. A precision control field may be included, indicating the precision of the results (single, double, or extended). Additionally, exception masking may be stored in the control word. Floating point calculations can produce a variety of exceptions (i.e. conditions which may require corrective action such as discarding the instruction stream and refetching or trapping to a software routine for diagnosis). If a particular exception is unimportant to a particular user, then that exception can be masked in the control word.
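As a concrete example of such a control word, the x87 floating point control word packs the fields described above into a 16 bit register: bits 0-5 are the exception masks, bits 8-9 the precision control, and bits 10-11 the rounding control. The bit positions below follow the documented x87 layout; the decoder itself is only an illustrative sketch:

```python
# Rounding control (bits 10-11) and precision control (bits 8-9)
# encodings of the x87 floating point control word.
ROUNDING = {0b00: "round to nearest", 0b01: "round down",
            0b10: "round up", 0b11: "truncate (round toward zero)"}
PRECISION = {0b00: "single", 0b10: "double", 0b11: "extended"}

def decode_control_word(cw):
    """Decode the rounding, precision, and exception mask fields."""
    return {
        "exception_masks": cw & 0x3F,  # IM, DM, ZM, OM, UM, PM
        "precision": PRECISION.get((cw >> 8) & 0x3),
        "rounding": ROUNDING[(cw >> 10) & 0x3],
    }

# 0x037F is the x87 power-on default: all six exceptions masked,
# extended precision, round to nearest.
default = decode_control_word(0x037F)
```

Changing the rounding mode thus amounts to rewriting two bits of this word, yet, as described next, the update instruction serializes the entire pipeline.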
Since changing the control word can cause changes in the behavior of the floating point instructions, microprocessors typically "serialize" the pipeline upon the control word update instruction. An instruction is serialized if its execution is delayed until all previous instructions have cleared the instruction processing pipeline and all subsequent instructions are delayed until the instruction clears the instruction processing pipeline. Serialization is therefore a performance-degrading operation.
Unfortunately, certain algorithms rely on repeatedly changing the rounding mode of the floating point unit. For example, interval arithmetic is often used to compute an upper and lower bound of a correct result, given that the exact result often cannot be represented. For interval arithmetic, the rounding mode is different when computing the upper bound than when computing the lower bound. Therefore, the performance of interval arithmetic suffers because changing the rounding mode causes serialization. Other algorithms which frequently change the rounding mode suffer similar performance degradation.
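The interval arithmetic pattern above can be sketched as follows, again using Python's `decimal` module as a stand-in for the hardware rounding control. The lower bound is rounded toward minus infinity and the upper bound toward plus infinity so that the true result is always enclosed; each operation therefore requires the mode switch that, on the hardware described, triggers serialization:

```python
from decimal import Decimal, localcontext, ROUND_FLOOR, ROUND_CEILING

def interval_add(a_lo, a_hi, b_lo, b_hi, digits=4):
    """Add two intervals [a_lo, a_hi] + [b_lo, b_hi], rounding the
    lower bound down and the upper bound up so the exact sum is
    guaranteed to lie within the returned interval."""
    with localcontext() as ctx:
        ctx.prec = digits
        ctx.rounding = ROUND_FLOOR     # mode switch #1: round down
        lo = Decimal(a_lo) + Decimal(b_lo)
        ctx.rounding = ROUND_CEILING   # mode switch #2: round up
        hi = Decimal(a_hi) + Decimal(b_hi)
    return lo, hi

# With 4 significant digits, 1.001 + 2.0005 = 3.0015 is not
# representable, so the two bounds round in opposite directions.
lo, hi = interval_add("1.001", "1.001", "2.0005", "2.0005")
```

On a floating point unit that serializes on every control word update, each such interval operation would drain the pipeline twice, which is the performance problem this background motivates.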