1. Field of the Invention
The present invention generally relates to computer systems and, more particularly, to a method of processing denormalized floating-point numbers.
2. Description of the Related Art
The basic structure of a conventional computer system 10 is shown in FIG. 1. The heart of computer system 10 is a central processing unit (CPU) or processor 12 which is connected to several peripheral devices, including input/output (I/O) devices 14 (such as a display monitor and keyboard) for the user interface, a permanent memory device 16 (such as a hard disk or floppy diskette) for storing the computer's operating system and user programs, and a temporary memory device 18 (such as random-access memory or RAM) that is used by processor 12 to carry out program instructions. Processor 12 communicates with the peripheral devices by various means, including a bus 20 or a direct channel 22. Computer system 10 may have many additional components which are not shown, such as serial and parallel ports for connection to, e.g., modems or printers. Those skilled in the art will further appreciate that there are other components that might be used in conjunction with those shown in the block diagram of FIG. 1; for example, a display adapter connected to processor 12 might be used to control a video display monitor, and a memory controller may be used as an interface between temporary memory device 18 and processor 12. Computer system 10 also includes firmware 24 whose primary purpose is to seek out and load an operating system from one of the peripherals (usually permanent memory device 16) whenever the computer is first turned on.
A processor can perform arithmetic operations on different types of numbers, or operands. For example, the simplest operations involve integer operands, which are represented using a "fixed-point" notation. Non-integers are typically represented according to a "floating-point" notation. Standard number 754 of the Institute of Electrical and Electronics Engineers (IEEE) sets forth particular formats which are used in most modern computers for floating-point operations. For example, a "single-precision" floating-point number is represented using a 32-bit (one-word) field, and a "double-precision" floating-point number is represented using a 64-bit (two-word) field. Most processors handle floating-point operations with a floating-point unit (FPU).
Floating-point notation (also referred to as exponential notation) can be used to represent both very large and very small numbers. A floating-point notation has three parts: a mantissa (or significand), an exponent, and a sign (positive or negative). The mantissa specifies the digits of the number, and the exponent specifies the magnitude of the number, i.e., the power of the base which is to be multiplied with the mantissa to generate the number. For example, using base 10, the number 28330000 would be represented as 2833E+4, and the number 0.054565 would be represented as 54565E-6. Since processors use binary values, floating-point numbers in computers use 2 as a base (radix). Thus, a floating-point number may generally be expressed in binary terms according to the form
$$n = (-1)^{S} \times 1.F \times 2^{E},$$
where n is the floating-point number (in base 10), S is the sign of the number (0 for positive or 1 for negative), F is the fractional component of the mantissa (in base 2), and E is the exponent of the radix. In accordance with IEEE standard 754, a single-precision floating-point number uses the 32 bits as follows: the first bit indicates the sign (S), the next eight bits indicate the exponent offset by a bias amount of 127 (E+bias), and the last 23 bits indicate the fraction (F). So, for example, the decimal number ten would be represented by the 32-bit value
0 10000010 01000000000000000000000
as this corresponds to $(-1)^{0} \times 1.01_{2} \times 2^{130-127} = 1.25 \times 2^{3} = 10$.
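The field layout described above can be checked with a short sketch. It is illustrative only: the helper names `decode_single` and `value_of_normalized` are hypothetical and do not appear in the source, and the code relies on Python's standard `struct` module to obtain the single-precision bit pattern.

```python
import struct

def decode_single(x):
    """Split a value into its single-precision fields: sign, biased exponent, fraction."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # first bit
    biased_exp = (bits >> 23) & 0xFF  # next eight bits (E + 127)
    fraction = bits & 0x7FFFFF        # last 23 bits
    return sign, biased_exp, fraction

def value_of_normalized(sign, biased_exp, fraction, bias=127):
    """Evaluate (-1)^S * 1.F * 2^E for a normalized single-precision number."""
    significand = 1 + fraction / 2**23  # implicit leading 1 plus the fraction
    return (-1) ** sign * significand * 2.0 ** (biased_exp - bias)

sign, biased_exp, fraction = decode_single(10.0)
print(sign, format(biased_exp, "08b"), format(fraction, "023b"))
print(value_of_normalized(sign, biased_exp, fraction))
```

For the decimal number ten, the three printed fields should match the bit pattern given above, and re-evaluating the formula should recover the original value.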
When a value is expressed in accordance with the foregoing convention, it is said to be normalized, that is, the leading bit in the significand is nonzero, or a "1" in the case of a binary value (as in "1.F"). If the explicit or implicit most significant bit is zero (as in "0.F"), then the number is said to be unnormalized. Unnormalized numbers can easily occur as an output result of a floating-point operation, such as the effective subtraction of one number from another number that is only slightly different in value. The fraction is shifted left (leading zeros are removed from the fraction) and the exponent adjusted accordingly; if the exponent is greater than or equal to $E_{\min}$ (the minimum exponent value), then the result is said to be normalized. If the exponent is less than $E_{\min}$, an underflow has occurred. If the underflow is disabled, the fraction is shifted right (zeros inserted) until the exponent is equal to $E_{\min}$. The exponent is replaced with "000" (hexadecimal), and the result is said to be denormalized. For example, two numbers (having the same small exponent E) may have mantissas of 1.010101 and 1.010010, and when the latter number is subtracted from the former, the result is 0.000011, an unnormalized number. If $E < 5$, the final result will be a denormalized number.
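The subtraction example above can be traced with a toy model. The sketch below is not taken from the source: the function name `normalize_difference`, the 7-bit significand width (one integer bit plus six fraction bits), and the choice of a minimum exponent of 0 are assumptions made so that the threshold matches the "E less than 5" case in the text. It also computes the net shift directly instead of shifting fully left and then back to the right, and it assumes the first operand is the larger one and that the input exponent is not already below the minimum.

```python
def normalize_difference(sig_a, sig_b, exponent, width=7, e_min=0):
    """Subtract two same-exponent significands and normalize the result.

    Significands are `width`-bit integers (1 integer bit + fraction bits).
    Returns (significand, exponent, kind).
    """
    diff = sig_a - sig_b  # may have leading zeros, i.e. be unnormalized
    if diff == 0:
        return 0, e_min, "zero"
    leading_zeros = width - diff.bit_length()
    if exponent - leading_zeros >= e_min:
        # Enough exponent range: remove every leading zero.
        return diff << leading_zeros, exponent - leading_zeros, "normalized"
    # Underflow with the trap disabled: shift only until the exponent reaches e_min.
    shift = exponent - e_min
    return diff << shift, e_min, "denormalized"

a, b = 0b1010101, 0b1010010  # 1.010101 and 1.010010
for e in (8, 5, 2):
    sig, exp, kind = normalize_difference(a, b, e)
    print(e, format(sig, "07b"), exp, kind)
```

With these assumptions, exponents of 5 or more leave room to remove all five leading zeros, while smaller exponents leave some leading zeros in place and yield a denormalized result.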
The hardware of many conventional computers is adapted to process only normalized numbers. Therefore, when a denormalized number is presented as an output result of a floating-point operation, it must be normalized before further processing of the number can take place. Various techniques are used to normalize the values, generally by removing leading zeros from the fraction and accordingly decrementing the exponent. See U.S. Pat. No. 5,513,362. One technique involves leading zero anticipator (LZA) logic which predicts the number of zeros to remove before the floating-point arithmetic is completed. See IBM Journal of Research and Development, vol. 34, no. 1 (January 1990), pp. 71-77. Such normalization can, however, sometimes lead to an "underflow" exception. An underflow occurs when a result is presented whose absolute value is too small to be represented within the range of the numeration system. For example, in some prior-art floating-point systems, a number less than $1.0 \times 2^{E_{\min}}$ cannot be represented, and a floating-point operation that results in a number smaller than this is rounded to zero. Underflows can lead to further exceptions, such as an attempt to divide by zero (an "overflow" exception). When an underflow trap is not enabled (typically the default case, also referred to as "underflow exception disabled"), the hardware is responsible for handling the underflow. Underflows may still occur, but the underflow "exception" is avoided. The exception would probably cause the job to end. When the exception is disabled, denormalized results are produced (some precision is lost), but the program keeps running.
In addition to normalizing denormalized results, i.e., removing leading zeros caused by the effective subtract operation, it is sometimes necessary to "prenormalize" input values, i.e., remove leading zeros from the source operands (A, B, and C). Prenormalization is usually required if A, B, or C is a denormalized number (a denormalized input number is changed to a number with an implicit bit equal to 1 and an exponent less than $E_{\min}$).
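Prenormalization of a denormalized input can be sketched as follows. This is an illustration under stated assumptions, not the circuit of any particular processor: the function name `prenormalize_single` is hypothetical, and the single-precision parameters (a 23-bit fraction and a minimum exponent of -126) are assumed for concreteness.

```python
E_MIN = -126        # assumed minimum exponent for single precision
FRACTION_BITS = 23  # assumed fraction width for single precision

def prenormalize_single(fraction):
    """Rewrite a denormalized input 0.F * 2^E_MIN as 1.F' * 2^E with E < E_MIN.

    `fraction` is the nonzero 23-bit fraction of a denormalized operand.
    Returns (new_fraction, exponent), where the implicit bit is now 1.
    """
    if not 0 < fraction < (1 << FRACTION_BITS):
        raise ValueError("expected a nonzero 23-bit fraction")
    # Number of positions needed to move the leading 1 into the implicit-bit position.
    shift = (FRACTION_BITS + 1) - fraction.bit_length()
    significand = fraction << shift   # 24 bits wide, top bit set
    exponent = E_MIN - shift          # always below E_MIN
    return significand & ((1 << FRACTION_BITS) - 1), exponent

new_fraction, exponent = prenormalize_single(0x600000)  # 0.11 (binary) * 2^-126
print(format(new_fraction, "023b"), exponent)
```

The value is unchanged by the transformation: the input 0.11 (binary) times 2 to the power -126 and the output 1.1 (binary) times 2 to the power -127 denote the same number, but the second form needs an exponent outside the normal range, which is why hardware that prenormalizes typically carries extra exponent bits internally.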
In the IEEE implementation, when the underflow trap is not enabled (the default case), numbers smaller than $1.0 \times 2^{E_{\min}}$ are not rounded to zero but, instead, must be represented as a denormalized number. In other words, if full normalization creates an underflow, i.e., by making the exponent a negative value, then a "denormalization" process must occur which replaces the negative exponent with zero, and shifts the fraction back to the right (adding leading zeros) by an amount that matches the size of the negative exponent. Denormalization is often referred to as a "gradual underflow." If the hardware is not capable of producing a denormalized number, control is handed over to software which produces the denormalized number, and then the hardware resumes processing, but no exception is signalled. If the hardware can produce a denormalized number on its own, then control is not passed to the software.
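The difference between rounding to zero and gradual underflow can be observed in a short sketch. It assumes a platform on which Python floats are IEEE 754 double-precision values with denormalized results enabled, which is the usual case; `flush_to_zero` and `exponent_field` are hypothetical helpers written for this illustration, with the former merely emulating the prior-art behavior in software.

```python
import struct
import sys

SMALLEST_NORMAL = sys.float_info.min  # 1.0 * 2**-1022 for IEEE double precision

def exponent_field(x):
    """Return the 11-bit biased exponent field of a double-precision value."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return (bits >> 52) & 0x7FF

def flush_to_zero(x):
    """Emulate a system that cannot represent magnitudes below the smallest normal."""
    return 0.0 if abs(x) < SMALLEST_NORMAL else x

a = 3.0 * SMALLEST_NORMAL
b = 2.5 * SMALLEST_NORMAL

gradual = a - b                 # 0.5 * SMALLEST_NORMAL, kept as a denormalized number
flushed = flush_to_zero(a - b)  # the same difference, rounded to zero

print(gradual != 0.0, hex(exponent_field(gradual)))
print(flushed)
```

Under gradual underflow the difference of two unequal operands stays nonzero and is stored with an exponent field of zero, so a later division by that difference remains well defined. With the emulated rounding to zero the same difference becomes 0.0, and dividing by it would fail.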
The foregoing approach to handling denormalized results has several disadvantages. In those situations where an underflow would occur, the prenormalization of the value is unnecessary to begin with, and so delays processing by using extra cycles for both prenormalization and denormalization. Also, subsequent instructions in the processor must be stalled while the hardware determines if an underflow will occur, negatively impacting "pipelined" operations. Pipelining is a technique of overlapping the execution of multiple instructions, and is critical to faster CPU performance in modern computers. It would, therefore, be desirable and advantageous to devise an improved method of handling denormalized results from floating-point operations, to avoid these unnecessary delays.