1. Field of the Invention
This invention relates generally to the field of microprocessors and, more particularly, to executing floating-point store instructions in a microprocessor.
2. Description of the Related Art
Microprocessors are typically designed with a number of “execution units” that are each optimized to perform a particular set of functions or instructions. For example, one or more execution units within a microprocessor may be optimized to perform memory accesses, i.e., load and store operations. Other execution units may be optimized to perform general arithmetic and logic functions, e.g., shifts and compares. Many microprocessors also have specialized execution units configured to perform more complex floating-point arithmetic operations including multiplication and reciprocal operations. These specialized execution units typically comprise hardware that is optimized to perform one or more floating-point arithmetic functions.
Most microprocessors must support multiple data types. For example, x86 compatible microprocessors must execute instructions that are defined to operate upon an integer data type and instructions that are defined to operate upon floating-point data types. Floating-point data can represent numbers within a much larger range than integer data. For example, a 32-bit signed integer can represent the integers between −2^31 and 2^31−1 (using two's complement format). In contrast, a 32-bit (“single precision”) floating-point number as defined by the Institute of Electrical and Electronics Engineers (IEEE) Standard 754 has a range (in normalized format) from 2^−126 to 2^127×(2−2^−23) in both positive and negative numbers.
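The two ranges above can be checked numerically. The following is a minimal sketch, not part of the described apparatus; the constant names and the helper roundtrips_as_single are illustrative assumptions.

```python
import struct

# Range of a 32-bit two's complement signed integer.
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1

# Range of normalized IEEE-754 single-precision magnitudes.
FLT_MIN_NORMAL = 2.0 ** -126
FLT_MAX = (2.0 ** 127) * (2.0 - 2.0 ** -23)


def roundtrips_as_single(value: float) -> bool:
    """Return True if value is unchanged after packing into 32-bit single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0] == value


print(INT32_MIN, INT32_MAX)
print(FLT_MIN_NORMAL, FLT_MAX)
```

Both floating-point limits survive a round trip through the 32-bit format, which indicates that they are exactly representable in single precision.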
Turning now to FIG. 1A, an exemplary format for an 8-bit integer 100 is shown. As illustrated in the figure, negative integers are represented using the two's complement format 104. To negate an integer, all bits are inverted to obtain the one's complement format 102. A constant of one is then added to the least significant bit (LSB).
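The negation procedure for an 8-bit integer can be sketched as follows. The function names negate_8bit and to_signed are hypothetical and chosen only for illustration.

```python
def negate_8bit(value: int) -> int:
    """Negate an 8-bit two's complement pattern: invert all bits, then add one."""
    ones_complement = ~value & 0xFF       # invert every bit (one's complement)
    return (ones_complement + 1) & 0xFF   # add one at the LSB, keep 8 bits


def to_signed(pattern: int) -> int:
    """Interpret an 8-bit pattern as a signed two's complement integer."""
    return pattern - 0x100 if pattern & 0x80 else pattern


pattern = negate_8bit(5)
print(f"{pattern:08b}", to_signed(pattern))
```

Note that the pattern 10000000 (−128) negates to itself, because +128 is not representable in eight bits.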
Turning now to FIG. 1B, an exemplary format for a 32-bit (single precision) floating-point number is shown. A floating-point number is represented by a significand, an exponent and a sign bit. The base for the floating-point number is raised to the power of the exponent and multiplied by the significand to arrive at the number represented. In microprocessors, base 2 is typically used. The significand comprises a number of bits used to represent the most significant digits of the number. Typically, the significand comprises one bit to the left of the radix point and the remaining bits to the right of the radix point. In order to save space, the bit to the left of the radix point, known as the integer bit, is not explicitly stored. Instead, it is implied in the format of the number. Additional information regarding floating-point numbers and operations performed thereon may be obtained in IEEE Standard 754 (IEEE-754). Unlike the integer representation, two's complement format is not typically used in the floating-point representation. Instead, sign and magnitude form is used. Thus, only the sign bit is changed when converting from a positive value 106 to a negative value 108.
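The single-precision layout can be illustrated with a short sketch that splits a 32-bit pattern into its sign, exponent, and fraction fields and reconstructs the value of a normalized number. The exponent bias of 127 comes from IEEE-754; the function names are illustrative assumptions.

```python
import struct


def float_to_bits(value: float) -> int:
    """Return the 32-bit single-precision pattern for value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def fields(bits: int) -> tuple:
    """Split a 32-bit pattern into (sign, biased exponent, fraction)."""
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction


def decode_normal(bits: int) -> float:
    """Reconstruct a normalized value; the integer bit is implied, not stored."""
    sign, exponent, fraction = fields(bits)
    significand = 1 + fraction / 2 ** 23
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)


positive = float_to_bits(1.5)
negative = float_to_bits(-1.5)
print(hex(positive), hex(negative), hex(positive ^ negative))
```

The exclusive-OR of the two patterns is 0x80000000, showing that the positive and negative values differ only in the sign bit.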
Numerical data formats, such as the IEEE-754, often include a number of special and exceptional cases. These special and exceptional cases may appear in one or more operands or one or more results for a particular instruction. FIG. 2 illustrates the sign, exponent, and significand formats of special and exceptional cases that are included in the IEEE-754 floating-point standard. The special and exceptional cases shown in FIG. 2 include a zero value, an infinity value, NaN (not-a-number) values, and a denormal value. An ‘x’ in FIG. 2 represents a value that can be either one or zero. NaN values may include a QNaN (quiet not-a-number) value and a SNaN (signaling not-a-number) value as defined by a particular architecture. The numbers depicted in FIG. 2 are shown in base 2 format as indicated by the subscript 2 following each number. As shown, a number with all zeros in its exponent and significand represents a zero value in the IEEE-754 floating-point standard. A number with all ones in its exponent, a one in the most significant bit of its significand, and zeros in the remaining bits of its significand represents an infinity value. The remaining special and exceptional cases are depicted similarly.
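As a rough illustration, the special and exceptional cases can be distinguished by inspecting the exponent and fraction fields. The sketch below operates on the 32-bit memory format, in which the integer bit is implied rather than stored, so an infinity appears with an all-zero fraction field. The QNaN/SNaN distinction is architecture-defined; the sketch assumes the x86 convention, in which a set most significant fraction bit marks a quiet NaN. The function name classify_single is hypothetical.

```python
def classify_single(bits: int) -> str:
    """Classify a 32-bit single-precision pattern (integer bit implied)."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        # All-zero exponent: zero or denormal, depending on the fraction.
        return "zero" if fraction == 0 else "denormal"
    if exponent == 0xFF:
        # All-one exponent: infinity or NaN, depending on the fraction.
        if fraction == 0:
            return "infinity"
        # Assumption: x86 convention, MSB of the fraction set means quiet.
        return "qnan" if fraction & 0x400000 else "snan"
    return "normal"


for pattern in (0x00000000, 0x00000001, 0x3F800000, 0x7F800000, 0x7FC00000):
    print(f"{pattern:#010x}", classify_single(pattern))
```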
Floating-point execution units are generally configured to execute floating-point store instructions. Typically, floating-point store instructions are designed to store a floating-point value, i.e., store data, to a memory location. Prior to storing the store data, however, a floating-point execution unit must examine it to ensure that it does not correspond to a value that is smaller than the minimum number that can be represented in the floating-point precision of the store data. A value that is smaller than the minimum number that can be represented in a given floating-point precision can be referred to as a tiny number.
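A tiny-number check might be sketched as follows. The sketch assumes that "minimum number" means the smallest normalized magnitude of the destination precision (2^−126 for single precision, 2^−1022 for double precision) and that a zero value is not treated as tiny. The table MIN_NORMAL and the function is_tiny are illustrative names, not taken from the described apparatus.

```python
# Smallest normalized magnitude for each destination precision (assumed threshold).
MIN_NORMAL = {
    "single": 2.0 ** -126,
    "double": 2.0 ** -1022,
}


def is_tiny(value: float, precision: str) -> bool:
    """Return True if value is nonzero and below the minimum normalized magnitude."""
    return value != 0.0 and abs(value) < MIN_NORMAL[precision]


value = 2.0 ** -130
print(is_tiny(value, "single"), is_tiny(value, "double"))
```

The same value can be tiny for one store precision and not for another, which is why the check depends on the precision of the store data.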
Floating-point execution units often include an underflow mask to allow a programmer to disable an underflow exception if a tiny number is detected. If the underflow exception is masked and store data corresponds to a tiny number, a floating-point execution unit needs to ensure that the correct value is stored for a floating-point store instruction. Generating a correct value for store data that corresponds to a tiny number when the underflow exception is masked can require additional processing of the store data. The additional processing can result in an undesirable instruction latency for a floating-point store instruction. It would be desirable to reduce the instruction latencies associated with executing floating-point store instructions.
The problems outlined above are in large part solved by the use of the apparatus and method described herein. Generally speaking, an apparatus and method for executing floating-point store instructions in a microprocessor is provided. If store data of a floating-point store instruction corresponds to a tiny number and an underflow exception is masked, then a trap routine can be executed to generate corrected store data and complete the store operation. In response to detecting that store data corresponds to a tiny number and the underflow exception is masked, the store data, store address information, and opcode information can be stored prior to initiating the trap routine. The trap routine can be configured to access the store data, store address information, and opcode information. The trap routine can be configured to generate corrected store data and complete the store operation using the store data, store address information, and opcode information.
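The control flow described above can be modeled in software as a simplified sketch. It is not the hardware implementation: memory is modeled as a dictionary, the opcode string is a placeholder, all names (SavedStore, trap_routine, execute_fp_store) are hypothetical, and the correction step is stood in for by Python's conversion to single precision, which rounds a tiny value to a denormal. The unmasked case is modeled by raising an exception.

```python
import struct
from dataclasses import dataclass

MIN_NORMAL_SINGLE = 2.0 ** -126  # assumed tiny-number threshold


@dataclass
class SavedStore:
    """State saved before the trap routine is initiated."""
    data: float
    address: int
    opcode: str


def trap_routine(saved: SavedStore, memory: dict) -> None:
    """Generate corrected store data and complete the store."""
    corrected = struct.pack("<f", saved.data)  # stand-in: rounds to a denormal
    memory[saved.address] = corrected


def execute_fp_store(value, address, memory, underflow_masked,
                     opcode="fp_store_single"):
    """Store value as single precision; return which path completed the store."""
    tiny = value != 0.0 and abs(value) < MIN_NORMAL_SINGLE
    if not tiny:
        memory[address] = struct.pack("<f", value)  # common case, no extra work
        return "fast"
    if not underflow_masked:
        raise ArithmeticError("underflow exception (unmasked)")
    saved = SavedStore(value, address, opcode)  # save data, address, opcode
    trap_routine(saved, memory)
    return "trap"


memory = {}
print(execute_fp_store(1.5, 0x10, memory, underflow_masked=True))
print(execute_fp_store(2.0 ** -130, 0x20, memory, underflow_masked=True))
```

In this model only the rare tiny-number case pays the cost of the trap routine; every other store completes on the direct path.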
The use of the apparatus and method for executing floating-point store instructions may provide performance advantages over other systems. Generally speaking, store data corresponds to a tiny number only in rare instances. By executing a trap routine to handle store data that corresponds to a tiny number, the apparatus and method may allow floating-point store instructions to execute in a more efficient manner by generating corrected store data with the trap routine. As a result, the apparatus and method may allow floating-point store instructions whose store data does not correspond to a tiny number to complete in fewer clock cycles.
Broadly speaking, a microprocessor including a floating-point execution unit, a reorder buffer, and a load/store unit is contemplated. The floating-point execution unit is configured to execute a floating-point store instruction. The floating-point store instruction specifies store data and a store address. The reorder buffer is coupled to said floating-point execution unit and includes a reorder buffer tag that corresponds to the floating-point store instruction. The load/store unit is coupled to the floating-point execution unit. The floating-point execution unit is configured to write the store data to a register in the floating-point execution unit, to determine whether the store data corresponds to a denormal value, and to convey a cancel signal to the load/store unit in response to the store data corresponding to the denormal value. The load/store unit is configured to cancel a store operation corresponding to the floating-point store instruction in response to receiving the cancel signal.
A method of executing a floating-point store instruction is also contemplated. The method includes receiving the floating-point store instruction, wherein the floating-point store instruction specifies store data. The method also includes assigning a register to the floating-point store instruction, writing the store data to the register, conveying a store tag corresponding to the floating-point store instruction to a load/store unit, determining whether the store data corresponds to a denormal value, and conveying a cancel signal to the load/store unit if the store data corresponds to said denormal value.
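The steps of the method can be approximated by a small software model. This is a sketch under several assumptions: the load/store unit is reduced to a table of pending stores keyed by store tag, "corresponds to a denormal value" is taken to mean a nonzero magnitude below 2^−126 for a single-precision store, and a store that is not cancelled is simply committed. The class and function names are hypothetical.

```python
MIN_NORMAL_SINGLE = 2.0 ** -126  # assumed threshold for a single-precision store


class LoadStoreUnit:
    """Toy model of a load/store unit tracking pending stores by tag."""

    def __init__(self):
        self.pending = {}  # store tag -> store address
        self.memory = {}   # store address -> store data

    def receive_store(self, tag, address):
        self.pending[tag] = address

    def cancel(self, tag):
        """Cancel signal: drop the pending store operation."""
        self.pending.pop(tag, None)

    def commit(self, tag, data):
        """Complete the store if it is still pending."""
        if tag not in self.pending:
            return False
        self.memory[self.pending.pop(tag)] = data
        return True


def execute_store(lsu, registers, reg, tag, address, data):
    """Return True if the store completed, False if it was cancelled."""
    registers[reg] = data             # write store data to the assigned register
    lsu.receive_store(tag, address)   # convey the store tag to the load/store unit
    if data != 0.0 and abs(data) < MIN_NORMAL_SINGLE:
        lsu.cancel(tag)               # denormal case: convey the cancel signal
        return False
    return lsu.commit(tag, data)


lsu, registers = LoadStoreUnit(), {}
print(execute_store(lsu, registers, reg=3, tag=7, address=0x40, data=2.5))
print(execute_store(lsu, registers, reg=4, tag=8, address=0x44, data=2.0 ** -130))
```

In the cancelled case the store data remains in the register, so a later handler could still access it.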
In addition, a floating-point execution unit is contemplated. The floating-point execution unit includes a register rename unit configured to receive a floating-point store instruction, wherein said floating-point store instruction specifies store data. The floating-point execution unit also includes a scheduler coupled to the register rename unit and configured to schedule the floating-point store instruction for execution, a floating-point execution pipeline coupled to the scheduler and configured to execute the floating-point store instruction, and a control unit coupled to the scheduler and the floating-point execution pipeline. The register rename unit is configured to assign a destination register tag to the floating-point store instruction. The scheduler is configured to write the store data to a register corresponding to the destination register tag. The floating-point execution pipeline is configured to determine whether the store data corresponds to a denormal value, and the control unit is configured to assert a store cancel signal in response to the store data corresponding to the denormal value.
Furthermore, a computer system comprising a microprocessor and an input/output device is contemplated. The microprocessor includes a floating-point execution unit, a reorder buffer, and a load/store unit. The floating-point execution unit is configured to execute a floating-point store instruction. The floating-point store instruction specifies store data and a store address. The reorder buffer is coupled to said floating-point execution unit and includes a reorder buffer tag that corresponds to the floating-point store instruction. The load/store unit is coupled to the floating-point execution unit. The floating-point execution unit is configured to write the store data to a register in the floating-point execution unit, to determine whether the store data corresponds to a denormal value, and to convey a cancel signal to the load/store unit in response to the store data corresponding to the denormal value. The load/store unit is configured to cancel a store operation corresponding to the floating-point store instruction in response to receiving the cancel signal. The input/output device is configured to communicate between the microprocessor and another computer system.