1. Field of the Invention
This invention relates to the field of microprocessors and, more particularly, to the dispatching of floating point exchange instructions within microprocessors.
2. Description of the Related Art
Superscalar microprocessors achieve high performance by executing multiple instructions per clock cycle and by choosing the shortest possible clock cycle consistent with the design. As used herein, the term "clock cycle" refers to an interval of time accorded to various stages of an instruction processing pipeline within the microprocessor. Storage devices (e.g. registers and arrays) capture their values according to the clock cycle. For example, a storage device may capture a value according to a rising or falling edge of a clock signal defining the clock cycle. The storage device then stores the value until the subsequent rising or falling edge of the clock signal, respectively. The term "instruction processing pipeline" is used herein to refer to the logic circuits employed to process instructions in a pipelined fashion. Generally speaking, a pipeline comprises a number of stages at which portions of a particular task are performed. Different stages may simultaneously operate upon different items, thereby increasing overall throughput. Although the instruction processing pipeline may be divided into any number of stages at which portions of instruction processing are performed, instruction processing generally comprises fetching the instruction, decoding the instruction, executing the instruction, and storing the execution results in the destination identified by the instruction.
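The overlap between pipeline stages described above can be illustrated with a minimal sketch. The four stage names follow the fetch, decode, execute, and store steps named in the text. The function name `pipeline_schedule` and the assumption that every stage takes exactly one clock cycle with no stalls are hypothetical simplifications, not details of any particular design.

```python
# Assumption: four single-cycle stages and no stalls; real pipelines vary.
STAGES = ["fetch", "decode", "execute", "writeback"]

def pipeline_schedule(instructions):
    """Return one dict per clock cycle mapping stage name -> instruction in that stage."""
    n_cycles = len(instructions) + len(STAGES) - 1
    schedule = []
    for cycle in range(n_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            idx = cycle - depth  # instruction idx entered the pipeline `depth` cycles ago
            if 0 <= idx < len(instructions):
                row[stage] = instructions[idx]
        schedule.append(row)
    return schedule

if __name__ == "__main__":
    for cycle, row in enumerate(pipeline_schedule(["i0", "i1", "i2"])):
        print(cycle, row)
```

In this sketch, three instructions complete in six clock cycles rather than the twelve that would be needed if each instruction passed through all four stages before the next one began.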
Microprocessors are configured to operate upon various data types in response to various instructions. For example, certain instructions are defined to operate upon an integer data type. The bits representing an integer form the digits of the number. The decimal point is assumed to be to the right of the digits (i.e. integers are whole numbers). Another data type often employed in microprocessors is the floating point data type. Floating point numbers are represented by a significand and an exponent. The base for the floating point number is raised to the power of the exponent and multiplied by the significand to arrive at the number represented. While any base may be used, base 2 is common in many microprocessors. The significand comprises a number of bits used to represent the most significant digits of the number. Typically, the significand comprises one bit to the left of the binary point, and the remaining bits to the right of the binary point. The bit to the left of the binary point is not explicitly stored; instead, it is implied in the format of the number. Generally, the exponent and the significand of the floating point number are stored. Additional information regarding the floating point numbers and operations performed thereon may be obtained in the Institute of Electrical and Electronics Engineers (IEEE) standard 754.
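As an illustration of the significand and exponent representation, the following sketch splits an IEEE 754 single precision value into its fields and rebuilds the number from them. The function names `decode_single` and `rebuild` are hypothetical, and the sketch assumes a normalized value (zero, denormals, infinities, and NaNs are rejected to keep it short).

```python
import struct

def decode_single(x):
    """Return (sign, exponent, significand) for a normalized single precision value."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF            # the 23 explicitly stored significand bits
    if biased_exponent in (0, 0xFF):
        raise ValueError("zero, denormal, infinity, or NaN is not handled in this sketch")
    exponent = biased_exponent - 127      # remove the exponent bias
    significand = 1 + fraction / 2**23    # restore the implied bit left of the binary point
    return sign, exponent, significand

def rebuild(sign, exponent, significand):
    """Significand multiplied by the base (2) raised to the exponent, with sign applied."""
    return (-1) ** sign * significand * 2.0 ** exponent

if __name__ == "__main__":
    print(decode_single(6.5))   # 6.5 = 1.625 * 2**2
```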
Floating point numbers can represent numbers within a much larger range than can integer numbers. For example, a 32 bit signed integer can represent the integers between 2^31 - 1 and -2^31, when two's complement format is used. A single precision floating point number as defined by IEEE 754 comprises 32 bits (a one bit sign, 8 bit biased exponent and 24 bits of significand) and has a range from 2^-126 to 2^127 in both positive and negative numbers. A double precision (64 bit) floating point value has a range from 2^-1022 to 2^1023 in both positive and negative numbers. Finally, an extended precision (80 bit) floating point number has a range from 2^-16382 to 2^16383 in both positive and negative numbers.
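The exponent ranges quoted above follow from the width of the biased exponent field. The sketch below derives them, assuming the usual IEEE 754 convention that the all-zeros and all-ones exponent encodings are reserved. The exponent widths of 11 bits (double) and 15 bits (extended) and the helper name `exponent_range` are assumptions introduced here for illustration; the text above states only the 8 bit single precision exponent.

```python
# 32 bit two's complement integer limits
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1

# Assumed exponent field widths in bits for each format.
EXPONENT_BITS = {"single": 8, "double": 11, "extended": 15}

def exponent_range(exponent_bits):
    """Return (min, max) unbiased exponent for normalized numbers."""
    bias = 2 ** (exponent_bits - 1) - 1
    # Biased exponents 0 and all-ones are reserved, leaving 1 .. 2*bias.
    return 1 - bias, bias

if __name__ == "__main__":
    print("int32:", INT32_MIN, "to", INT32_MAX)
    for name, width in EXPONENT_BITS.items():
        print(name, exponent_range(width))
```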
The expanded range available using the floating point data type is advantageous for many types of calculations in which large variations in the magnitude of numbers can be expected, as well as in computationally intensive tasks in which intermediate results may vary widely in magnitude from the input values and output values. Still further, greater precision may be available in floating point data types than is available in integer data types.
Floating point data types and floating point instructions produce challenges for the microprocessor designer. Floating point instructions are typically executed by a specialized unit designed to perform floating point operations. Accordingly, the microprocessor must identify floating point instructions and dispatch those instructions to a floating point instruction unit. Floating point instruction units are typically designed to execute one floating point instruction at a time.
Floating point instructions are typically stack based instructions. The instructions are designed to operate on data stored on the top of a register stack. Because each instruction uses the top-of-stack register, register dependencies exist between floating point instructions and the floating point instructions must be executed in a serial fashion. When a register other than the top of the register stack is the desired operand for a floating point instruction, a floating point exchange (FXCH) instruction is executed. The floating point exchange instruction exchanges the contents of a specified floating point register with the contents of the top-of-stack register. The floating point instruction is then executed using the top-of-stack register. Unfortunately, the execution of a floating point instruction on a register other than the top-of-stack register requires two floating point instructions. As mentioned above, only one floating point instruction is typically executed per clock cycle. Accordingly, executing a floating point instruction on a register other than the top-of-stack register requires at least two clock cycles to perform.
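The two-instruction sequence described above can be modeled with a small register stack. This is a minimal sketch: the class name `FpStack` and the cycle counter are hypothetical, FCHS (change sign of the top-of-stack) is chosen only as an example of an instruction that uses the top-of-stack register, and the one-instruction-per-clock-cycle cost is the assumption stated in the text.

```python
class FpStack:
    """Toy floating point register stack; regs[0] is the top-of-stack register."""

    def __init__(self, values):
        self.regs = list(values)
        self.cycles = 0   # assumption: each floating point instruction costs one cycle

    def fxch(self, i):
        # Exchange the top-of-stack register with stack register i.
        self.regs[0], self.regs[i] = self.regs[i], self.regs[0]
        self.cycles += 1

    def fchs(self):
        # Change the sign of the top-of-stack register.
        self.regs[0] = -self.regs[0]
        self.cycles += 1

# Negating the value held in stack register 2 takes two instructions.
stack = FpStack([1.0, 2.0, 3.0])
stack.fxch(2)
stack.fchs()
print(stack.regs, stack.cycles)
```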
The problems outlined above are in large part solved by the dispatch of floating point exchange instructions in accordance with the present invention. A predecode unit detects a floating point exchange instruction followed by a floating point instruction. The predecode unit marks the two instructions as one combined instruction. In one embodiment, the predecode unit marks the combined floating point instruction as a microcode instruction. The microprocessor routes all microcode instructions to a microcode unit. The microcode unit determines on which register to perform the floating point instruction and dispatches the floating point instruction and a register field identifying the register to exchange with the top-of-stack to the floating point unit. In this manner, a floating point exchange instruction followed by a floating point instruction using a stack register are dispatched to the floating point instruction unit as one instruction. Accordingly, the execution of a floating point exchange instruction followed by a floating point instruction may be accomplished in one clock cycle.
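The pairing step may be sketched in software as follows. This is an illustrative model only, not the circuit itself: the names `Instr`, `DispatchOp`, `predecode`, and `STACK_FP_OPCODES`, the choice of opcodes in the set, and the use of a `microcode` flag to mark a combined pair are all assumptions made for the example. Instructions that are not part of a pair are passed through unchanged.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical subset of floating point opcodes that use the top-of-stack register.
STACK_FP_OPCODES = {"FCHS", "FABS", "FSQRT"}

@dataclass
class Instr:
    opcode: str
    reg: Optional[int] = None            # FXCH only: stack register to exchange

@dataclass
class DispatchOp:
    opcode: str                          # opcode conveyed to the floating point unit
    exchange_reg: Optional[int] = None   # register field; None means no exchange
    microcode: bool = False              # True for a combined FXCH + FP instruction

def predecode(stream: List[Instr]) -> List[DispatchOp]:
    """Merge each FXCH that is immediately followed by a stack FP instruction."""
    ops = []
    i = 0
    while i < len(stream):
        cur = stream[i]
        nxt = stream[i + 1] if i + 1 < len(stream) else None
        if cur.opcode == "FXCH" and nxt is not None and nxt.opcode in STACK_FP_OPCODES:
            # Combined instruction: one dispatch carrying the exchange register field.
            ops.append(DispatchOp(nxt.opcode, exchange_reg=cur.reg, microcode=True))
            i += 2
        else:
            # Unpaired FXCH or any other instruction is dispatched on its own.
            exchange = cur.reg if cur.opcode == "FXCH" else None
            ops.append(DispatchOp(cur.opcode, exchange_reg=exchange))
            i += 1
    return ops

if __name__ == "__main__":
    program = [Instr("FXCH", 2), Instr("FCHS"), Instr("FXCH", 1), Instr("ADD")]
    for op in predecode(program):
        print(op)
```

In this model the first two instructions collapse into a single dispatch, while the second FXCH, which is followed by a non-floating-point instruction, is dispatched separately.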
If a floating point exchange instruction cannot be paired with another floating point instruction, then the floating point exchange instruction is executed as a separate instruction. If a branch instruction branches to a floating point instruction predecoded as part of a combined floating point exchange instruction and floating point instruction, an invalid instruction is detected and the floating point instruction is predecoded as a separate instruction.
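The branch case can be sketched under stated assumptions. The text does not specify how the invalid instruction is detected; the model below assumes that predecode information is kept per instruction start address, so that a branch target landing on the second instruction of a combined pair fails to match any recorded start. The names `Entry` and `resolve_branch_target`, and the addresses and two-byte instruction lengths used in the example, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Entry:
    start: int                  # address of the first byte covered by this entry
    opcodes: Tuple[str, ...]    # one opcode, or (FXCH, fp opcode) when combined
    lengths: Tuple[int, ...]    # byte length of each original instruction

def resolve_branch_target(entries: List[Entry], target: int):
    """Return (entry to execute, re_predecoded flag) for a branch to `target`."""
    for e in entries:
        if e.start == target:
            return e, False     # target matches a predecoded instruction start
        if len(e.opcodes) == 2 and e.start + e.lengths[0] == target:
            # Target is the second half of a combined pair: treat the predecode
            # data as invalid and predecode that instruction as a separate one.
            return Entry(target, (e.opcodes[1],), (e.lengths[1],)), True
    raise LookupError("no predecode information for target address")

if __name__ == "__main__":
    combined = [Entry(0x100, ("FXCH", "FCHS"), (2, 2))]
    print(resolve_branch_target(combined, 0x100))
    print(resolve_branch_target(combined, 0x102))
```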
Broadly speaking, the present invention contemplates a circuit for executing floating point exchange instructions including a decode unit and a floating point unit. The decode unit is configured to detect a floating point exchange instruction followed by a floating point instruction using a stack register. The decode unit is coupled to the floating point unit and is configured to convey an opcode of the floating point instruction using a stack register and exchange register information to the floating point unit. The exchange register information identifies a first floating point register to exchange with a second floating point register, and the floating point unit performs the exchange prior to executing the floating point instruction using a stack register.
The present invention further contemplates a method for executing floating point exchange instructions including: detecting a floating point exchange instruction followed by a floating point instruction using a stack register; dispatching an opcode of the floating point instruction using a stack register and exchange register information to a floating point unit, wherein the exchange register information identifies a first floating point register to exchange with a second floating point register; exchanging the floating point registers identified by the exchange register information; and executing the floating point instruction using a stack register.
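The exchanging and executing steps of the method may be modeled on the floating point unit side as shown below. This is a sketch under assumptions: the function name `fpu_execute`, the `UNARY_OPS` table, and the convention that an `exchange_reg` of `None` means no exchange are hypothetical. The sketch shows only the ordering stated in the method, namely that the exchange is performed before the floating point instruction is executed.

```python
import math

# Hypothetical subset of operations that act on the top-of-stack register.
UNARY_OPS = {"FCHS": lambda x: -x, "FABS": abs, "FSQRT": math.sqrt}

def fpu_execute(regs, opcode, exchange_reg=None):
    """Execute one dispatched operation; regs[0] is the top-of-stack register."""
    if exchange_reg is not None:
        # Exchange first, using the register identified by the exchange information.
        regs[0], regs[exchange_reg] = regs[exchange_reg], regs[0]
    if opcode != "FXCH":
        # Then execute the floating point instruction on the top-of-stack register.
        regs[0] = UNARY_OPS[opcode](regs[0])
    return regs

if __name__ == "__main__":
    # One dispatch takes the square root of the value that was in stack register 2.
    print(fpu_execute([1.0, 4.0, 9.0], "FSQRT", exchange_reg=2))
```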
The present invention still further contemplates a microprocessor including an instruction cache, an instruction alignment unit coupled to the instruction cache, a decode unit coupled to the instruction alignment unit, a functional unit coupled to the decode unit, and a floating point unit coupled to the decode unit. The decode unit is configured to detect a floating point exchange instruction followed by a floating point instruction using a stack register. The decode unit is further configured to convey an opcode of the floating point instruction using a stack register and exchange register information to the floating point unit. The exchange register information identifies a first floating point register to exchange with a second floating point register, and the floating point unit performs the exchange prior to executing the floating point instruction using a stack register.