1. Field of the Invention
The present invention relates to microprocessors, and more particularly to an x86 microprocessor architecture supporting MMX extensions in which coherency between separate floating point and MMX register files is provided.
2. Description of the Related Art
Since their introduction, x86 microprocessors have become nearly ubiquitous in computer applications. The original x86 instruction set included only scalar integer instructions, executed by a scalar integer execution unit. Later, floating point instructions were added, and eventually a floating point execution unit was included within the x86 microprocessor architecture for executing them. Recently, the x86 microprocessor architecture has been extended to include a new technology called MMX. MMX technology is a set of 57 new instructions added to the x86 architecture to speed up multimedia operations. For example, with one MMX instruction, four pairs of 16-bit numbers can be added, subtracted or multiplied at the same time. With the addition of MMX technology, an MMX execution unit has been incorporated within the x86 microprocessor architecture.
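As an illustration of the packed operation described above, the following minimal sketch adds four pairs of 16-bit numbers held in two 64-bit values, with each lane wrapping around independently. The function name `packed_add_words` and the software loop are assumptions made for illustration only; an actual MMX unit performs the four additions in parallel in hardware.

```python
MASK16 = 0xFFFF  # one 16-bit lane


def packed_add_words(a: int, b: int) -> int:
    """Add four independent 16-bit lanes packed into two 64-bit values.

    Each lane wraps around on overflow; no carry crosses into the next lane.
    (Hypothetical helper; models the behavior in software only.)
    """
    result = 0
    for lane in range(4):
        shift = 16 * lane
        x = (a >> shift) & MASK16
        y = (b >> shift) & MASK16
        result |= ((x + y) & MASK16) << shift
    return result


# The lowest lane (0xFFFF + 1) wraps to 0 without disturbing its neighbor.
total = packed_add_words(0x0001_0002_0003_FFFF, 0x0001_0001_0001_0001)
```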
FIG. 1 illustrates a conventional x86 microprocessor 100 that includes MMX technology extensions. As shown, microprocessor 100 includes a bus interface 102, an instruction cache 104, an instruction fetch/translate unit 106, a microcode decode unit 108, a scalar integer unit 110, a floating point unit (FPU) 112, an MMX unit 114, and a data cache 116. Bus interface 102 handles reading and writing data and instructions between instruction cache 104, data cache 116 and external instruction and data memories available on an external processor bus. Cached x86 instructions are clocked out of instruction cache 104 by instruction fetch/translate unit 106, which then translates the fetched x86 instructions into processor microcode. The microcode is decoded by microcode decode unit 108 for execution by one of scalar integer unit 110, FPU 112, and MMX unit 114. Data operated upon by execution units 110, 112, 114 is locally stored in associated operand register files (not shown) and then is read from and written to memory via data cache 116.
In the x86 microprocessor architecture that includes MMX extensions, both floating point instructions and MMX instructions have operands that reference floating point registers in a floating point register file. If only a single FP register file were provided for both floating point and MMX instructions, however, the FP register file would need four read ports and four write ports (two of each for the FPU and two of each for the MMX unit) so that both FPU 112 and MMX unit 114 could access it. In addition, significant space on the processor would be required to connect both the FPU and the MMX unit to the one FP register file. Just adding the MMX unit connections would require 4×64, or 256, additional wires. Providing this many connections to one register file is too costly in terms of chip space. For example, an additional 4×64 wires running between different execution units would require at least (4×64)×(1.2 microns (drawn dimension) in height)×(the length of the wires) in total chip area. One solution to this problem is to provide a second register file for MMX operations.
FIG. 2 shows a portion of an x86 processor having such separately provided FP and MMX register files. As shown, FPU 112 is connected to FP register file 220 having a plurality n of registers 224-1 to 224-n, and MMX unit 114 is connected to MMX register file 222 having a plurality n of registers 226-1 to 226-n. When FPU 112 receives a decoded FP instruction from microcode decode unit 108 that contains an operand that references one of registers 224-1 to 224-n, it executes the instruction with the contents of the referenced FP operand register. Likewise, when MMX unit 114 receives a decoded MMX instruction from microcode decode unit 108 that contains an operand that references one of registers 226-1 to 226-n, it executes the instruction with the contents of the referenced MMX operand register. If the executed FP or MMX instruction changes the contents of the referenced operand register, the FP or MMX unit writes the modified contents back to the register.
However, even though separate register files 220, 222 are provided, the x86 architecture requires that the two register files 220, 222 be treated as one. That is, the data in both of the register files 220, 222 must be coherent (i.e., the contents of FP register 224-1 must be coherent with the contents of MMX register 226-1, the contents of FP register 224-2 must be coherent with the contents of MMX register 226-2, and so on for each of the n registers in FP register file 220 and MMX register file 222). Accordingly, when the FP or MMX unit executes an instruction that modifies the contents of one of the registers in register files 220, 222, such modified contents must be reflected in the corresponding register in the other of register files 220, 222.
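The coherency requirement can be stated as a simple invariant. The sketch below is a software model only, assuming eight registers per file and comparing just the shared 64-bit data portion; the names `is_coherent` and `write_mirrored` are hypothetical and do not come from the architecture.

```python
N_REGS = 8  # assumed register count for illustration


def is_coherent(fp_file, mmx_file):
    """True when every FP register matches its corresponding MMX register."""
    return all(f == m for f, m in zip(fp_file, mmx_file))


def write_mirrored(fp_file, mmx_file, index, value):
    """Write one register and reflect the change in the other file."""
    fp_file[index] = value
    mmx_file[index] = value


fp_file = [0] * N_REGS
mmx_file = [0] * N_REGS

write_mirrored(fp_file, mmx_file, 3, 0x1234)  # both files updated together
coherent_after_mirror = is_coherent(fp_file, mmx_file)

mmx_file[5] = 0xBEEF  # a write to one file only breaks the invariant
coherent_after_single = is_coherent(fp_file, mmx_file)
```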
Tracking mechanisms could be used so that a write to either register file also causes a write to the other register file. Other mechanisms for maintaining coherency could require hundreds of processor cycles any time a context shift (from FP to MMX, or vice versa) occurs. More specifically, one could require that any time a coherency problem exists (i.e., when an executed FP or MMX instruction causes the contents of a register to be modified), the contents of all of the registers in the modified register file be copied to the other register file. This would adversely affect the number of clock cycles required to maintain coherency. Co-pending application Ser. No. 09/349,441 (IDT 1428) solved the problem of efficiently tracking coherency between separate FP and MMX register files in an x86 processor so as to minimize the number of times such copy operations are performed.
Still, copying the contents of all of the n registers in FP register file 220 to the corresponding n registers in MMX register file 222, or vice versa, any time such copy operations are required, is time consuming. In addition, although FP registers are 80 bits wide (64 bits for mantissa and 16 bits for exponent), MMX instructions deal only with the 64-bit mantissa portion. Providing a separate 8×80-bit MMX register file, for example, therefore incurs 8×16 bits of wasted space. However, if only an 8×64-bit MMX register file is separately provided, then when the contents of registers are copied from MMX register file 222 to FP register file 220, the FP registers corresponding to MMX registers that have been changed must have FFFF (hex) in the exponent, while those FP registers corresponding to MMX registers that have not been changed should not have the exponent portion altered. The setting of the exponent of changed FP registers is required by the Intel architecture. Accordingly, some tracking mechanism on the MMX side is necessary to determine which of the MMX registers were actually changed.
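The exponent rule for a single register can be sketched as follows. This is an illustrative model, assuming an 80-bit FP register is represented as one integer with the 16-bit exponent portion above the 64-bit mantissa; `pack_fp` and `merge_mmx_into_fp` are hypothetical helper names.

```python
MANTISSA_BITS = 64
MANTISSA_MASK = (1 << MANTISSA_BITS) - 1
MMX_EXPONENT = 0xFFFF  # value required in the exponent of a changed register


def pack_fp(exponent: int, mantissa: int) -> int:
    """Build an 80-bit FP register image from its two portions."""
    return ((exponent & 0xFFFF) << MANTISSA_BITS) | (mantissa & MANTISSA_MASK)


def merge_mmx_into_fp(fp_value: int, mmx_value: int, changed: bool) -> int:
    """Copy one 64-bit MMX register into its 80-bit FP counterpart.

    A changed register receives FFFF (hex) in the exponent portion;
    an unchanged register is left exactly as it was (assumption: its
    mantissa is already coherent, so nothing needs to be written).
    """
    if not changed:
        return fp_value
    return pack_fp(MMX_EXPONENT, mmx_value)


original = pack_fp(0x3FFF, 0x8000_0000_0000_0000)
updated = merge_mmx_into_fp(original, 0x1234, changed=True)
untouched = merge_mmx_into_fp(original, 0x8000_0000_0000_0000, changed=False)
```

Without a per-register record of which MMX registers changed, the `changed` argument could not be supplied, which is the motivation for the tracking mechanism mentioned above.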
Accordingly, there remains a need in the art for reducing the time needed to maintain coherency between separate FP and MMX register files, while ensuring that the architecturally required FFFF (hex) value is filled in the exponent portion of only those FP registers corresponding to modified MMX registers. The present invention fulfills this need.
An object of the invention is to reduce the time required to maintain coherency between the contents of separately provided MMX and FP register files.
Another object of the invention is to ensure that the contents of FP registers corresponding to modified MMX registers, when moved to the FP register file, have the architecturally required FFFF (hex) value in the exponent portion.
Another object of the invention is to reduce the chip space required to separately provide MMX and FP register files.
The present invention fulfills these objects, among others, by providing a write control unit that monitors writes to the MMX register file and a status register that is updated accordingly. The write control unit uses the contents of the status register to control transfers of register contents between the MMX register file and the FP register file, so that only those registers that have changed are copied.
According to another aspect of the invention, the write control unit ensures that architecturally required modifications to the exponent portion of FP registers corresponding to modified MMX registers are provided.
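The behavior summarized above can be modeled in software as a per-register "modified" bit mask. The following is a minimal sketch under stated assumptions, not the claimed hardware: the class name `MmxWriteTracker`, its methods, the eight-register size, and the one-bit-per-register status encoding are all hypothetical choices made for illustration.

```python
MASK64 = (1 << 64) - 1
MMX_EXPONENT = 0xFFFF  # architecturally required exponent for changed registers


class MmxWriteTracker:
    """Software model of a write control unit with a status register."""

    def __init__(self, n_regs: int = 8):
        self.mmx = [0] * n_regs        # 64-bit MMX registers
        self.fp = [(0, 0)] * n_regs    # FP registers as (exponent, mantissa)
        self.status = 0                # bit i set means MMX register i changed

    def write_mmx(self, index: int, value: int) -> None:
        """Record an MMX register write and mark it in the status register."""
        self.mmx[index] = value & MASK64
        self.status |= 1 << index

    def sync_to_fp(self) -> int:
        """Copy only the changed MMX registers to the FP file.

        Returns the number of registers copied, then clears the status.
        """
        copied = 0
        for i, value in enumerate(self.mmx):
            if self.status & (1 << i):
                self.fp[i] = (MMX_EXPONENT, value)
                copied += 1
        self.status = 0
        return copied


tracker = MmxWriteTracker()
tracker.fp[0] = (0x3FFF, 5)   # pre-existing FP contents, mirrored in MMX
tracker.mmx[0] = 5
tracker.write_mmx(2, 0xABCD)  # only register 2 is modified
copied = tracker.sync_to_fp()
```

In this model a single modified register results in a single copy, and the unmodified FP register keeps its original exponent.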