1. Field of the Invention
This invention relates to processors, and more specifically, to processors supporting instructions for differently sized data types, such as processors supporting both single instruction single data (SISD) instructions and single instruction multiple data (SIMD) instructions.
2. Description of the Related Art
Since the introduction of the 8086 microprocessor, several successive generations of the X86 instruction set architecture, or more briefly, the X86 architecture, have been developed, with further developments occurring on a continuous basis. With each new generation of the X86 architecture, microprocessor manufacturers have attempted to maintain backward compatibility in order to allow software developed for previous generations of the architecture to run on the most current generation. Maintaining this compatibility has forced a number of compromises in successive generations of the architecture. When expanding an existing processor architecture, architects must often face several difficult choices, balancing backward compatibility against the upgrades desired to increase the performance of the next generation.
Expanding an existing processor architecture may include the implementation of many architectural innovations. One method of expanding the architecture may be the addition of new instructions to the instruction set. New instructions may often require specific new types of operands. Such operands may be of various data widths, and may be compatible with data types (e.g. integer, floating point, vector, etc.) that may be operated on by the processor's execution unit(s).
Recent instruction-set architectures (ISAs), and extensions thereof, have included instructions whose operands may include vector data types. These types of instructions are often referred to as SIMD (single instruction, multiple data) instructions. Examples of instruction-set architectures employing SIMD instructions are MDMX™, VIS™, MMX™, 3DNow!™ and AltiVec™. SIMD instructions are instructions which may have operands comprising at least two sub-operands, wherein each of the sub-operands is an independent value. For example, a SIMD operand may be a 128-bit value comprising four 32-bit values. The SIMD instruction may define an operation to be performed concurrently on the sub-operands. The operation may be performed on each sub-operand independently of the other sub-operands. Typically, carry values generated by adding the sub-operands are not carried from one sub-operand to the next. An ADD instruction on 128-bit SIMD operands, each comprising four 32-bit sub-operands, may result in four 32-bit addition operations. In this example, a single SIMD instruction may accomplish that which would otherwise require four different SISD instructions. Thus, supporting SIMD instructions may allow for increased code density.
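The lane-wise behavior described above may be illustrated with a minimal software sketch. It models a 128-bit operand as a Python integer holding four 32-bit sub-operands and is not drawn from any particular instruction set; the names `pack`, `unpack` and `simd_add` are hypothetical and chosen only for illustration. The sketch shows that a carry out of one sub-operand is discarded rather than propagated into the next, in contrast to an ordinary 128-bit addition.

```python
LANE_BITS = 32                      # width of each sub-operand (assumed for this sketch)
LANES = 4                           # four sub-operands per 128-bit operand
LANE_MASK = (1 << LANE_BITS) - 1
FULL_MASK = (1 << (LANE_BITS * LANES)) - 1


def pack(lanes):
    """Pack four 32-bit values (lane 0 least significant) into one 128-bit integer."""
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & LANE_MASK) << (i * LANE_BITS)
    return value


def unpack(value):
    """Split a 128-bit integer into its four 32-bit sub-operands."""
    return [(value >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def simd_add(a, b):
    """Add each pair of sub-operands independently; carries do not cross lanes."""
    return pack([(x + y) & LANE_MASK for x, y in zip(unpack(a), unpack(b))])


a = pack([0xFFFFFFFF, 1, 2, 3])
b = pack([1, 10, 20, 30])

lane_wise = unpack(simd_add(a, b))          # lane 0 wraps to 0; lane 1 is unaffected
full_width = unpack((a + b) & FULL_MASK)    # ordinary add: carry leaks into lane 1

print(lane_wise)    # [0, 11, 22, 33]
print(full_width)   # [0, 12, 22, 33]
```

In this sketch, the single call to `simd_add` stands in for the one SIMD instruction, while the four independent 32-bit additions inside it correspond to the four SISD instructions that would otherwise be required.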
Potential performance gains may be achieved by supporting SIMD instructions in a processor. Performance gains created by the use of SIMD instructions largely result from the increased execution throughput provided by the processor's arithmetic functional units that produce multiple output data (e.g. vector output data types) in the same amount of time normally required to produce a single output datum. The most straightforward way to achieve these performance benefits when implementing a SIMD instruction set in a processor is to design the processor's functional units to be able to atomically manipulate the base data type used in these instructions. Thus, in an example in which SIMD instructions operate on 128-bit operands, the processor's functional units would be designed to operate on 128-bit wide data types.
For example, a processor supporting both 64-bit SISD instructions and 128-bit SIMD instructions may schedule instructions to a 128-bit functional unit. The functional unit would thus be capable of manipulating either single 64-bit operands for SISD instructions or 128-bit operands (two 64-bit sub-operands) for SIMD instructions. However, this implementation leads to utilization inefficiencies. During the times in which the functional unit is operating on 64-bit data types, only half of the functional unit is being utilized. Only when the functional unit is operating on 128-bit data types is the functional unit fully utilized.
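The utilization inefficiency may be quantified with a simple back-of-the-envelope model. The sketch below assumes, purely for illustration, that the 128-bit functional unit executes one instruction per cycle and that utilization is the fraction of data path bits carrying operand data; the function name `datapath_utilization` and the sample instruction mixes are hypothetical.

```python
UNIT_WIDTH_BITS = 128   # width of the functional unit in the example above


def datapath_utilization(operand_widths):
    """Return the fraction of the data path used across a sequence of instructions.

    operand_widths lists the operand width in bits of each instruction issued
    to the unit, one instruction per cycle (an assumption of this sketch).
    """
    if not operand_widths:
        raise ValueError("at least one instruction is required")
    if any(w <= 0 or w > UNIT_WIDTH_BITS for w in operand_widths):
        raise ValueError("operand width must be between 1 and the unit width")
    used_bits = sum(operand_widths)
    available_bits = UNIT_WIDTH_BITS * len(operand_widths)
    return used_bits / available_bits


sisd_only = datapath_utilization([64, 64, 64, 64])        # 64-bit SISD only
simd_only = datapath_utilization([128, 128, 128, 128])    # 128-bit SIMD only
mixed = datapath_utilization([64, 64, 64, 128])           # three SISD, one SIMD

print(sisd_only, simd_only, mixed)   # 0.5 1.0 0.625
```

Under these assumptions, any instruction stream containing 64-bit SISD instructions leaves part of the 128-bit unit idle, and a purely SISD stream uses only half of it.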
Thus, in a superscalar processor, wider data types (e.g. for SIMD instructions) may be supported by widening the data path of the functional units. Widening the data path may require additional logic, thereby consuming a significant amount of area on the processor die. The additional area consumed by widening the data path may result in the need for significant changes to the layout of the other units on the processor die. Furthermore, when narrower data types are processed (e.g. for SISD instructions), the functional units are under-utilized.