The present invention relates in general to data processing systems, and in particular, to vector arithmetic operations in a data processor.
Vector processing extensions to microprocessor architectures are being implemented to enhance microprocessor performance, particularly with respect to multimedia applications. One such vector processing extension is the Vector Multimedia Extension (VMX) to the PowerPC microprocessor architecture ("PowerPC" is a trademark of IBM Corporation). VMX is a single instruction multiple data (SIMD) architecture. In a SIMD architecture, a single instruction operates on multiple sets of operands. For example, an instruction having thirty-two bit operands may operate on the operands in bytewise fashion as four eight-bit operands, as two sixteen-bit half-word operands, or as a single word-length operand of thirty-two bits.
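The lane-wise behavior described above can be illustrated in software. The following is a minimal sketch, not VMX itself: the function name `simd_add_modulo` and the choice of addition as the operation are assumptions made for illustration. It treats a thirty-two bit value as four bytes, two half-words, or one word, and adds each lane independently with wrap-around.

```python
def simd_add_modulo(a, b, lane_bits):
    """Add two 32-bit values lane by lane, wrapping within each lane.

    Hypothetical helper for illustration; lane_bits is 8, 16, or 32.
    """
    assert lane_bits in (8, 16, 32)
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 32, lane_bits):
        lane_a = (a >> shift) & mask
        lane_b = (b >> shift) & mask
        # The carry out of a lane is discarded, so it never reaches the next lane.
        result |= ((lane_a + lane_b) & mask) << shift
    return result
```

The same pair of inputs gives different results depending on where the lane boundaries fall, because a carry stops at the top of its lane.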
Integer arithmetic instructions may have both modulo, that is, wrap around, and saturating modes. The mode determines the result of the operation implemented by the instruction when the result overflows the result field, either a byte-length field, a half-word-length field, or a word-length field, depending on the data type being operated on by the instruction. In modulo mode, a result that overflows or underflows is truncated to the length (byte, half-word, or word) of the operand, in accordance with the type of operand (signed or unsigned). In saturating mode, the result is clamped to its saturated value, the smallest or largest representable value in the field.
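The difference between the two modes can be shown with a short sketch. The helper names `lane_range`, `add_modulo`, and `add_saturate` are hypothetical, and addition is assumed as the representative operation; operands are given as ordinary integers within the range of the field.

```python
def lane_range(bits, signed):
    """Smallest and largest values representable in a field of `bits` bits."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def add_modulo(a, b, bits, signed):
    """Wrap-around add: keep only the low `bits` bits of the sum."""
    total = (a + b) & ((1 << bits) - 1)
    # Reinterpret the truncated pattern as a signed value when required.
    if signed and total >= 1 << (bits - 1):
        total -= 1 << bits
    return total


def add_saturate(a, b, bits, signed):
    """Saturating add: clamp the sum to the representable range."""
    lo, hi = lane_range(bits, signed)
    return max(lo, min(hi, a + b))
```

For example, adding the unsigned bytes 200 and 100 yields 44 in modulo mode and 255 in saturating mode; adding the signed bytes 100 and 100 yields -56 in modulo mode and 127 in saturating mode.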
To implement these instructions, three tasks need to be performed. An intermediate result is produced, using a single adder, in accordance with the specific instruction being executed. It is then determined if the intermediate result fits into the field corresponding to the length of the operand. Then, the appropriate result must be selected, either the intermediate result, the truncated overflow or underflow, if in modulo mode, or the saturation value, if in saturating mode.
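The three tasks can be sketched sequentially on raw bit patterns. This is an illustrative model under stated assumptions, not the circuitry of the invention: the function name `add_lane_bits` is hypothetical, the operation is assumed to be addition, and the fit check uses the common technique of extending each operand by one bit and inspecting the extra bit of the sum.

```python
def add_lane_bits(a, b, bits, signed, saturate):
    """Add two raw bit patterns of width `bits`; return the result pattern."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    ext_mask = (1 << (bits + 1)) - 1

    def extend(x):
        # Sign-extend (signed) or zero-extend (unsigned) to bits + 1.
        return x | (1 << bits) if signed and x & sign else x

    # Task 1: intermediate result from a single (bits + 1)-wide add.
    intermediate = (extend(a) + extend(b)) & ext_mask

    # Task 2: does the intermediate result fit in the field?
    top = (intermediate >> bits) & 1        # extra bit (true sign or carry)
    msb = (intermediate >> (bits - 1)) & 1  # top bit of the field itself
    fits = (top == msb) if signed else (top == 0)

    # Task 3: select the final result according to the mode.
    if fits or not saturate:
        return intermediate & mask          # in-range value, or modulo wrap
    if signed:
        # Negative overflow clamps to the most negative pattern,
        # positive overflow to the most positive pattern.
        return sign if top else mask >> 1
    return mask                             # unsigned add can only overflow upward
```

Because each task consumes the output of the one before it, the three steps in this model form a serial chain, which is the timing concern raised in the discussion of cycle time.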
The task of determining if an intermediate result fits into its field, and the task of selecting the appropriate value as a final result may be complicated and time consuming. In particular, these tasks are complicated in that the instructions support different data types, that is, instruction operands having different lengths, as described hereinabove, each of which may be either signed or unsigned. Consequently, it becomes difficult to meet cycle time requirements if the three tasks are performed sequentially.
Thus, there is a need in the art for apparatus and methods for implementing vector integer arithmetic instructions, which are sufficiently fast to meet cycle time requirements. In particular, there is a need in the art for performing, in parallel, the tasks of generating an intermediate result, determining if the intermediate result fits into a preselected field, and selecting a mode-dependent result value.
The aforementioned needs are addressed by the present invention. Accordingly, there is provided, in a first form, a saturation select apparatus. The apparatus includes a plurality of logic circuits, in which each circuit of said plurality is operable for outputting a corresponding bit of an n-bit result signal. Each logic circuit outputs said corresponding bit in response to first and second input signals, and first and second control signals, wherein said first control signal is asserted in response to a first saturated instruction, and said second control signal is asserted in response to a second saturated instruction.
There is also provided, in a second form, a data processing system. The system includes a central processing unit (CPU), and a memory operable for communicating instructions and operand data to said CPU. The CPU includes an execution unit operable for executing said instructions, in which the execution unit contains saturation select circuitry including a plurality of logic circuits. Each circuit of said plurality is operable for outputting a corresponding bit of an n-bit result signal, wherein each logic circuit outputs said corresponding bit in response to first and second input signals, and first and second control signals. The first and second input signals are generated in response to said operand data. The first control signal is asserted in response to a first saturated instruction, and said second control signal is asserted in response to a second saturated instruction.
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention.