The present invention relates in general to data processing systems, and in particular, to vector compare and extremum (maximum/minimum) operations in a data processor.
Vector processing extensions to microprocessor architectures are being implemented to enhance microprocessor performance, particularly with respect to multimedia applications. One such vector processing extension is the Vector Multimedia Extension (VMX) to the Power PC microprocessor architecture. ("Power PC" is a trademark of IBM Corporation.) VMX is a single instruction multiple data (SIMD) architecture. In a SIMD architecture, a single instruction operates on multiple sets of operands. For example, an instruction having thirty-two bit operands may operate on the operands in bytewise fashion as four eight-bit operands, as two sixteen-bit halfword operands, or as a single word-length operand of thirty-two bits.
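By way of illustration only, the subdivision of a thirty-two bit operand into subvector operands may be sketched as follows. The function name `split_lanes` and the ordering of lanes (most significant first) are assumptions made for this sketch and are not taken from the VMX definition.

```python
def split_lanes(word, lane_bits, total_bits=32):
    """Split an unsigned operand into equal-width lanes.

    Hypothetical helper: lanes are returned most significant first,
    which is an assumption of this sketch.
    """
    if total_bits % lane_bits != 0:
        raise ValueError("lane width must divide the operand width")
    mask = (1 << lane_bits) - 1
    count = total_bits // lane_bits
    # Shift each lane down to bit 0 and mask off the higher lanes.
    return [(word >> (lane_bits * (count - 1 - i))) & mask
            for i in range(count)]


if __name__ == "__main__":
    word = 0x11223344
    for width in (8, 16, 32):
        print(width, [hex(lane) for lane in split_lanes(word, width)])
```

The same thirty-two bit value thus yields four, two, or one subvector operand, depending only on the lane width chosen by the instruction.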
In vector compare operations, the "true" and "false" results are equal to the largest and smallest unsigned numbers representable in the subvector operand width, that is, all ones and all zeros, respectively. One or the other value is output depending on the order relationship between the instruction source operands. In a maximum/minimum operation, one of the source operands is output, depending on the relative size of the operands.
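The behavior just described may be sketched, purely as a behavioral model, in the following manner. The function names, the choice of a greater-than comparison, and the representation of a vector as a list of unsigned lane values are assumptions of this sketch; the model says nothing about how the hardware arrives at the result.

```python
def to_signed(value, bits):
    """Interpret an unsigned lane value as two's complement."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def _key(bits, signed):
    # Ordering used for the comparison: signed or unsigned interpretation.
    return (lambda v: to_signed(v, bits)) if signed else (lambda v: v)


def vector_compare_gt(a, b, bits, signed=False):
    """Per lane: all ones ("true") if a > b, else all zeros ("false")."""
    key = _key(bits, signed)
    true_value = (1 << bits) - 1   # largest unsigned number in the lane
    false_value = 0                # smallest unsigned number in the lane
    return [true_value if key(x) > key(y) else false_value
            for x, y in zip(a, b)]


def vector_max(a, b, bits, signed=False):
    """Per lane: output whichever source operand is larger."""
    key = _key(bits, signed)
    return [x if key(x) > key(y) else y for x, y in zip(a, b)]


def vector_min(a, b, bits, signed=False):
    """Per lane: output whichever source operand is smaller."""
    key = _key(bits, signed)
    return [x if key(x) < key(y) else y for x, y in zip(a, b)]


if __name__ == "__main__":
    a, b = [0x80, 0x01], [0x7F, 0x02]
    print(vector_compare_gt(a, b, 8), vector_compare_gt(a, b, 8, signed=True))
    print(vector_max(a, b, 8), vector_max(a, b, 8, signed=True))
```

Note that the lane value 0x80 is the larger operand when treated as unsigned and the smaller when treated as signed, which is why the data type must be taken into account when the result is selected.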
To implement these instructions, three tasks need to be performed. An intermediate result is produced, using a single adder, which may be embodied in an arithmetic unit, in accordance with the specific instruction being executed. Then, the appropriate result must be selected, either the "true" or "false" value or the appropriate source operand.
The task of selecting the appropriate value as a final result may be complicated and time consuming. In particular, these tasks are complicated in that the instructions support different data types, that is, subvector operands having different lengths, as described hereinabove, each of which may be either signed or unsigned. Consequently, it becomes difficult to meet cycle time requirements if the three tasks are performed sequentially.
Thus, there is a need in the art for apparatus and methods for implementing vector compare and vector maximum and vector minimum instructions, that are sufficiently fast to meet cycle time requirements. In particular, there is a need in the art for performing, in parallel, the tasks of generating an intermediate result and selecting a mode dependent result value.
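One way of picturing the relationship between the adder and the selection is sketched below. It is a minimal illustration under stated assumptions and is not the circuit of the invention: the order relationship is taken from the carry-out of a two's complement subtraction, signed operands are handled by inverting the sign bit of each operand before the subtraction, and the candidate outputs are prepared independently of the adder result. The names `subtract_carry_out` and `select_result`, and the operation labels "cmpgt", "max" and "min", are hypothetical.

```python
def subtract_carry_out(x, y, bits):
    """Carry-out of x + ~y + 1 at the given width.

    The carry-out is 1 exactly when x >= y as unsigned numbers.
    """
    mask = (1 << bits) - 1
    total = x + ((~y) & mask) + 1
    return (total >> bits) & 1


def select_result(a, b, bits, op, signed=False):
    """Select the output for one lane from precomputed candidates."""
    mask = (1 << bits) - 1
    if signed:
        # Assumption of this sketch: flipping the sign bits maps signed
        # order onto unsigned order, so one adder serves both data types.
        bias = 1 << (bits - 1)
        xa, xb = a ^ bias, b ^ bias
    else:
        xa, xb = a, b

    # Candidate pairs (output if a > b, output otherwise). They depend
    # only on the instruction and the operands, not on the adder result.
    candidates = {
        "cmpgt": (mask, 0),   # "true" / "false" values
        "max": (a, b),
        "min": (b, a),
    }
    when_greater, otherwise = candidates[op]

    # No carry out of (b - a) means b < a, that is, a > b.
    a_greater = subtract_carry_out(xb, xa, bits) == 0
    return when_greater if a_greater else otherwise


if __name__ == "__main__":
    for op in ("cmpgt", "max", "min"):
        print(op,
              hex(select_result(0x80, 0x7F, 8, op)),
              hex(select_result(0x80, 0x7F, 8, op, signed=True)))
```

Because the candidate pair is available before the carry-out is known, the carry-out in this sketch serves only as the final select signal, which reflects the stated aim of overlapping result generation with result selection.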
The aforementioned needs are addressed by the present invention. Accordingly, there is provided, in a first form, a compare and maximum/minimum apparatus. The apparatus includes a compare generation unit having first and second operand inputs operable for receiving first and second vector operands, the compare generation unit operable for receiving an instruction signal, and outputting one or more second signals in response to the first and second operands and the instruction signal. Also included is selection circuitry operable for receiving the one or more second signals, at least one operand signal, and one or more comparison value signals, wherein the selection circuitry selects one of the operand signals and the one or more comparison value signals in response to the one or more second signals.
There is also provided, in a second form, a method of compare and maximum/minimum generation. The method includes generating a set of first signals in response to an executing instruction, and generating a set of second signals in response to a carry-out signal and the set of first signals. Also included is the step of selecting for outputting one of a set of output signals including one or more operand signals and a predetermined set of comparison value signals in response to the set of second signals, wherein the first and second carry-out signals are generated in response to a pair of subvector operands, and the result signal is generated in response to the executing instruction.
Additionally, there is provided, in a third form, a data processing system which includes a central processing unit (CPU), and a memory operable for communicating instructions and operand data to the CPU, in which the CPU includes instruction decode circuitry and compare and maximum/minimum circuitry coupled to the memory. The decode circuitry is operable for receiving the instructions, and the compare and maximum/minimum circuitry is operable for receiving the operand data from the memory, and operable for selecting one of a plurality of output signals, wherein the plurality of output signals includes the operand data and a preselected set of "true" and "false" signals, the compare and maximum/minimum circuitry selecting the one of the plurality in response to the operand data and an instruction signal from the decode circuitry.
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention.