1. Field of the Invention
In particular, the present invention describes an apparatus for performing arithmetic operations using a single control signal to manipulate multiple data elements. The present invention allows execution of shift operations on packed data types.
2. Description of Related Art
Today, most personal computer systems operate with one instruction to produce one result. Performance increases are achieved by increasing the execution speed of instructions and the complexity of the processor instruction set, an approach known as Complex Instruction Set Computer (CISC). Processors such as the Intel 80286™ microprocessor, available from Intel Corp. of Santa Clara, Calif., belong to the CISC category of processor.
Previous computer system architecture has been optimized to take advantage of the CISC concept. Such systems typically have data buses thirty-two bits wide. However, applications targeted at computer supported cooperation (CSC, the integration of teleconferencing with mixed media data manipulation), 2D/3D graphics, image processing, video compression/decompression, recognition algorithms and audio manipulation increase the need for improved performance. Increasing the execution speed and complexity of instructions is only one solution.
One common aspect of these applications is that they often manipulate large amounts of data where only a few bits are important. That is, the relevant data is represented in far fewer bits than the width of the data bus. For example, processors execute many operations on eight bit and sixteen bit data (e.g., pixel color components in a video image) but have much wider data buses and registers. Thus, a processor having a thirty-two bit data bus and registers, and executing one of these algorithms, can waste up to seventy-five percent of its data processing, carrying and storage capacity because only the first eight bits of data are important.
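The seventy-five percent figure follows directly from the ratio of unused bits to register width, and packing several narrow elements into one word is what recovers that capacity. The following is a minimal sketch of that arithmetic; the function names `unused_fraction` and `pack` and the element ordering (first element in the lowest-order bits) are illustrative assumptions, not taken from this specification.

```python
def unused_fraction(data_bits, register_bits):
    """Fraction of a register left idle when it holds one narrow value."""
    return (register_bits - data_bits) / register_bits


def pack(elements, elem_bits):
    """Pack narrow values into one word, first element in the lowest bits.

    The ordering is an assumption made for this illustration only.
    """
    elem_mask = (1 << elem_bits) - 1
    word = 0
    for index, value in enumerate(elements):
        word |= (value & elem_mask) << (index * elem_bits)
    return word


# One eight bit value in a thirty-two bit register leaves 24 of 32 bits idle.
eight_bit_waste = unused_fraction(8, 32)      # 0.75
sixteen_bit_waste = unused_fraction(16, 32)   # 0.5

# Four eight bit pixel components fill the same register completely.
packed_pixels = pack([0x11, 0x22, 0x33, 0x44], 8)
```

With four eight bit elements per thirty-two bit word, a single operation on the packed word touches four data elements instead of one.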
As such, what is desired is a processor that increases performance by more efficiently using the difference between the number of bits required to represent the data to be manipulated and the actual data carrying and storage capacity of the processor.
A microprocessor including an apparatus for shifting packed data. The apparatus includes a first shifter configured to perform a shift operation on a first packed data having multiple packed data elements by a shift count to produce a second packed data. The apparatus also includes a correction circuit which generates a third set of bits, and multiple muxes, each of which receives a corresponding bit of the second packed data, a corresponding replacement bit, and a select input from a corresponding bit of the third set of bits, to generate a corresponding bit of a shifted packed result.
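The arrangement of shifter, correction circuit and muxes can be modeled in software as follows. This is a hedged sketch of one plausible reading, not the claimed circuit: it assumes a sixty-four bit packed word holding sixteen bit elements, a logical right shift, and replacement bits of zero. The names `correction_mask` and `packed_shift_right_logical` are hypothetical.

```python
def correction_mask(count, elem_bits=16, total_bits=64):
    """Model of the correction circuit (assumed behavior).

    Returns the "third set of bits": a 1 marks each bit position that,
    after shifting the whole word right, holds a bit that leaked in from
    the neighboring element and must be replaced.
    """
    count = min(count, elem_bits)
    # The top `count` bits of every element are invalid after the shift.
    per_element = ((1 << count) - 1) << (elem_bits - count)
    mask = 0
    for base in range(0, total_bits, elem_bits):
        mask |= per_element << base
    return mask


def packed_shift_right_logical(packed, count, elem_bits=16, total_bits=64):
    """Shift every packed element right by `count`, filling with zeros."""
    word_mask = (1 << total_bits) - 1
    # First shifter: shifts the entire word, ignoring element boundaries.
    shifted = (packed & word_mask) >> count
    # Select inputs for the per-bit muxes.
    select = correction_mask(count, elem_bits, total_bits)
    # Replacement bits: all zero for a logical shift (assumption).
    replacement = 0
    # Per-bit mux: take the replacement bit where select is 1,
    # otherwise take the shifted bit.
    return (shifted & ~select & word_mask) | (replacement & select)


result = packed_shift_right_logical(0x8001_8002_8003_8004, 1)
```

In this model, a single wide shifter does the work for all elements, and the mask-and-mux stage removes the bits that crossed element boundaries. An arithmetic shift could be modeled the same way by supplying each element's sign bit as the replacement bit instead of zero.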