Modern computer systems use a wide variety of architectures. One particular computer architecture that has proved useful is the single instruction multiple data (SIMD) architecture, which has found application in general-purpose computing as well as in specific applications, such as media and graphics processing.
One advantage of SIMD architectures is the capacity to process multiple data streams in parallel while reducing the total number of instructions. For example, one particular instruction type used by SIMD processors is a permutation, or “deal”, instruction, which is typically used for re-ordering bytes or words of data from one sequence into a second sequence. For instance, a graphics application may require that a stream of data having four data objects arranged in a first sequence {A, B, C, D} be rearranged to the order {B, A, C, D} and, optionally, expanded into four separate double-sized data objects {0x00, 0x00, 0x00, B}, {0x00, 0x00, 0x00, A}, {0x00, 0x00, 0x00, C}, {0x00, 0x00, 0x00, D}. Examples of data manipulations involving expansion of the input data objects include sign expansion and zero expansion operations.
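By way of illustration only, the re-ordering and expansion described above may be sketched in software as follows. The function names permute, zero_expand and sign_expand, the index-list form of the control information, and the byte values chosen for A to D are assumptions made for this sketch and do not describe any particular processor instruction. The expanded width of four bytes simply follows the example sequences as written above.

```python
def permute(data, order):
    """Re-order data so that output position i holds data[order[i]]."""
    return [data[i] for i in order]


def zero_expand(data, width=4):
    """Widen each one-byte object to `width` bytes by prepending zero bytes."""
    return [[0x00] * (width - 1) + [b] for b in data]


def sign_expand(data, width=4):
    """Widen each one-byte object, filling the new bytes with copies of its sign bit."""
    return [[0xFF if b & 0x80 else 0x00] * (width - 1) + [b] for b in data]


# Hypothetical one-byte values standing in for the data objects A, B, C, D.
A, B, C, D = 0x0A, 0x0B, 0x0C, 0x0D

# {A, B, C, D} -> {B, A, C, D}
reordered = permute([A, B, C, D], [1, 0, 2, 3])

# {B, A, C, D} -> {0x00, 0x00, 0x00, B}, {0x00, 0x00, 0x00, A}, ...
expanded = zero_expand(reordered)
```

In this sketch the re-ordering and the expansion are two separate passes over the data, each with its own control information, which is the kind of multi-step manipulation that a programmer would otherwise have to configure by hand.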
As processing power increases, the number and size of data objects in the input data sequences also increase. Known methods for manipulating sequences of data objects are unnecessarily complex, leading to more processor cycles, delays, and an unnecessary burden on programmers, who are required to find feasible ways of configuring the many types of manipulations required.
Accordingly, preferred embodiments of this invention seek to provide a new technology that uses a permuter to perform expansion instructions. In particular, preferred embodiments use a standard permuter in a manner that reduces the number of operations required to achieve certain data manipulations and lessens the burden on the programmer of generating control information for those manipulations.