Data processing, particularly of streaming data and in parallel processing, has been limited by existing processor instructions: they are slow relative to the rate of the streaming data, and needed functions require a large number of instructions and a large amount of code to execute.
Computer processors that support packed SIMD (Single Instruction Multiple Data) operations, i.e. performing one operation on multiple sets of data in parallel, are designed, for example, to speed up standard digital filter operations in digital signal processing (DSP). The speed-up is achieved by exploiting some of the symmetry inherent in digital filter equations. These operations are used in a broad range of signal processing algorithms, such as the Fast Fourier Transform (FFT), FIR/IIR filters and trigonometry, as well as in statistical analysis algorithms. This class of operations is specifically for data manipulation within a packed register.
As a specific example, a plurality of small, equal-length numbers are packed into a 64-bit double word as elements of a vector, and the elements are then extracted and operated upon independently in parallel. The SIMD operation requires that these data elements be aligned within the double word. However, data as received is frequently not aligned for the SIMD operation, and there have been many attempts to align the input data efficiently with respect to such factors as processing time, power, hardware complexity and software complexity.
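The packing described above can be illustrated with a minimal sketch. It assumes, purely for illustration, four unsigned 16-bit elements per 64-bit double word with element 0 in the least significant bits; the names pack, unpack and packed_add are hypothetical and do not come from any of the cited references.

```python
LANES = 4                  # assumption: four elements per 64-bit double word
WIDTH = 16                 # assumption: 16 bits per element
MASK = (1 << WIDTH) - 1    # selects a single element


def pack(elements):
    """Pack LANES unsigned WIDTH-bit values into one integer, element 0 lowest."""
    if len(elements) != LANES:
        raise ValueError("expected %d elements" % LANES)
    word = 0
    for i, e in enumerate(elements):
        word |= (e & MASK) << (i * WIDTH)
    return word


def unpack(word):
    """Extract the LANES elements of a packed double word, element 0 first."""
    return [(word >> (i * WIDTH)) & MASK for i in range(LANES)]


def packed_add(a, b):
    """Add element by element; each sum wraps within its own lane (no carry across lanes)."""
    return pack([(x + y) & MASK for x, y in zip(unpack(a), unpack(b))])
```

For example, adding the packed vectors [1, 2, 0xFFFF, 4] and [10, 20, 1, 40] gives [11, 22, 0, 44]: the third element wraps around without disturbing its neighbours, which is only possible because every element sits at a known, aligned position within the double word.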
There is a need for efficient data alignment and packing.
U.S. Pat. No. 5,898,601 issued Apr. 27, 1999 to Gray et al, with respect to SIMD processing, compresses bit formats provided in a first packed data sequence using five instructions that: generate a second packed data sequence by copying the first packed data sequence; mask a portion of the first packed data sequence; shift data elements of the first packed data sequence independently by separate shift counts; mask a portion of the second packed data sequence; and join the second and first packed data sequences.
U.S. Pat. No. 5,922,066, issued July 1999 to Seongrai Cho et al, discloses a hardware data aligner of a SIMD processor, wherein the aligner shift operation changes the positions of data elements in a data vector to align the data to a base address that is an even multiple of the number of bytes in a data vector. An example use is the alignment of four bytes of input data to 32-bit boundaries.
Both U.S. Pat. No. 5,933,650 issued Aug. 3, 1999 to van Hook et al and U.S. Pat. No. 6,266,758B1 issued Jul. 24, 2001 to van Hook et al relate to data alignment and SIMD processing. A first data vector is loaded from a memory into a first register and a second vector is loaded from the memory into a second register. A starting byte in the first register is determined by being specified as a constant in an alignment instruction and the starting byte specifies the first byte of an aligned vector. A subset of elements is selected from the first register and the second register; a first width vector from the first register and the second register is extracted beginning from the first bit in the first byte of the first register continuing through the bits in the second register. The elements from the subset are then replicated into elements in a third register in a particular order suitable for subsequent SIMD processing. FIG. 5 shows an example of extracting an aligned vector from two vectors.
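The extraction step of this kind of two-register alignment can be sketched as follows. This is an illustration only, not the patented implementation: it assumes 8-byte registers modelled as Python bytes objects, the function name extract_aligned is hypothetical, and the subsequent replication of elements into a third register in a particular order is not modelled.

```python
VEC_BYTES = 8  # assumption: each register holds an 8-byte vector


def extract_aligned(first, second, start):
    """Return VEC_BYTES bytes beginning at byte `start` of `first`
    and continuing into `second`."""
    if len(first) != VEC_BYTES or len(second) != VEC_BYTES:
        raise ValueError("both registers must hold %d bytes" % VEC_BYTES)
    if not 0 <= start < VEC_BYTES:
        raise ValueError("start byte must lie within the first register")
    combined = first + second          # view the two registers as one 16-byte span
    return combined[start:start + VEC_BYTES]
```

With a first register holding bytes 0 to 7, a second register holding bytes 8 to 15, and a starting byte of 3, the extracted vector holds bytes 3 to 10.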
U.S. Pat. No. 6,094,637 issued Jul. 25, 2000 to Hong reverses the order of data elements in a vector register for SIMD processing.
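Element-order reversal of this kind can be sketched in software as below. The sketch assumes four 16-bit elements in a 64-bit word; the function name reverse_elements and its default parameters are hypothetical and say nothing about how the cited patent implements the operation in hardware.

```python
def reverse_elements(word, lanes=4, width=16):
    """Reverse the order of `lanes` elements, each `width` bits wide, in a packed word."""
    mask = (1 << width) - 1
    out = 0
    for i in range(lanes):
        elem = (word >> (i * width)) & mask        # element at position i
        out |= elem << ((lanes - 1 - i) * width)   # moves to the mirrored position
    return out
```

Under these assumptions, reversing 0x0004000300020001 yields 0x0001000200030004.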
U.S. Pat. No. 6,175,892 issued Jan. 16, 2001 to Sazzad et al specifies a row and/or column of data with a SIMD instruction operand, particularly for a two-dimensional register array.
U.S. Pat. No. 6,460,127 B1 issued Oct. 1, 2002 processes a plurality of samples of an incoming signal in parallel, by including operations such as those in the listing spanning columns 26 and 27.
U.S. Pat. No. 6,243,803 discloses a method and apparatus for computing packed absolute differences with a plurality of sign bits using SIMD add circuitry.
Example shift instructions for SIMD processing include: PSLLW, PSLLD, PSLLQ (Shift Packed Data Left Logical); PSRLW, PSRLD, PSRLQ (Shift Packed Data Right Logical); and PSRAW, PSRAD (Shift Packed Data Right Arithmetic). These instructions shift the destination elements by the number of bits specified in the count operand.
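The behavior of the word-sized variants can be modelled in software as a rough sketch. The functions below operate on four 16-bit elements held in a 64-bit integer and are named after the instructions only for readability; they are an illustrative model, not the instructions themselves. The handling of counts of 16 or more (zero for logical shifts, sign fill for the arithmetic shift) follows the documented behavior of these instructions.

```python
WIDTH = 16                 # word-sized elements, as in PSLLW / PSRLW / PSRAW
LANES = 4                  # four words in a 64-bit operand
MASK = (1 << WIDTH) - 1
SIGN = 1 << (WIDTH - 1)


def _split(word):
    return [(word >> (i * WIDTH)) & MASK for i in range(LANES)]


def _join(elems):
    word = 0
    for i, e in enumerate(elems):
        word |= (e & MASK) << (i * WIDTH)
    return word


def psllw(word, count):
    """Shift each 16-bit element left, filling with zeros."""
    if count >= WIDTH:
        return 0
    return _join([(e << count) & MASK for e in _split(word)])


def psrlw(word, count):
    """Shift each 16-bit element right, filling with zeros."""
    if count >= WIDTH:
        return 0
    return _join([e >> count for e in _split(word)])


def psraw(word, count):
    """Shift each 16-bit element right, filling with copies of its sign bit."""
    count = min(count, WIDTH - 1)
    out = []
    for e in _split(word):
        signed = e - (1 << WIDTH) if e & SIGN else e   # reinterpret as signed
        out.append((signed >> count) & MASK)
    return _join(out)
```

Note that one count applies to every element of the operand, which is why these shifts alone do not suffice when different elements of an incoming stream must be moved by different amounts.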
Therefore, there is a need for an improved instruction for data manipulation, particularly one that prepares data for subsequent parallel processing at a higher speed than existing instructions and with less complexity in executing needed functions, including in particular aligning and packing data elements from a stream of data.