In some cases, processors designed to receive a single data set packed with a plurality of data elements (e.g., SIMD (Single Instruction Multiple Data) processors) need to permute the data elements contained in that data set. This requirement is met by a shuffle instruction, which permutes the data elements.
The following explains the operation of the PSHUFB instruction described in Non-Patent Literature 1, as an example of a shuffle instruction.
As shown in FIG. 1, the PSHUFB instruction produces output packed data 103 from an input shuffle pattern 101 and input packed data 102.
The shuffle pattern 101 is composed of eight indices, and each index has a width of 8 bits. Each of the input packed data 102 and the output packed data 103 includes eight data elements, and each data element has a width of 8 bits.
The data elements contained in the input packed data 102 are given location numbers 0, 1, 2, . . . , 7 from right to left. Each index of the shuffle pattern 101 holds one of these location numbers, and the position of that index identifies the destination, in the output packed data 103, of the data element having that location number.
For example, when this PSHUFB instruction is executed, the leftmost data element “A” of the input packed data 102, which has the location number “7”, is copied to the position corresponding to the index “7” in the shuffle pattern 101, which is the second rightmost index of the shuffle pattern 101. Likewise, the second rightmost data element “G” of the input packed data 102, which has the location number “1”, is copied to the positions corresponding to the indices “1” in the shuffle pattern 101, which are the third and the fourth indices from the left of the shuffle pattern 101.
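The behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the instruction's definition. The function name `shuffle_bytes` is hypothetical, and only the values stated in the text are taken from FIG. 1 (“A” at location 7, “G” at location 1, index “7” at the second rightmost position, and index “1” at the third and fourth positions from the left). All other element and index values are assumed for the sake of a complete example. The sketch also models only the selection behavior described here and ignores any other behavior of the actual instruction.

```python
def shuffle_bytes(pattern, data):
    """Model of the shuffle described above: out[i] = data[pattern[i]].

    Both lists are stored in location order, so element 0 of each list is
    the RIGHTMOST element as drawn in the figure.
    """
    if len(pattern) != len(data):
        raise ValueError("pattern and data must have the same number of elements")
    # Each index in the pattern selects the source element that is copied
    # to the output position occupied by that index.
    return [data[index] for index in pattern]


# Input packed data in location order 0..7 (rightmost first).
# data[7] == "A" and data[1] == "G" come from the text; the rest are assumed.
data = ["H", "G", "F", "E", "D", "C", "B", "A"]

# Shuffle pattern in location order 0..7 (rightmost first).
# pattern[1] == 7 (second rightmost) and pattern[4] == pattern[5] == 1
# (fourth and third from the left) come from the text; the rest are assumed.
pattern = [0, 7, 2, 3, 1, 1, 6, 5]

out = shuffle_bytes(pattern, data)

# Print left to right, as the figure would be read.
print("output, left to right:", " ".join(reversed(out)))
```

With these assumed values, “A” lands at the second rightmost output position and “G” appears at both the third and the fourth positions from the left, which matches the two cases walked through above. Because one source element can be selected by several indices, the operation is a copy rather than a strict one-to-one move.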