Technical Field
Embodiments described herein generally relate to processors and, in particular, to processors that are able to process packed data.
Background Information
Many processors have Single Instruction, Multiple Data (SIMD) architectures. In SIMD architectures, instead of a scalar instruction operating on only one data element or one pair of data elements, a packed data instruction, vector instruction, or SIMD instruction may operate on multiple data elements or multiple pairs of data elements concurrently (e.g., in parallel). The processor may have parallel execution hardware that is responsive to the packed data instruction to perform the multiple operations on the multiple data elements concurrently.
In SIMD architectures, multiple data elements may be packed within one register or memory location as packed data or vector data. In packed data, the bits of the register or other storage location may be logically divided into a sequence of multiple data elements. Each of the data elements may represent an individual piece of data that is stored in the register or other storage location along with other data elements, which commonly have the same size. For example, a 128-bit wide register may have two 64-bit wide packed data elements, four 32-bit wide packed data elements, eight 16-bit wide packed data elements, or sixteen 8-bit wide packed data elements. Each of the packed data elements commonly represents a separate individual piece of data (e.g., a color of a pixel, a graphical coordinate, etc.) that may be operated upon separately from the others.
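As a minimal software sketch of this logical division, and not a description of any particular processor, a 128-bit register may be modeled as a Python integer and split into equally sized data elements. The helper names `unpack` and `pack`, the example value `REG`, and the choice to number elements starting from the lowest-order bits are assumptions made only for illustration.

```python
def unpack(value, total_bits, elem_bits):
    """Split a packed integer into data elements, lowest-order element first."""
    if total_bits % elem_bits != 0:
        raise ValueError("element width must evenly divide the register width")
    mask = (1 << elem_bits) - 1
    count = total_bits // elem_bits
    # Shift each element down to bit 0 and mask off its neighbors.
    return [(value >> (i * elem_bits)) & mask for i in range(count)]


def pack(elements, elem_bits):
    """Inverse of unpack: place each element at its bit position."""
    mask = (1 << elem_bits) - 1
    value = 0
    for i, element in enumerate(elements):
        value |= (element & mask) << (i * elem_bits)
    return value


# Hypothetical 128-bit register contents (32 hexadecimal digits).
REG = 0x000F000E000D000C000B000A00090008

# The same 128 bits viewed as 64-, 32-, 16-, or 8-bit data elements.
lane_counts = {width: len(unpack(REG, 128, width)) for width in (64, 32, 16, 8)}
```

In this sketch the same bits yield two, four, eight, or sixteen data elements depending only on the element width chosen, which mirrors the 128-bit example above.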
Representatively, one type of packed data instruction, vector instruction, or SIMD instruction (e.g., a packed add instruction) may specify that a single packed data operation (e.g., addition) be performed on all corresponding pairs of data elements from two source packed data operands in a vertical fashion to generate a destination or result packed data. The source packed data operands may be of the same size, may contain data elements of the same width, and thus may each contain the same number of data elements. The source data elements in the same bit positions in the two source packed data operands may represent pairs of corresponding data elements. The packed data operation may be performed separately or substantially independently on each of these pairs of corresponding source data elements to generate a matching number of result data elements, and thus each pair of corresponding source data elements may have a corresponding result data element. Typically, the result data elements for such an instruction are in the same order as the source data elements, and they often have the same size as the source data elements.
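The vertical operation described above may be sketched in software as follows. This is an illustrative model only: the function name `packed_add` is hypothetical, each operand is represented as a list of data elements rather than as register bits, and wraparound (non-saturating) arithmetic on overflow is an assumption, since other overflow behaviors are possible.

```python
def packed_add(src1, src2, elem_bits):
    """Add corresponding data elements of two source operands (vertical add).

    Assumption: results wrap around modulo 2**elem_bits on overflow.
    """
    if len(src1) != len(src2):
        raise ValueError("source operands must have the same number of elements")
    mask = (1 << elem_bits) - 1
    # Each pair of corresponding elements is processed independently of the
    # others; hardware may do this in parallel, the loop here is sequential.
    return [(a + b) & mask for a, b in zip(src1, src2)]


# Four 32-bit data elements per operand, as in a 128-bit packed operand.
result = packed_add([1, 2, 3, 0xFFFFFFFF], [10, 20, 30, 1], 32)
```

Each result data element in this sketch occupies the same position as the pair of source data elements that produced it, and a carry out of one element does not propagate into its neighbor.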
In addition to this exemplary type of packed data instruction, there are a variety of other types of packed data instructions. For example, some have only one source packed data operand. A packed data shift instruction, for instance, may independently shift each data element of a single source packed data to produce a result packed data. Other packed data instructions may operate on more than two source packed data operands. Moreover, other packed data instructions may operate in a horizontal fashion on data elements within the same packed data operand, instead of in a vertical fashion on corresponding data elements between two source packed data operands. Still other packed data instructions may generate a result packed data operand of a different size, having different sized data elements, and/or having a different data element order.
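Two of these other types may be sketched in the same illustrative manner. The function names `packed_shift_left` and `horizontal_add_pairs` are hypothetical, operands are again modeled as lists of data elements, and the wraparound behavior and the pairing of adjacent elements are assumptions chosen for the example rather than properties of any particular instruction.

```python
def packed_shift_left(src, count, elem_bits):
    """Single-source operation: shift each data element left independently."""
    mask = (1 << elem_bits) - 1
    # Bits shifted out of an element are discarded, not carried into the next.
    return [(x << count) & mask for x in src]


def horizontal_add_pairs(src, elem_bits):
    """Horizontal operation: add adjacent data elements within one operand.

    Assumption: the result has half as many elements as the source, and sums
    wrap around modulo 2**elem_bits.
    """
    if len(src) % 2 != 0:
        raise ValueError("source operand must have an even number of elements")
    mask = (1 << elem_bits) - 1
    return [(src[i] + src[i + 1]) & mask for i in range(0, len(src), 2)]


shifted = packed_shift_left([1, 2, 3, 0x8000], 1, 16)
pair_sums = horizontal_add_pairs([1, 2, 3, 4], 16)
```

The shift sketch takes a single source operand and preserves the number and order of data elements, whereas the horizontal sketch combines data elements from within one operand and, under the stated assumption, produces a result with a different number of data elements.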