Field
Embodiments relate to processors. In particular, embodiments relate to processors to concatenate packed data operation masks responsive to packed data operation mask concatenation instructions.
Background Information
Many processors have Single Instruction, Multiple Data (SIMD) architectures, which generally help to significantly improve processing speed. In SIMD architectures, instead of a scalar instruction operating on only one data element or one pair of data elements, a packed data instruction (also referred to as a vector instruction or SIMD instruction) may operate on multiple data elements or multiple pairs of data elements simultaneously or in parallel. The processor may have parallel execution hardware that responds to the packed data instruction by performing the multiple operations simultaneously or in parallel.
In SIMD architectures, multiple data elements may be packed within one register or memory location as packed data or vector data. In packed data, the bits of the register or other storage location may be logically divided into a sequence of multiple fixed-sized data elements, typically all of the same size. For example, a 256-bit wide register may have four 64-bit wide packed data elements, eight 32-bit wide packed data elements, sixteen 16-bit wide packed data elements, or thirty-two 8-bit wide packed data elements. Each of the packed data elements may represent a separate individual piece of data (e.g., a color of a pixel, etc.) that is stored alongside the other data elements but may be operated upon separately or independently of them.
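The logical division of a 256-bit register into fixed-sized data elements may be illustrated with a minimal software sketch. The function names, the use of a Python integer to stand in for the register, and the convention that the first element occupies the lowest-order bits are assumptions made for illustration only and do not describe any particular processor.

```python
REGISTER_BITS = 256  # register width taken from the example above


def unpack_elements(register_value, element_bits):
    """Split a register value into fixed-sized elements.

    Assumption: element 0 occupies the lowest-order bits.
    """
    if REGISTER_BITS % element_bits != 0:
        raise ValueError("element width must divide the register width")
    mask = (1 << element_bits) - 1
    count = REGISTER_BITS // element_bits
    return [(register_value >> (i * element_bits)) & mask for i in range(count)]


def pack_elements(elements, element_bits):
    """Pack a list of elements into one register value (inverse of unpack)."""
    if len(elements) * element_bits != REGISTER_BITS:
        raise ValueError("elements must exactly fill the register")
    mask = (1 << element_bits) - 1
    value = 0
    for i, element in enumerate(elements):
        value |= (element & mask) << (i * element_bits)
    return value
```

Under these assumptions, the same 256 bits may be viewed as four 64-bit elements, eight 32-bit elements, sixteen 16-bit elements, or thirty-two 8-bit elements simply by changing the element width passed to `unpack_elements`.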
Representatively, one type of packed data instruction, vector instruction, or SIMD instruction (e.g., a packed add instruction) may specify that a single packed data operation (e.g., addition) be performed on all corresponding pairs of data elements from two source packed data operands in a vertical fashion to generate a destination or result packed data. The source packed data operands may be of the same size, may contain data elements of the same width, and thus may each contain the same number of data elements. The source data elements in the same bit positions in the two source packed data operands may represent pairs of corresponding data elements. The packed data operation may be performed separately or independently on each of these pairs of corresponding source data elements to generate a matching number of result data elements, and thus each pair of corresponding source data elements may have a corresponding result data element. Typically, the result data elements for such an instruction are in the same order as the source data elements, and they often have the same size as the source data elements.
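The vertical packed operation described above may be sketched in software as follows. This is a simplified model, not a description of any actual instruction: the name `packed_add` is hypothetical, each operand is represented as a list of unsigned data elements, and wraparound on overflow is an assumption (a real instruction might instead saturate).

```python
def packed_add(src1, src2, element_bits):
    """Add corresponding elements of two source operands (vertical operation).

    Assumption: unsigned elements that wrap around on overflow.
    """
    if len(src1) != len(src2):
        raise ValueError("source operands must have the same number of elements")
    mask = (1 << element_bits) - 1
    # Each pair of corresponding elements yields one result element,
    # in the same position and with the same width as its sources.
    return [(a + b) & mask for a, b in zip(src1, src2)]
```

For example, under these assumptions `packed_add([1, 2, 3, 4], [10, 20, 30, 40], 32)` yields `[11, 22, 33, 44]`, with each result element depending only on the pair of source elements in the same position.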
In addition to this exemplary type of packed data instruction, there are a variety of other types of packed data instructions. For example, there are those that have only one, or more than two, source packed data operands, those that operate in a horizontal fashion instead of a vertical fashion, those that generate a result packed data operand of a different size, those that have different sized data elements, and/or those that have a different data element order.
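Two of these other types may be contrasted with the vertical case by way of a short sketch. Both functions are hypothetical illustrations under simplifying assumptions (unsigned elements held in lists, wraparound on overflow for the horizontal add) and do not correspond to any specific instruction.

```python
def horizontal_pairwise_add(src, element_bits):
    """Add adjacent elements within a single source operand (horizontal operation).

    Assumption: unsigned elements that wrap around on overflow.
    The result has half as many elements as the source.
    """
    if len(src) % 2 != 0:
        raise ValueError("source must contain an even number of elements")
    mask = (1 << element_bits) - 1
    return [(src[i] + src[i + 1]) & mask for i in range(0, len(src), 2)]


def widening_multiply(src1, src2):
    """Multiply corresponding unsigned elements, keeping the full product.

    Each result element needs twice the width of a source element, so the
    result operand is a different size than either source operand.
    """
    if len(src1) != len(src2):
        raise ValueError("source operands must have the same number of elements")
    return [a * b for a, b in zip(src1, src2)]
```

In this sketch, `horizontal_pairwise_add([1, 2, 3, 4], 32)` yields `[3, 7]`, combining elements within one operand rather than across two, while `widening_multiply([255, 16], [255, 16])` yields `[65025, 256]`, whose elements no longer fit in the 8-bit width of the sources.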