A (e.g., hardware) processor, or set of processors, executes instructions from an instruction set, e.g., the instruction set architecture (ISA). The instruction set is the part of the computer architecture related to programming, and generally includes the native data types, instructions, register architecture, addressing modes, memory architecture, and interrupt and exception handling.
Certain functions may include operations on vectors containing multiple fixed-sized data elements. Certain operations on a plurality of vectors may multiply each fixed-sized element from one vector with a corresponding fixed-sized element of another vector to produce a product for each pair of elements. As used herein, the term “corresponding” refers to vector elements that occupy a same relative position within their associated vectors. To generate precise products, the product of each such pair of corresponding fixed-sized vector elements is double-sized, requiring at least twice as many bits as the fixed size. The memory and register resources required to hold the double-sized products, especially when vectors are involved, can be costly.
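As an illustration only, the double-sized products described above may be sketched in scalar C as follows. The function name widening_multiply, the 16-bit element size, and the 32-bit product size are assumptions chosen for this example and are not taken from any particular instruction set.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiplies corresponding 16-bit elements of a and b.
 * Each product is stored in a 32-bit element (twice the input width), so the
 * product vector occupies twice the storage of either source vector. */
void widening_multiply(const int16_t *a, const int16_t *b,
                       int32_t *prod, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Widen both operands before multiplying so no product bits are lost. */
        prod[i] = (int32_t)a[i] * (int32_t)b[i];
    }
}
```

In this sketch, a product such as 32767 × 32767 does not fit in 16 bits, which is why the destination elements are twice as wide as the source elements.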
Complex multiplication requires multiplying the real and imaginary parts of the operands with each other. A common language representation of complex number vectors places the real part in the even elements of a vector and the imaginary part in the corresponding odd elements of the vector (e.g., the complex numbers a+ib and c+id are stored in a vector A as A[0]=a; A[1]=b; A[2]=c; A[3]=d). Due to this representation, and because each real element must be multiplied by both the corresponding real element and the corresponding imaginary element of the other operand (and vice versa), using regular multiplication instructions requires a combination of shuffle instructions and fused multiply-add instructions.
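As an illustration only, the element pairing involved may be sketched in scalar C as follows. The function name complex_multiply_interleaved, the use of float elements, and the use of two separate source vectors x and y are assumptions made for this example; here a+ib is read from x and c+id is read from y, each stored in the even/odd layout described above.

```c
#include <stddef.h>

/* Hypothetical helper: x, y, and out each hold n_complex complex numbers in
 * interleaved form, with the real part at even index 2k and the imaginary
 * part at odd index 2k+1. Computes out[k] = x[k] * y[k] for each k. */
void complex_multiply_interleaved(const float *x, const float *y,
                                  float *out, size_t n_complex)
{
    for (size_t k = 0; k < n_complex; k++) {
        float a = x[2 * k], b = x[2 * k + 1];  /* a + ib */
        float c = y[2 * k], d = y[2 * k + 1];  /* c + id */

        /* Real part: even*even minus odd*odd. */
        out[2 * k] = a * c - b * d;
        /* Imaginary part: even*odd plus odd*even (cross terms). */
        out[2 * k + 1] = a * d + b * c;
    }
}
```

The cross terms a·d and b·c pair an even element of one vector with an odd element of the other. An element-wise vector multiply pairs only elements in the same position, so in a vectorized form one operand generally has to be rearranged first, which is where the shuffle instructions come in.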