1. Field of the Invention
The present invention relates generally to an improved data processing system and in particular, to a method and apparatus for generating code. Still more particularly, the present invention relates to a computer implemented method, apparatus, and computer usable program code for automatic code generation for complex arithmetic reduction for architectures lacking cross data-path support.
2. Description of the Related Art
Data processing systems are increasingly used for complex mathematical operations. Complex mathematical operations involve the use of imaginary numbers. In mathematics, an imaginary number is a complex number whose square is a negative real number. Any complex number can be written as “a+bj”, where “a” and “b” are real numbers and “j”, or alternatively “i”, is the imaginary unit, which is equal to the square root of −1. The number “a” is the real part of the complex number, and “b” is the imaginary part.
Imaginary numbers can be used in a variety of concrete or real-world applications in the field of science and other related technical areas, such as signal processing, dynamics, applied mathematics, control theory, electromagnetism, quantum mechanics, and cartography. For example, electrical engineers can express electrical voltage values and alternating current values using imaginary or complex numbers, which are referred to as phasors. Although phasors are values expressed in imaginary numbers, phasors represent real voltages that can cause damage to both people and equipment, even if their values contain no “real part”.
In computing, single instruction multiple data (SIMD) is a technique employed to achieve data level parallelism, as in a vector or array processor. First popularized in large-scale supercomputers, smaller-scale single instruction multiple data operations have now become widespread in personal computer hardware. Today, the term is associated almost entirely with these smaller units.
For machines with single instruction multiple data units without cross data-path support, such as VMX or synergistic processing units (SPU), the data involved in a complex multiply or complex divide must be reorganized, or simdized, into multiple operations that together achieve the purpose of the original operation. Reorganizing the data is expensive in terms of operations, processing power, memory, and time.
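The need for reorganization can be seen in the scalar arithmetic itself. The following is a minimal sketch in C, not code taken from the present disclosure; the type name cplx and the function names cplx_mul and cplx_div are hypothetical and are introduced only for illustration. Each part of the result combines the real part of one operand with the imaginary part of the other, so values that sit in different positions in memory must be brought together.

```c
#include <assert.h>

/* Hypothetical type for illustration: one complex number a + bj. */
typedef struct { float re; float im; } cplx;

/* (a + bj)(c + dj) = (ac - bd) + (ad + bc)j */
static cplx cplx_mul(cplx x, cplx y) {
    cplx r;
    r.re = x.re * y.re - x.im * y.im;  /* uses both parts of x and of y */
    r.im = x.re * y.im + x.im * y.re;  /* cross terms: real times imaginary */
    return r;
}

/* (a + bj)/(c + dj) = ((ac + bd) + (bc - ad)j) / (c*c + d*d) */
static cplx cplx_div(cplx x, cplx y) {
    float den = y.re * y.re + y.im * y.im;
    cplx r;
    assert(den != 0.0f);               /* the divisor must be nonzero */
    r.re = (x.re * y.re + x.im * y.im) / den;
    r.im = (x.im * y.re - x.re * y.im) / den;
    return r;
}
```

In this sketch, every output element depends on all four input elements, which is the dependence that a unit without cross data-path support cannot satisfy without first moving data between positions.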
VMX is a floating point and integer single instruction multiple data instruction set architecture (ISA) extension to the Power Architecture. Synergistic processing units are part of processor architectures, such as the Cell Broadband Engine™. Cell Broadband Engine, Cell B.E., and Cell are trademarks of the Sony Corporation and/or the Sony Computer Entertainment, Inc., in the United States, other countries, or both and are used under license therefrom.
Both the VMX and the synergistic processing unit have 16 byte wide single instruction multiple data units that are capable of processing 16 chars, 8 shorts, 4 single precision floating point values, or 4 integers per single instruction multiple data instruction. The synergistic processing unit is also capable of processing 2 double precision floating point values per single instruction multiple data instruction.
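The element counts above follow from dividing the 16 byte register width by the size of each element. A minimal sketch in C is given below; the names SIMD_BYTES and lanes_per_register are hypothetical, and the sketch assumes the common case of a 1 byte char, 2 byte short, 4 byte int, 4 byte float, and 8 byte double.

```c
#include <assert.h>
#include <stddef.h>

/* Register width described in the text: 16 bytes. */
#define SIMD_BYTES 16u

/* Number of elements of the given size that fit in one 16 byte register. */
static size_t lanes_per_register(size_t elem_size) {
    /* The element size must divide the register width evenly. */
    assert(elem_size > 0 && SIMD_BYTES % elem_size == 0);
    return SIMD_BYTES / elem_size;
}
```

For example, lanes_per_register(sizeof(float)) evaluates to 4 on a platform with 4 byte floats, which matches the 4 single precision values per instruction stated above.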
As a result, simdizing operations such as complex multiply for single instruction multiple data units without cross data-path support typically does not yield a performance improvement over scalar code on most machines. This is the case because a complex multiply on typical single instruction multiple data hardware requires so many data shuffles to align the data that the cost of performing the reorganization often exceeds the benefits of simdization.
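The shuffle overhead can be illustrated with a small software model. The following C sketch is an assumption made for illustration only: the type v4 and the functions v4_mul, v4_add, v4_shuffle, and cmul_interleaved are hypothetical, they are not VMX or synergistic processing unit intrinsics, and the sequence shown is one straightforward simdization rather than the method of the present invention. Arithmetic is restricted to matching lanes, so a shuffle is the only way to move a value to a different lane, and a counter records how many shuffles are needed.

```c
#include <assert.h>

/* Hypothetical 4-lane register model; not a real VMX or SPU type. */
typedef struct { float lane[4]; } v4;

static int shuffle_count = 0;  /* counts data reorganization steps */

/* Lane-wise ops: lane i of the result depends only on lane i of the inputs. */
static v4 v4_mul(v4 a, v4 b) {
    v4 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] * b.lane[i];
    return r;
}

static v4 v4_add(v4 a, v4 b) {
    v4 r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Shuffle: result lane i takes input lane idx[i]; the only cross-lane move. */
static v4 v4_shuffle(v4 a, int i0, int i1, int i2, int i3) {
    const int idx[4] = { i0, i1, i2, i3 };
    v4 r;
    for (int i = 0; i < 4; i++) {
        assert(idx[i] >= 0 && idx[i] < 4);
        r.lane[i] = a.lane[idx[i]];
    }
    shuffle_count++;
    return r;
}

/* Multiply two pairs of complex numbers stored interleaved as re,im,re,im. */
static v4 cmul_interleaved(v4 x, v4 y) {
    const v4 sign = { { -1.0f, 1.0f, -1.0f, 1.0f } };
    v4 x_re = v4_shuffle(x, 0, 0, 2, 2);       /* a0 a0 a1 a1 */
    v4 x_im = v4_shuffle(x, 1, 1, 3, 3);       /* b0 b0 b1 b1 */
    v4 y_sw = v4_shuffle(y, 1, 0, 3, 2);       /* d0 c0 d1 c1 */
    v4 t1 = v4_mul(x_re, y);                   /* ac  ad  per pair */
    v4 t2 = v4_mul(v4_mul(x_im, y_sw), sign);  /* -bd +bc per pair */
    return v4_add(t1, t2);                     /* ac-bd, ad+bc */
}
```

In this model, two complex products cost three shuffles in addition to three lane-wise multiplies and one add, whereas the shuffles contribute nothing to the arithmetic itself. On real hardware the exact counts depend on the instruction set and data layout, but the sketch shows why the reorganization can outweigh the gain from processing several elements per instruction.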