1. Field of the Invention
This invention relates to improving computer processing speed by enhancing Superword Level Parallelism.
2. Description of the Related Art
It is well known that computer processing speed has increased through the use of parallel processing. One form of parallel processing relies on a Single Instruction Multiple Data (SIMD) architecture. A SIMD architecture processes multiple data elements packed into a vector register with a single instruction. Examples include SSE for Pentium, VMX for PPC970, CELL, and Dual FPU for BlueGene/L. The type of parallelism exploited by a SIMD architecture is referred to as SIMD parallelism. The process of automatically generating SIMD operations from sequential computation is referred to as extracting SIMD parallelism.
One approach to extracting SIMD parallelism from input code is the Superword Level Parallelism (SLP) approach. The SLP approach packs multiple isomorphic statements that operate on data located in adjacent memory into one or more SIMD operations. The drawback to SLP is that it relies heavily on identifying isomorphic computation. Two statements are “isomorphic” with respect to each other if each statement performs the same set of operations in the same order as the other statement and the corresponding memory operations access adjacent memory locations. Table 1 presents an example of four statements (in C-like syntax) that are isomorphic in relation to each other.
TABLE 1
Statements With Isomorphic Relationship
a[4i + 0] = b[4i + 0] + c[4i - 1]
a[4i + 1] = b[4i + 1] + c[4i + 0]
a[4i + 2] = b[4i + 2] + c[4i + 1]
a[4i + 3] = b[4i + 3] + c[4i + 2]
The statements in Table 1 are isomorphic in relation to each other because each statement performs two load operations, one addition operation, and one store operation in the same order. Furthermore, the corresponding memory operations in these statements (or in any statements with an isomorphic relation) must access memory locations that are either adjacent or identical. For example, the memory access of a[4i+0] is adjacent to the memory access of a[4i+1]. Likewise, a[4i+1] is adjacent to a[4i+2]. Similarly, the corresponding memory accesses of “b” and “c” are adjacent.
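The packing of the Table 1 statements may be illustrated by the following minimal sketch, which is offered only as an example and not as the method of any particular compiler. The type vec4 and the helpers vload, vstore, and vadd are hypothetical stand-ins for a four-element vector register and its load, store, and add operations. The pointer c is assumed to remain valid at index 4i - 1.

```c
enum { LANES = 4 };  /* assumed superword width: four elements */

/* Hypothetical stand-in for a vector register holding LANES floats. */
typedef struct { float lane[LANES]; } vec4;

/* Load LANES adjacent elements starting at p. */
static vec4 vload(const float *p)
{
    vec4 v;
    for (int k = 0; k < LANES; k++) v.lane[k] = p[k];
    return v;
}

/* Store LANES adjacent elements starting at p. */
static void vstore(float *p, vec4 v)
{
    for (int k = 0; k < LANES; k++) p[k] = v.lane[k];
}

/* Apply the same operation (addition) to every lane. */
static vec4 vadd(vec4 x, vec4 y)
{
    vec4 r;
    for (int k = 0; k < LANES; k++) r.lane[k] = x.lane[k] + y.lane[k];
    return r;
}

/* Scalar form: the four isomorphic statements of Table 1. */
void add_scalar(float *a, const float *b, const float *c, int i)
{
    a[4*i + 0] = b[4*i + 0] + c[4*i - 1];
    a[4*i + 1] = b[4*i + 1] + c[4*i + 0];
    a[4*i + 2] = b[4*i + 2] + c[4*i + 1];
    a[4*i + 3] = b[4*i + 3] + c[4*i + 2];
}

/* Packed form: two vector loads, one vector add, one vector store. */
void add_packed(float *a, const float *b, const float *c, int i)
{
    vec4 vb = vload(&b[4*i]);      /* b[4i+0] .. b[4i+3] */
    vec4 vc = vload(&c[4*i - 1]);  /* c[4i-1] .. c[4i+2] */
    vstore(&a[4*i], vadd(vb, vc)); /* a[4i+0] .. a[4i+3] */
}
```

Because every statement applies the same operations in the same order to adjacent locations, each scalar operation maps onto one lane of a single vector operation, and both forms produce the same results.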
Extracting SIMD parallelism using SLP requires that the relationship between statements meet the isomorphic definition. Opportunities may exist for extracting SIMD parallelism from statements that do not meet the isomorphic definition. For example, computation on the real and imaginary parts of complex numbers often does not satisfy the isomorphic definition.
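As an illustration only, the following sketch shows one common complex-number computation, a complex multiplication, whose two statements are not isomorphic. The function name cmul and the interleaved storage layout (real part at index 2k, imaginary part at index 2k+1) are assumptions chosen for this example.

```c
/* Multiply the k-th complex numbers of x and y, storing the result in out.
 * Assumed layout: element 2k is the real part, element 2k+1 the imaginary part. */
void cmul(float *out, const float *x, const float *y, int k)
{
    /* Real part: the two products are combined with a SUBTRACT. */
    out[2*k + 0] = x[2*k + 0] * y[2*k + 0] - x[2*k + 1] * y[2*k + 1];

    /* Imaginary part: the two products are combined with an ADD. */
    out[2*k + 1] = x[2*k + 0] * y[2*k + 1] + x[2*k + 1] * y[2*k + 0];
}
```

The two statements store to adjacent locations and have the same shape, but they differ in one operation (subtract versus add) and their loads are not all pairwise adjacent, so they fall outside the isomorphic definition.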
Today's SIMD architectures are introducing Multiple Instruction Multiple Data (MIMD) instructions that may perform different computations on different elements of a vector. For instance, the ADDSUBPS instruction in SSE3 (Streaming SIMD Extensions by Intel) performs an add operation on odd elements of input vectors and a subtract operation on even elements of input vectors. As this trend continues, there is an increased need to extract non-isomorphic SIMD parallelism.
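The lane pattern of ADDSUBPS may be modeled in portable C as sketched below. This is a scalar emulation for illustration, not the instruction itself. The names addsub4 and VLEN are hypothetical. On hardware supporting SSE3, the instruction is exposed through the _mm_addsub_ps intrinsic declared in pmmintrin.h.

```c
enum { VLEN = 4 };  /* four single-precision lanes, as in a 128-bit SSE register */

/* Scalar model of the ADDSUBPS lane pattern:
 * even-numbered lanes (0, 2) subtract, odd-numbered lanes (1, 3) add. */
void addsub4(float dst[VLEN], const float x[VLEN], const float y[VLEN])
{
    for (int k = 0; k < VLEN; k++) {
        dst[k] = (k % 2 == 0) ? x[k] - y[k]   /* even lane: subtract */
                              : x[k] + y[k];  /* odd lane: add */
    }
}
```

A single such instruction therefore applies two different operations across the lanes of one vector, which is the kind of computation the isomorphic definition excludes.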
What is needed is a way to extract SIMD parallelism from computations that meet some, but not all, of the isomorphic criteria.