The invention relates generally to applications that rely heavily on vector processing, and relates more specifically to providing transparent auto-vectorization support for such applications.
Vectorization is the process of converting an operation on vectors from a scalar implementation, which processes one pair of operands (i.e., one element of each vector) at a time, to a vectorized implementation, in which a single instruction operates on a pair of vector operands (i.e., multiple elements of each vector) at once. For example, given vector A and vector B, each of which includes four elements, a single vector instruction can add the first element of vector A to the first element of vector B, the second element of vector A to the second element of vector B, the third element of vector A to the third element of vector B, and the fourth element of vector A to the fourth element of vector B. Automatic vectorization is the automatic transformation of a series of operations performed sequentially, one operation at a time (the scalar version), into operations performed in parallel, several at once (the vectorized version).
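The four-element addition described above can be sketched in C as follows. This is an illustrative sketch only, not part of the disclosed technique: the function names add4_scalar and add4_vector are hypothetical, and the vectorized path assumes an x86 target with SSE support (falling back to the scalar loop elsewhere).

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h> /* SSE intrinsics: _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#endif

/* Scalar version: one pair of elements is added per loop iteration.
 * (Hypothetical helper name, chosen for illustration.) */
void add4_scalar(const float *a, const float *b, float *out)
{
    for (size_t i = 0; i < 4; ++i) {
        out[i] = a[i] + b[i];
    }
}

/* Vectorized version: a single SIMD add processes all four element pairs.
 * Assumes SSE is available; otherwise falls back to the scalar loop. */
void add4_vector(const float *a, const float *b, float *out)
{
#if defined(__SSE__)
    __m128 va = _mm_loadu_ps(a);            /* load four floats from a */
    __m128 vb = _mm_loadu_ps(b);            /* load four floats from b */
    _mm_storeu_ps(out, _mm_add_ps(va, vb)); /* four additions in one instruction */
#else
    add4_scalar(a, b, out);
#endif
}
```

Both functions produce the same result; only the number of instructions needed to compute it differs.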
High-performance applications that rely on vector processing operations are present in many domains. According to conventional techniques, there are three common approaches to utilizing single instruction, multiple data (SIMD)-based vectorization in code development. The first approach is to manually vectorize the code. In other words, the first approach makes direct use of the vectorization instructions wherever possible in the code. This can be accomplished either by inlining assembly language instructions into the source code or by using intrinsic functions (“intrinsics” for short), which are supported by many compilers. While the use of intrinsics helps considerably, the recoding process can still be time-consuming and tedious. In particular, less experienced developers usually avoid this approach because it requires non-trivial code changes. The second approach employs specialized libraries that implement common domain-specific operations (e.g., the Fast Fourier Transform) required by data processing analytics. A problem with this approach is that the libraries must be maintained and, eventually, extended to incorporate new operations as needed. The third approach employs general-purpose auto-vectorizing compilers. This approach is the easiest for developers, but it is only as good as the compiler's ability to identify code sites amenable to auto-vectorization transformations. Thus, conventional approaches are limited in terms of providing auto-vectorization support.
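The limitation of the third approach can be illustrated with a short C sketch. The function names scale_add and prefix_sum are hypothetical and are not taken from any particular library. The sketch assumes a C99 compiler with optimization enabled (for example, -O3 in GCC or Clang); whether a given loop is actually vectorized depends on the compiler and the target.

```c
#include <stddef.h>

/* Typically amenable to auto-vectorization: each iteration is independent,
 * and the restrict qualifiers promise the compiler that the arrays do not
 * overlap. (Hypothetical function, for illustration only.) */
void scale_add(float *restrict out, const float *restrict a,
               const float *restrict b, float k, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = k * a[i] + b[i];
    }
}

/* Typically not vectorized as written: each iteration reads the value
 * produced by the previous one (a loop-carried dependence), so the
 * iterations cannot simply be executed several at once. */
void prefix_sum(float *x, size_t n)
{
    for (size_t i = 1; i < n; ++i) {
        x[i] += x[i - 1];
    }
}
```

Both loops are ordinary scalar code from the developer's point of view; the difference lies in whether the compiler can prove that executing several iterations at once preserves the result.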