1. Field of the Invention
The present application relates generally to an improved data processing apparatus and method and more specifically to a mechanism for array reference safety analysis in the presence of loops with conditional control flow.
2. Background of the Invention
Multimedia extensions (MMEs) have become one of the most popular additions to general-purpose microprocessors. Existing multimedia extensions can be characterized as Single Instruction Multiple Datapath (SIMD) units that support packed fixed-length vectors. The traditional programming model for multimedia extensions has been explicit vector programming using either (in-line) assembly or intrinsic functions embedded in a high-level programming language. Explicit vector programming is time-consuming and error-prone. A promising alternative is to exploit vectorization technology to automatically generate SIMD codes from programs written in standard high-level languages.
Although vectorization was studied extensively for traditional vector processors decades ago, vectorization for SIMD architectures has raised new issues due to several fundamental differences between the two architectures. To distinguish between the two types of vectorization, the latter is referred to as SIMD vectorization, or SIMDization. One such fundamental difference comes from the memory unit. The memory unit of a typical SIMD processor bears more resemblance to that of a wide scalar processor than to that of a traditional vector processor. In the VMX instruction set found on certain PowerPC microprocessors (produced by International Business Machines Corporation of Armonk, N.Y.), for example, a load instruction loads 16 contiguous bytes from 16-byte aligned memory, ignoring the last 4 bits of the memory address in the instruction. The same applies to store instructions.
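The address truncation described above can be sketched in C. This is only an illustration of the arithmetic, not of any actual VMX intrinsic; the helper names effective_load_address and offset_in_block are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the address a 16-byte aligned vector load would
   actually use. The low 4 bits of the requested address are ignored. */
static uintptr_t effective_load_address(uintptr_t addr) {
    uintptr_t aligned = addr & ~(uintptr_t)0xF;
    assert((aligned & 0xF) == 0);   /* result is always 16-byte aligned */
    return aligned;
}

/* Hypothetical helper: byte offset of the requested address inside the
   16-byte block that the load returns. */
static unsigned offset_in_block(uintptr_t addr) {
    return (unsigned)(addr & 0xF);
}
```

Under this model, a request for address 0x1007 returns the block starting at 0x1000, and the requested byte sits at offset 7 within that block.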
There has been a recent spike of interest in compiler techniques to automatically extract SIMD parallelism from programs. This upsurge has been driven by the increasing prevalence of SIMD architectures in multimedia processors and high-performance computing. These processors have multiple function units, e.g., floating point units, fixed point units, integer units, etc., which can execute more than one instruction in the same machine cycle to enhance the uniprocessor performance. The function units in these processors are typically pipelined.
In performing compiler-based transformations of loops to extract SIMD parallelism, it is important to ensure array reference safety. That is, during compilation of source code for execution by a SIMD architecture, the compiler may perform various optimizations including determining portions of code that may be parallelized for execution by the SIMD architecture. This parallelization typically involves vectorizing, or SIMD vectorizing, or SIMDizing, the portion of code. One such optimization involves the conversion of branches in code to predicated operations in order to avoid the branch misprediction penalties encountered by pipelined function units. This optimization involves converting conditional branches in source code to predicated code with predicate operations using comparison instructions to set up Boolean predicates corresponding to the branch conditions. Thus, the predicates, which now guard the instructions, either allow each instruction to execute or nullify it according to the predicate's value, a process commonly referred to as “if-conversion.”
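As a rough illustration of the if-conversion just described, the following C sketch shows a loop with a conditional branch and an emulation of its predicated form. C has no predicated instructions, so the hypothetical function predicated_store stands in for a store that is nullified when its predicate is false; the loop and all names are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Source form: a conditional branch guards the store. */
static void clamp_with_branch(int *a, size_t n, int limit) {
    for (size_t i = 0; i < n; i++) {
        if (a[i] > limit) {
            a[i] = limit;
        }
    }
}

/* Emulation of a predicated store: it takes effect when p is 1 and is
   nullified when p is 0. */
static void predicated_store(int p, int *dst, int value) {
    assert(p == 0 || p == 1);
    if (p) {
        *dst = value;
    }
}

/* If-converted form: a comparison sets the Boolean predicate p, and the
   store is issued on every iteration under the guard of p. */
static void clamp_if_converted(int *a, size_t n, int limit) {
    for (size_t i = 0; i < n; i++) {
        int p = a[i] > limit;               /* predicate from the comparison */
        predicated_store(p, &a[i], limit);  /* guarded by p */
    }
}
```

Both functions leave the array in the same state; the second merely has no branch in the loop body at the level being modeled.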
In short, traditional if-conversion produces straight-line code that executes instructions from two mutually exclusive execution paths while suppressing the instructions corresponding to the path that is not chosen. It is quite common for one of these mutually exclusive execution paths to generate a variety of undesirable erroneous execution effects and, in particular, illegal memory references, when this path does not correspond to the chosen path. Accordingly, “if-conversion” might result in erroneous executions if it were not for the nullification of non-selected predicated instructions, and in particular of memory reference instructions, in if-converted code.
Gschwind et al., “Synergistic Processing In Cell's Multicore Architecture”, IEEE Micro, March 2006 introduces the concept of data-parallel if-conversion which is being increasingly widely adopted for compilation for data-parallel SIMD architectures. Unlike traditional scalar if-conversion, data-parallel if-conversion typically targets code generation with data-parallel select as supported by many SIMD architectures, as described in co-pending and commonly assigned U.S. Patent Application Publication No. US20080034357A1, filed Aug. 4, 2006, entitled “Method and Apparatus for Generating Data Parallel Select Operations in a Pervasively Data Parallel System” to Michael K. Gschwind, because data-parallel SIMD architectures typically do not offer predicated execution.
Thus, traditional if-conversion guards each instruction with a predicate indicating the execution or non-execution of each instruction corresponding to one or another of mutually exclusive paths. The data-parallel if-conversion with data-parallel select described in Gschwind et al. executes instructions from both paths without a predicate and uses data-parallel select instructions to select a result corresponding to an unconditionally executed path in the compiled code exactly when it corresponds to a taken path in the original source code. Thus, while data-parallel select can be used to implement result selection based on taken-path information, data-parallel if-conversion with data-parallel select is not adapted to nullify instructions. This is because a vector instruction may have one part of its result vector selected when another part of its result vector is not selected, making traditional instruction predication impractical.
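The contrast with data-parallel select can be sketched with an emulated 4-lane vector in plain C. This is a minimal model under stated assumptions, not the instruction sequence of any particular SIMD architecture: the type vec4 and the functions cmpgt_zero, sel4, and both_paths_then_select are hypothetical, and the loop body r = (x > 0) ? 2*x : -x is chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES = 4 };
typedef struct { int32_t lane[LANES]; } vec4;   /* emulated vector register */

/* Per-lane compare: all-ones mask where x > 0, all-zeros elsewhere. */
static vec4 cmpgt_zero(vec4 x) {
    vec4 m;
    for (int i = 0; i < LANES; i++) {
        m.lane[i] = x.lane[i] > 0 ? -1 : 0;
    }
    return m;
}

/* Per-lane select: take if_true where the mask is set, else if_false. */
static vec4 sel4(vec4 if_false, vec4 if_true, vec4 mask) {
    vec4 r;
    for (int i = 0; i < LANES; i++) {
        uint32_t m = (uint32_t)mask.lane[i];
        assert(m == 0u || m == UINT32_MAX);
        r.lane[i] = (int32_t)(((uint32_t)if_false.lane[i] & ~m) |
                              ((uint32_t)if_true.lane[i] & m));
    }
    return r;
}

/* Both paths are executed unconditionally for every lane; the select
   then keeps, per lane, the result of the path the source would take. */
static vec4 both_paths_then_select(vec4 x) {
    vec4 then_path, else_path;
    for (int i = 0; i < LANES; i++) {
        then_path.lane[i] = 2 * x.lane[i];
        else_path.lane[i] = -x.lane[i];
    }
    return sel4(else_path, then_path, cmpgt_zero(x));
}
```

Note that a single vector result can mix lanes from both paths, which is why no one predicate could guard the whole vector instruction, and that nothing in this form suppresses the work done on the non-selected path.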
However, without nullification of non-selected instructions, the execution sequences corresponding to non-selected mutually exclusive paths are executed unconditionally. When such a path does not correspond to the chosen path, it may generate a variety of undesirable erroneous execution effects, in particular illegal memory references, and the errors associated therewith are not nullified.
Specifically, without further safety checks, a compiler may erroneously transform loops in non-SIMD code into a SIMD representation that unintentionally causes array references to exceed the bounds of the arrays, e.g., to reference an element beyond the upper bound of an array, resulting in unsafe memory accesses. Such unsafe memory accesses may cause a program to operate improperly and, in serious situations, may result in memory corruption or even abnormal premature program termination (e.g., by attempting to access protected memory).
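A small C sketch may make the hazard concrete. In the hypothetical loop below, the read of b[i] is safe in the source only because a condition guards it; a select-based form would load b[i] on every iteration. The function names and the loop itself are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Source form: b[i] is read only on the path where i < b_len, so the
   loop is safe even when n is larger than b_len. */
static void copy_guarded(int *a, size_t n, const int *b, size_t b_len) {
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        if (i < b_len) {
            a[i] = b[i];
        } else {
            a[i] = 0;
        }
    }
}

/* Number of reads of b that would fall outside the array if b[i] were
   loaded on every iteration, as a select-based form of the loop does. */
static size_t out_of_bounds_reads_if_unconditional(size_t n, size_t b_len) {
    return n > b_len ? n - b_len : 0;
}
```

With n equal to 8 and b_len equal to 4, the guarded loop never touches memory past b[3], whereas an unconditional load of b[i] would make four references beyond the end of b.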
To avoid such undesirable program errors due to overly aggressive optimization, compilers may use array reference safety information to determine, based on the array references found in the code sequences to be optimized, whether it is safe to apply data-parallel if-conversion and generate data-parallel select sequences. However, array safety information is often not available to a compiler, e.g., when arrays are dynamically created or defined in another module not available to the compiler for analysis. To ensure correct execution, in a loop nest with conditionals, when array reference safety cannot be established with current analysis techniques, the compiler limits the vectorizing, i.e., SIMD vectorizing, of loops where the array definitions are not known at compile time.
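The conservative decision described above might be sketched as follows. This is a simplified model and not the analysis of any actual compiler: the array_info record, the codegen_choice values, and the choose_codegen function are all hypothetical, and the sketch assumes the compiler has already computed the largest index that the unconditional code would touch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of what the compiler knows about one array. */
typedef struct {
    int bounds_known;   /* 0 if, e.g., dynamically created or defined in
                           another module unavailable for analysis */
    size_t length;      /* element count, meaningful only if bounds_known */
} array_info;

typedef enum { KEEP_SCALAR, USE_SELECT } codegen_choice;

/* Choose the select-based SIMD form only when every index the
   unconditional code would touch is provably inside the array. */
static codegen_choice choose_codegen(const array_info *arr,
                                     size_t max_index_touched) {
    assert(arr != NULL);
    if (!arr->bounds_known) {
        return KEEP_SCALAR;   /* safety cannot be established */
    }
    return max_index_touched < arr->length ? USE_SELECT : KEEP_SCALAR;
}
```

In this model, an array of unknown extent always forces the scalar form, which corresponds to the limitation on vectorizing noted above.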