In recent years, highly specialized embedded microprocessors, e.g. Digital Signal Processors (DSPs), have been required for real-time processing of digitized analog signals, e.g. for audio, video, graphics, and communication tasks.
A typical computing scenario involves executing the same or almost the same sequence of operations on different elements of a large data set, e.g. an array. In this situation, a traditional computing model, where a single instruction (such as load, store, or integer addition) operates on a single data element, is not very efficient.
Therefore, the Single Instruction Multiple Data (SIMD) architecture has been designed and developed, which improves the data processing performance of a program by performing the same type of computation on the different data elements of a vector in parallel. Most existing high-performance processors support the SIMD architecture. These processors include a plurality of function units, some of which are configured to process scalar data, while others are combined to process structured SIMD vector data. The SIMD architecture is generally used to process vector data for high-performance computing or multimedia data types, such as color information coded in a triple (r, g, b) format, or coordinate information coded in a quadruple (x, y, z, w) format, and so on.
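The contrast between the traditional model and the SIMD model can be illustrated with a minimal sketch in portable C. The type `vec4i` and the helpers `vec_load`, `vec_add` and `vec_store` are hypothetical stand-ins for a four-lane SIMD register and its instructions. They are not taken from any particular instruction set, and the four-lane width is an assumption made only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4 /* assumed vector width: four 32-bit elements */

/* Traditional model: each addition operates on a single data element. */
void add_scalar(const int *a, const int *b, int *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

/* Hypothetical stand-in for a SIMD register holding LANES elements. */
typedef struct { int lane[LANES]; } vec4i;

/* Stand-in for a vector load: fills all lanes from memory. */
static vec4i vec_load(const int *p) {
    vec4i v;
    for (int k = 0; k < LANES; k++) v.lane[k] = p[k];
    return v;
}

/* Stand-in for one SIMD add: the same operation on every lane. */
static vec4i vec_add(vec4i x, vec4i y) {
    vec4i r;
    for (int k = 0; k < LANES; k++) r.lane[k] = x.lane[k] + y.lane[k];
    return r;
}

/* Stand-in for a vector store: writes all lanes back to memory. */
static void vec_store(int *p, vec4i v) {
    for (int k = 0; k < LANES; k++) p[k] = v.lane[k];
}

/* SIMD model: one "instruction" per group of LANES elements. */
void add_simd_style(const int *a, const int *b, int *out, size_t n) {
    assert(n % LANES == 0); /* sketch only: no handling of a remainder */
    for (size_t i = 0; i < n; i += LANES) {
        vec_store(out + i, vec_add(vec_load(a + i), vec_load(b + i)));
    }
}
```

On real SIMD hardware each helper would correspond to a single instruction, so the second loop would issue roughly one quarter as many operations as the first for the same data.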
A detailed description of the SIMD architecture can be found in the following references 1-4, which are incorporated herein by reference:
1. “Auto-Vectorization of Interleaved Data for SIMD”, Dorit Nuzman, Ira Rosen and Ayal Zaks, PLDI'06, Jun. 10-16, 2006, Ottawa, Ontario, Canada, p. 132-142 (reference 1);
2. “Vectorization for SIMD Architectures with Alignment Constraints”, Alexandre E. Eichenberger, Peng Wu and Kevin O'Brien, PLDI'04, Jun. 9-11, 2004, Washington, D.C., USA, p. 82-93 (reference 2);
3. “Compilation techniques for multimedia processors”, A. Krall and S. Lelait, International Journal of Parallel Programming, 28(4): 347-361, 2000 (reference 3); and
4. “Code Optimization Techniques for Embedded Processors, Methods, Algorithms, and Tools”, R. Leupers, Kluwer Academic Publisher, Boston, 2000 (reference 4).
Although the SIMD architecture greatly improves data processing performance, it requires the memory address of a vector operand to be vector aligned; that is to say, it requires the vector pointer to be vector aligned. Therefore, in the process of compiling a source program, whether the pointer is vector aligned must be judged before the vector operand is loaded into a register. If the pointer is not vector aligned, the pointer must be aligned, and the register loaded with the operand must be turned (shifted) before it is operated on.
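The judgment and realignment described above can be sketched in portable C as follows. This is only an illustration under stated assumptions: a 16-byte vector width (`VEC_BYTES`), a hypothetical register type `vec16`, and a hypothetical `load_aligned` helper standing in for an aligned vector load instruction. The misaligned path loads the two aligned vectors that straddle the requested address and shifts the wanted bytes into place, which is one common realignment scheme and not necessarily the one a given compiler emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16 /* assumed vector width in bytes */

/* Hypothetical stand-in for a 16-byte vector register. */
typedef struct { uint8_t byte[VEC_BYTES]; } vec16;

/* The judgment: is the address a multiple of the vector size? */
int is_vector_aligned(const void *p) {
    return ((uintptr_t)p % VEC_BYTES) == 0;
}

/* Stand-in for an aligned vector load, valid only on aligned addresses. */
static vec16 load_aligned(const uint8_t *p) {
    vec16 v;
    assert(is_vector_aligned(p));
    memcpy(v.byte, p, VEC_BYTES);
    return v;
}

/*
 * Load VEC_BYTES bytes starting at p, whatever its alignment.
 * Sketch only: when p is misaligned, the caller must guarantee that the
 * two aligned vectors surrounding p lie inside a valid buffer.
 */
vec16 load_vector(const uint8_t *p) {
    if (is_vector_aligned(p)) {
        return load_aligned(p); /* fast path: no extra work needed */
    }

    /* Align the pointer by rounding it down to a vector boundary. */
    size_t offset = (size_t)((uintptr_t)p % VEC_BYTES);
    const uint8_t *base = p - offset;

    /* Load the two aligned vectors that straddle the requested data. */
    vec16 lo = load_aligned(base);
    vec16 hi = load_aligned(base + VEC_BYTES);

    /* Shift: take the tail of lo followed by the head of hi. */
    vec16 out;
    for (size_t k = 0; k < VEC_BYTES; k++) {
        size_t src = k + offset;
        out.byte[k] = (src < VEC_BYTES) ? lo.byte[src]
                                        : hi.byte[src - VEC_BYTES];
    }
    return out;
}
```

When the pointer is in fact always aligned, the alignment test and the entire second half of `load_vector` are never needed, yet they must still be emitted if alignment cannot be established at compile time.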
In practice, most vector pointers of SIMD instructions are vector aligned, so the above-mentioned judgment is usually unnecessary; for a compiler, however, it is very difficult to determine whether a vector pointer is vector aligned. Consequently, even for a vector whose pointer is already vector aligned, a large amount of cumbersome and useless code is generated in the compiling process, which increases the complexity of the code and decreases the data processing performance of the program.