Early computers generally used a scalar processor with a single logic unit that executed one instruction on one operand pair at a time. Computer programs and programming languages were accordingly designed to execute sequentially. Modern computers may include a vector processor that applies a single instruction to multiple operands at a time. A vector processor is a central processing unit whose instruction set includes instructions that operate on one-dimensional arrays of data called vectors. Vector processors can greatly improve performance on certain workloads, such as numerical simulation and similar tasks.
Some optimizing compilers feature automatic vectorization, in which particular parts of a sequential program are transformed into equivalent parallel code that uses a vector processor. Automatic vectorization is a special case of automatic parallelization: a computer program is converted from a scalar implementation, which processes a single pair of operands at a time, to a vector implementation, which performs one operation on multiple pairs of operands at once.
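The scalar-to-vector conversion described above can be sketched in C. This is an illustrative model under stated assumptions, not the output of any particular compiler: the function names add_scalar and add_vector_model and the vector width LANES (four elements) are hypothetical. The inner loop over lanes stands in for what would be a single vector instruction on real hardware, and the trailing loop handles elements left over when the array length is not a multiple of the vector width.

```c
#include <stddef.h>

/* Hypothetical vector width: elements handled by one vector instruction. */
#define LANES 4

/* Scalar form: one addition on one pair of operands per iteration. */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

/* Hand-written model of the vector form a compiler might derive
 * from add_scalar. It computes the same results in the same order
 * per element, only grouped into blocks of LANES. */
void add_vector_model(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

    /* Main loop: each outer iteration models one vector add
     * applied to LANES operand pairs at once. */
    for (; i + LANES <= n; i += LANES) {
        for (size_t lane = 0; lane < LANES; ++lane) {
            out[i + lane] = a[i + lane] + b[i + lane];
        }
    }

    /* Remainder loop: scalar cleanup for the last n % LANES elements. */
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Both functions produce identical output for any input length. With optimization enabled, compilers such as GCC and Clang can often perform this kind of transformation on a loop like the one in add_scalar without the programmer writing the second form by hand.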
Vectorization can yield performance gains without programmer intervention, especially on large data sets. It can also sometimes lead to slower execution because of pipeline synchronization, data movement timing, and other issues.