Vector processors generally increase program execution speed by providing a vector processing unit, which includes a number of scalar units/processors, to process multiple data elements or data arrays in parallel. The number of scalar units/processors available is often referred to as the vector length. Instructions executed by a vector processing unit are vector instructions, which may specify both an operation and the arrays of data on which to operate in parallel. Each scalar unit/processor executes the operation on corresponding elements of the data arrays. Vectorizing compilers typically convert code from a natural form, for example a form convenient for human programmers to read and write, into a form suitable for execution by the vector processor. They typically identify independent instructions that perform the same operation, rearrange the corresponding data operands into data arrays, and convert the instructions into a corresponding vector instruction. This process is called vectorization.
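By way of illustration only, the following sketch models in software the relationship between a natural scalar loop and its vectorized form. It is not the output of any particular compiler. The names `VECTOR_LENGTH`, `scalar_add`, `vector_add_instruction`, and `vectorized_add` are hypothetical, and a vector length of four is assumed purely for the example.

```python
# Illustrative model only: real vector hardware executes the lanes in parallel.
VECTOR_LENGTH = 4  # assumed number of scalar units/processors (hypothetical value)


def scalar_add(a, b):
    """Natural form: one pair of elements is processed per loop iteration."""
    out = []
    for i in range(len(a)):
        out.append(a[i] + b[i])
    return out


def vector_add_instruction(lane_a, lane_b):
    """Models a single vector instruction.

    Every lane applies the same operation to its own pair of elements.
    No lane reads another lane's result, so the lanes are independent.
    """
    return [x + y for x, y in zip(lane_a, lane_b)]


def vectorized_add(a, b, vector_length=VECTOR_LENGTH):
    """Vectorized form: operands are grouped into arrays of up to
    vector_length elements, and one vector instruction handles each group.
    Assumes a and b have the same length."""
    out = []
    for start in range(0, len(a), vector_length):
        stop = start + vector_length  # slicing clips the final partial group
        out.extend(vector_add_instruction(a[start:stop], b[start:stop]))
    return out
```

In this model, six element pairs require six iterations of the scalar loop but only two vector instructions, the second of which operates on a partial group of two elements.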
These existing compilers, however, often fail to vectorize regions of code due to dependency problems. For example, two different instructions cannot be executed in parallel if the execution of the second instruction depends in some way on the result of the first instruction. When such dependency problems are detected, the compiler may designate larger regions of code as unsuitable for vectorization. As a result, the potential of vector processing may not be fully realized.
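By way of illustration only, the following sketch shows one common kind of dependency, in which each loop iteration reads a value written by the preceding iteration. The function names are hypothetical, and the lane model (all lanes load their operands before any lane stores its result) is a simplifying assumption, not a description of any particular vector processor.

```python
def running_sum_scalar(b):
    """Natural form: iteration i reads a[i - 1], which iteration i - 1 wrote."""
    a = list(b)
    for i in range(1, len(a)):
        a[i] = a[i - 1] + b[i]
    return a


def running_sum_naive_lanes(b, vector_length=4):
    """Incorrect vectorization, shown only to expose the dependency.

    Within each group, every lane loads a[i - 1] before any lane stores
    a[i], so later lanes see stale values instead of the updated ones.
    """
    a = list(b)
    for start in range(1, len(a), vector_length):
        stop = min(start + vector_length, len(a))
        # Assumed lane model: all loads happen first ...
        loaded_prev = [a[i - 1] for i in range(start, stop)]
        # ... then all stores.
        for lane, i in enumerate(range(start, stop)):
            a[i] = loaded_prev[lane] + b[i]
    return a
```

For the input `[1, 2, 3, 4, 5]`, the scalar loop yields `[1, 3, 6, 10, 15]`, whereas the naive lane model with a vector length of four yields `[1, 3, 5, 7, 9]`. The results differ because the second and subsequent lanes depend on values that the preceding lanes have not yet stored.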
Although the following Detailed Description will proceed with reference being made to illustrative embodiments, many alternatives, modifications, and variations thereof will be apparent to those skilled in the art.