1. Technical Field
This invention relates to merge operations using VMX instructions in a SIMD processor. More particularly, the invention pertains to improving the efficiency of merging two or more streams of data.
2. Description of the Prior Art
SIMD (Single Instruction, Multiple Data) is a technique employed to achieve data level parallelism in a computing environment. This technique is commonly applied to a vector or array processor in which a processor is able to run mathematical operations on multiple data elements simultaneously. In the past there were a number of dedicated processors for this sort of task, commonly referred to as digital signal processors (DSPs). The main difference between a SIMD design and a DSP is that DSPs were complete processors with their own instruction sets, whereas SIMD designs rely on the general-purpose portions of the central processing unit to handle the program details. The SIMD instructions handle the data manipulation only. In addition, DSPs tend to include instructions to handle specific types of data, such as sound or video, whereas SIMD systems are considerably more general purpose.
An application that may take advantage of SIMD is one where the same value is being added to or subtracted from a large number of data points, a common operation in many multimedia applications. With a SIMD processor, data is understood to be in blocks, and a number of values can be loaded all at once. In addition, a SIMD processor will have a single instruction that effectively manipulates all of the data points. Another advantage is that SIMD systems typically include only those instructions that can be applied to all of the data in one operation. In other words, if the SIMD system works by loading up eight data points at once, the mathematical operation being applied to the data will happen to all eight values at the same time. Although the same is true for any superscalar processor design, the level of parallelism in a SIMD system is typically much higher.
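The block-wise processing described above can be sketched in portable C. This is a minimal illustration only, not code for any particular SIMD instruction set: the function name add_constant and the block width LANES are hypothetical, and whether the inner loop is actually compiled to a single SIMD instruction depends on the compiler and the target processor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block width: eight values per step, as in the example above. */
enum { LANES = 8 };

/* Adds `value` to every element of `data`, one block of LANES at a time. */
void add_constant(int *data, size_t count, int value)
{
    assert(data != NULL || count == 0);
    size_t i = 0;

    /* Full blocks: the inner loop has a fixed trip count and applies the
       same operation to every element, so a vectorizing compiler may map
       it to a single SIMD add. */
    for (; i + LANES <= count; i += LANES) {
        for (size_t lane = 0; lane < LANES; lane++) {
            data[i + lane] += value;
        }
    }

    /* Remainder: fewer than LANES values are left, so they are handled
       one at a time. */
    for (; i < count; i++) {
        data[i] += value;
    }
}
```

The operation inside the block loop contains no data-dependent decision, which is what makes it suitable for SIMD execution.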
Merge operations are known in the art as operations which merge two or more sorted data streams into a single data stream. The following is sample code illustrating an algorithm of a basic merge operation that merges content of array A with content of array B to an output stream identified as array C:
while ((Apos < Acount) && (Bpos < Bcount)) {
    if (A[Apos] > B[Bpos]) {
        // A > B
        C[Cpos++] = B[Bpos++];
    } else {
        // A <= B
        C[Cpos++] = A[Apos++];
    }
}
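The loop above stops as soon as either input array is exhausted, so any elements remaining in the other array still have to be copied to the output. A self-contained C version might look as follows. The function name merge_sorted, its signature, and the requirement that the output array hold at least a_count + b_count elements are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Merges the sorted arrays a and b into c and returns the number of
   elements written. Assumes c has room for a_count + b_count elements. */
size_t merge_sorted(const int *a, size_t a_count,
                    const int *b, size_t b_count, int *c)
{
    assert(c != NULL);
    size_t a_pos = 0, b_pos = 0, c_pos = 0;

    /* Main loop: one data-dependent branch per output element. */
    while (a_pos < a_count && b_pos < b_count) {
        if (a[a_pos] > b[b_pos]) {
            c[c_pos++] = b[b_pos++];
        } else {
            c[c_pos++] = a[a_pos++];
        }
    }

    /* Copy whatever remains in the input that was not exhausted. */
    while (a_pos < a_count) {
        c[c_pos++] = a[a_pos++];
    }
    while (b_pos < b_count) {
        c[c_pos++] = b[b_pos++];
    }
    return c_pos;
}
```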
Among operations based on the merge operation employed for information retrieval are Merge AND and Merge OR. The Merge AND operation outputs data to an output stream only when the same values are included in both input streams. The following is sample code for a Merge AND operation performed without SIMD instructions:
while ((Apos < Acount) && (Bpos < Bcount)) {
    if (A[Apos] > B[Bpos]) {
        // A > B
        Bpos++;
    } else if (A[Apos] < B[Bpos]) {
        // A < B
        Apos++;
    } else {
        // A = B
        C[Cpos++] = A[Apos++];
        Bpos++;
    }
}
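For reference, the Merge AND loop can be wrapped in a complete C function. The name merge_and and its signature are hypothetical. The sketch assumes both inputs are sorted in ascending order and that the output array can hold at least as many elements as the shorter input.

```c
#include <assert.h>
#include <stddef.h>

/* Writes to c the values that occur in both sorted arrays a and b and
   returns the number of values written. */
size_t merge_and(const int *a, size_t a_count,
                 const int *b, size_t b_count, int *c)
{
    assert(c != NULL);
    size_t a_pos = 0, b_pos = 0, c_pos = 0;

    while (a_pos < a_count && b_pos < b_count) {
        if (a[a_pos] > b[b_pos]) {
            b_pos++;                 /* b is behind: skip its element */
        } else if (a[a_pos] < b[b_pos]) {
            a_pos++;                 /* a is behind: skip its element */
        } else {
            c[c_pos++] = a[a_pos++]; /* equal: output once, advance both */
            b_pos++;
        }
    }
    return c_pos;
}
```

No tail handling is needed here: once one input is exhausted, no further common values can exist.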
The Merge OR operation outputs the unique data values from both input streams to an output stream. Duplicated data are omitted. The following is sample code for a Merge OR operation performed without SIMD instructions:
while ((Apos < Acount) && (Bpos < Bcount)) {
    if (A[Apos] > B[Bpos]) {
        // A > B
        C[Cpos++] = B[Bpos++];
    } else if (A[Apos] < B[Bpos]) {
        // A < B
        C[Cpos++] = A[Apos++];
    } else {
        // A = B
        C[Cpos++] = A[Apos++];
        Bpos++;
    }
}
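As with the basic merge, the Merge OR loop ends when either input is exhausted, so the remaining elements of the other input must still be copied. A complete C sketch follows. The name merge_or and its signature are hypothetical, and the sketch assumes that each input array is sorted in ascending order and contains no duplicates of its own, so that only values shared between the two inputs need to be suppressed.

```c
#include <assert.h>
#include <stddef.h>

/* Writes to c every value that occurs in sorted array a or sorted array b,
   emitting shared values only once. Returns the number of values written.
   Assumes c has room for a_count + b_count elements. */
size_t merge_or(const int *a, size_t a_count,
                const int *b, size_t b_count, int *c)
{
    assert(c != NULL);
    size_t a_pos = 0, b_pos = 0, c_pos = 0;

    while (a_pos < a_count && b_pos < b_count) {
        if (a[a_pos] > b[b_pos]) {
            c[c_pos++] = b[b_pos++];
        } else if (a[a_pos] < b[b_pos]) {
            c[c_pos++] = a[a_pos++];
        } else {
            c[c_pos++] = a[a_pos++]; /* equal: output once, advance both */
            b_pos++;
        }
    }

    /* Copy whatever remains in the input that was not exhausted. */
    while (a_pos < a_count) {
        c[c_pos++] = a[a_pos++];
    }
    while (b_pos < b_count) {
        c[c_pos++] = b[b_pos++];
    }
    return c_pos;
}
```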
As illustrated above, both the Merge AND operation and the Merge OR operation without SIMD instructions execute conditional branch instructions for each element processed. A conditional branch is a basic logical structure that resembles a fork in the road where there are at least two paths that may be selected, but only one is chosen. The following is an example of a conditional branch: if a certain condition exists, then the application will perform one action, whereas if the condition does not exist, the application will perform another action. The conditional branches of the prior art Merge AND and Merge OR operations are taken in an arbitrary order, with roughly a fifty percent probability for random input data.
It is difficult for branch prediction hardware to predict branches that are taken in such an arbitrary order. Therefore, there is a need for a solution that performs the Merge AND and/or Merge OR operations with fewer conditional branch instructions.
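To illustrate what removing data-dependent branches can mean in general, the following scalar C sketch rewrites the Merge AND loop so that the comparison results are used as arithmetic values instead of branch conditions. It is offered only as an illustration of the idea and is not the SIMD technique of the invention. The function name merge_and_branchless is hypothetical, and whether the compiled code is actually free of conditional branches inside the loop body depends on the compiler and the target processor.

```c
#include <assert.h>
#include <stddef.h>

/* Merge AND without if/else in the loop body. Each comparison yields 0 or 1
   in C, and that value is added to the corresponding position. Assumes c has
   room for at least as many elements as the shorter input. */
size_t merge_and_branchless(const int *a, size_t a_count,
                            const int *b, size_t b_count, int *c)
{
    assert(c != NULL);
    size_t a_pos = 0, b_pos = 0, c_pos = 0;

    /* The loop condition is still a branch, but it is taken on every
       iteration except the last and is therefore easy to predict. */
    while (a_pos < a_count && b_pos < b_count) {
        int x = a[a_pos];
        int y = b[b_pos];

        c[c_pos] = x;        /* speculative store, kept only if x == y */
        c_pos += (x == y);   /* keep the stored value on a match */
        a_pos += (x <= y);   /* advance a when it is behind or equal */
        b_pos += (x >= y);   /* advance b when it is behind or equal */
    }
    return c_pos;
}
```

The speculative store never writes past the stated capacity, because the number of matches found so far is always smaller than the number of elements remaining to be examined in either input while the loop is running.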