1. Field of the Invention
The present invention relates generally to string comparisons, and in particular to using vector instructions to perform string comparisons.
2. Description of the Related Art
String comparison operations are frequently performed in a variety of computer applications. A string may be defined as a stream or array of characters stored in a contiguous sequence. The characters of the string are often represented using one or two bytes. In many software environments, a null terminating character may serve as a reserved character to signify the end of the string. Such a string may be referred to as a null-terminated string or “ASCIIZ” string. The null terminating character, sometimes represented as NUL or ‘\0’, is a control character with a value of zero. NUL is present in many character sets.
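By way of illustration only, the null-terminated layout described above may be sketched in C as follows. The function name asciiz_length is hypothetical and is not taken from any figure; the sketch assumes one-byte characters and simply counts the characters that precede the NUL.

```c
#include <stddef.h>

/* Hypothetical helper: returns the number of characters that precede
 * the terminating NUL of a null-terminated (ASCIIZ) string.
 * Assumes one-byte characters. */
size_t asciiz_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {   /* stop at the reserved end-of-string character */
        n++;
    }
    return n;
}
```

Under these assumptions, the string "abc" occupies four bytes of storage: the three characters followed by the NUL.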
Using single instruction, multiple data (SIMD) processing, string comparison operations have been implemented to operate on vectors containing a plurality of characters. SIMD processing is an approach wherein a single instruction operates on a packed vector containing a plurality of elements. A SIMD or vector instruction specifies an operation that will be repeated for an entire vector of independent data values, thereby essentially describing a large number of operations in a single instruction. Traditionally, a comparison of strings takes at least two operations: a first to compare the characters for equality and a second to check whether the strings contain the null terminating character. A prior art code example of a SIMD string comparison operation is shown in FIG. 1.
Code 100 begins by loading the first vector A[i] from a first string (instruction 110), loading the second vector B[i] from a second string (instruction 115), and then comparing vectors A and B (instruction 120). After the comparison, the next instruction (125) is to branch if any mismatches between corresponding elements of vectors A and B are found. Then, the next instruction (130) is to compare the elements of vector A to the null terminating character (NUL). If NUL is found in any of the elements of vector A, then the end of a string has been reached. Alternatively, the elements of vector B could be compared to NUL. After this comparison, the next instruction (135) will branch out of the loop if any of the elements of vector A is NUL. Next, the counters of the first and second strings may be incremented (instruction 140) in preparation for the next compare operation of the next section of the first and second strings. Then, the next instruction (145) will branch back to the top (105) of the loop.
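For illustration only, the control flow described above may be approximated in portable C as follows. This is a sketch under stated assumptions and is not the code of FIG. 1: the vector width VEC_WIDTH, the helpers first_mismatch and first_nul (which emulate vector compares with scalar loops), and the function vec_str_equal are all hypothetical names. The handling on the mismatch exit path, which treats the strings as equal when a NUL precedes the first mismatch, is likewise an assumption, since what follows the branch is not described above. Both buffers are assumed to be readable in whole VEC_WIDTH-byte chunks up to and including the chunk that holds the NUL.

```c
#include <stddef.h>

#define VEC_WIDTH 16  /* assumed vector width in bytes (hypothetical) */

/* Emulates a vector equality compare: index of the first element where
 * a and b differ, or VEC_WIDTH if all elements match. */
static size_t first_mismatch(const unsigned char *a, const unsigned char *b)
{
    for (size_t k = 0; k < VEC_WIDTH; k++)
        if (a[k] != b[k]) return k;
    return VEC_WIDTH;
}

/* Emulates a vector compare against NUL: index of the first NUL element
 * in a, or VEC_WIDTH if none is present. */
static size_t first_nul(const unsigned char *a)
{
    for (size_t k = 0; k < VEC_WIDTH; k++)
        if (a[k] == 0) return k;
    return VEC_WIDTH;
}

/* Returns 1 if the two null-terminated strings are equal, 0 otherwise. */
int vec_str_equal(const unsigned char *a, const unsigned char *b)
{
    size_t i = 0;
    for (;;) {                                 /* top of loop (105)       */
        const unsigned char *va = a + i;       /* load A[i] (110)         */
        const unsigned char *vb = b + i;       /* load B[i] (115)         */
        size_t d = first_mismatch(va, vb);     /* compare A and B (120)   */
        if (d < VEC_WIDTH)                     /* branch on mismatch (125) */
            return first_nul(va) < d;          /* assumed exit handling   */
        if (first_nul(va) < VEC_WIDTH)         /* compare A to NUL (130)  */
            return 1;                          /* branch on NUL (135)     */
        i += VEC_WIDTH;                        /* increment counters (140) */
    }                                          /* branch to top (145)     */
}
```

In this sketch each iteration contains two data-dependent exits in addition to the backward branch, mirroring the three branches of the loop described above.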
As shown in FIG. 1, code 100 is an eight-instruction loop. Code 100 has three different branches: two branches (125 and 135) in the middle of the loop and one branch (145) at the bottom of the loop that returns to the top of the loop. Typically, one of the two branches (125 or 135) in the middle of the loop will be taken to exit from the loop. Consequently, execution of the loop is complex and inefficient, since it is difficult to predict which of the branches will be taken.
Therefore, a need exists in the art for a less complex and more efficient string comparison operation. In view of the above, improved methods and mechanisms for performing string comparison operations are desired.