Processors (for example, microprocessors and microcontrollers) are commonly employed to perform tasks involving strings of character values. One such task is to compare two strings of character values, and to return one value (for example, a zero) if the two strings are determined to be the same but to return another value (for example, a one) if the two strings are determined to be different. Each character value may, for example, be an eight-bit ASCII character value that represents a character. A binary ASCII value of “01001000” represents a capital “H” character. A binary ASCII value of “01000101” represents a capital “E” character. A binary ASCII value of “01001100” represents a capital “L” character. A binary ASCII value of “01001111” represents a capital “O” character. A binary ASCII value of “00000000” represents a special ASCII character called the “NULL” character. The NULL character value is placed at the end of the string of character values where the string is stored in memory to delineate the end of the character string.
Consider an example in which a program executing on a processor is to read character values out of memory and to determine whether the character values are a certain string of character values. FIG. 1 (Prior Art) illustrates a portion of a C language program that performs a string compare task. Line 10 indicates that a first string of character values (representing the character string “HELLO”) is pointed to by a first pointer HTSTR1. This pointer points to the location in memory where the first eight-bit ASCII character value of the first string is found. Character values of the first string are stored in a corresponding string of adjacent memory locations. A NULL character value (“00000000”) is present at the end of the string of memory locations after the last character value representing the capital “O”. This NULL character value indicates the end of the first string of character values.
Line 11 indicates that a second string of character values (representing the character string “HELLO”) is pointed to by a second pointer HTSTR2. This second pointer points to the location in memory where the first character value of the second string is found. The end of the second string is also marked by a NULL character value (“00000000”).
Line 12 calls a string compare function “STRCMP”. The STRCMP function compares the first string pointed to by the pointer HTSTR1 to the second string pointed to by the pointer HTSTR2. If the two strings are equal, then the value of the function is zero; otherwise, the value of the function is one. The value of the function is assigned to the digital value ERROR. Accordingly, ERROR is set to zero if the two strings match, whereas ERROR is set to one if the two strings do not match.
The string compare task is so commonly needed that a programmer does not typically need to write code to perform the string compare operation. Rather, the programmer writes the STRCMP function call into the C program where a string compare is needed. When the C program is compiled, the C compiler supplies, for the STRCMP function, a previously written and compiled portion of machine code which, when executed, performs the string compare function.
FIG. 2 (Prior Art) illustrates an amount of assembly code which, when assembled and executed, carries out the STRCMP function of FIG. 1. Line 13 is a label “STRCMP”. Line 14 loads register R0 of a processor with the pointer HTSTR1. This pointer points to the memory location where the first character value of the first string is stored. Line 15 loads register R1 of the processor with the pointer HTSTR2. This pointer points to the memory location where the first character value of the second string is stored. Line 16 is a label for LOOP. Line 17 obtains the pointer HTSTR1 out of register R0, retrieves the content of the memory location pointed to by the pointer, and loads that value into register R2. After the load, the LD instruction automatically increments the content of R0. After execution of the LD instruction, register R0 therefore contains a pointer that points to the location in memory where the second character value of the first string is stored.
The LD instruction of line 18 is similar to the instruction of line 17. It obtains the pointer HTSTR2 out of register R1, retrieves the content of the memory location pointed to by the pointer, and loads that value into register R3. Then the pointer value in R1 is incremented so that the content of R1 points to the second character of the second string in memory.
Line 19 is the compare to zero instruction CPZ that takes a single operand, in this case the content of register R2. Execution of the CPZ instruction compares the content of register R2 to zero. If the result is true (the content of register R2 is zero), then the CPZ instruction sets a zero bit (“Z”) in a flag register in the processor. Otherwise, the zero bit is cleared. In the present example, the first character of the first string (see line 10) is “H”. Execution of line 17 loaded a digital “01001000” into register R2. The compare to zero of line 19 therefore compares “01001000” to “00000000”, and does not set the zero bit. The zero bit is therefore a digital zero (i.e., cleared).
Next, line 20 causes program execution to jump to the label “DONE” if the zero bit is true (i.e., is set). In the present example, the first character of the first character string was not a NULL character, so the zero bit was not set in line 19, and program execution does not jump to the DONE label.
Next, line 21 compares the content of register R3 to zero and sets the zero bit if the content of register R3 is zero. In the present example, the first character of the second character string is not a NULL character. The compare of line 21 does not result in the zero bit being set. Line 22 therefore does not result in program execution jumping to the DONE label. It is therefore seen that the instructions of lines 19-22 check to make sure that neither of the first character values of the two strings is “00000000”. If either of the first character values is “00000000”, then program execution is made to jump to the DONE label.
Next, line 23 causes the ASCII value of the first character of the first string in register R2 to be subtracted from the ASCII value of the first character of the second string in register R3. If the result is zero, then the zero bit (“Z”) in the flag register is set, otherwise it is cleared. In the present example, the first character of the first string is “H” and the first character of the second string is “H”. The content of register R2 is therefore a digital “01001000” and the content of register R3 is a digital “01001000”. The SUB instruction of line 23 therefore sets the zero bit in the flag register.
Next, line 24 causes program execution to jump back to the label “LOOP” if the zero bit is set (the first character values of the two strings are identical). Otherwise, the zero bit is cleared, and no jump is taken. In the present example, the SUB instruction of line 23 set the zero bit, so the JP instruction of line 24 results in program execution jumping back to the LOOP label.
The pointer in register R0 was incremented at the end of execution of the LD instruction of line 17, and the pointer in register R1 was incremented at the end of execution of the LD instruction of line 18. Accordingly, when program execution jumps back to the LOOP label, register R0 contains a pointer that points to the memory location where the second character value of the first string is stored. Similarly, register R1 contains a pointer that points to the memory location where the second character value of the second string is stored. Therefore, in a second pass through the loop, the second character values of the two strings are tested to see if one of them is the NULL character value, and if neither is, then the two character values are compared to see if they match. This process repeats, character value by character value, through the strings.
When the “00000000” character values at the ends of the two strings are loaded into registers R2 and R3, the character value in register R2 is checked in line 19 to see if it is zero. In this case, R2 does contain “00000000”. Program execution jumps to the DONE label. Line 27 subtracts the content of register R2 from the content of register R3. In the present example, R2 contains “00000000” representing the NULL at the end of the first string and R3 contains “00000000” representing the NULL at the end of the second string. The SUB instruction of line 27 places the result of the subtraction in the R2 register, thereby overwriting the content of the R2 register. In the present example, there are ASCII NULL character values terminating both strings. The subtraction of line 27 therefore results in a zero being placed in register R2.
Next, the return instruction RET causes program execution to return to line 12, which called the STRCMP function. If the value in R2 is zero, then the two strings matched. If, on the other hand, the value in R2 is not zero, then when a NULL value (“00000000”) was encountered in one string, the characters at corresponding locations in the two strings did not match, thereby indicating that the two strings did not match.
Note in the example of FIGS. 1 and 2 that the loop includes two compare to zero CPZ instructions and associated jump JP instructions to test each string for a NULL termination character value. The compare and jump instructions are in lines 19-22. Executing these testing instructions to look for NULL character values consumes processing resources and increases the number of instructions in the loop. Where the strings of character values being compared are long, the loop may be traversed many times. The instructions used to look for the NULL termination values therefore are executed many times and add significantly to the time required to carry out the overall higher level string compare task. An improved processor architecture and instruction set are desired.