Computer systems have become increasingly pervasive in our society. The processing capabilities of computers have increased the efficiency and productivity of workers in a wide spectrum of professions. As the costs of purchasing and owning a computer continue to drop, more and more consumers have been able to take advantage of newer and faster machines. Furthermore, many people enjoy the use of notebook computers because of the freedom they provide. Mobile computers allow users to easily transport their data and work with them as they leave the office or travel. This scenario is quite familiar to marketing staff, corporate executives, and even students.
As processor technology advances, newer software code is also being generated to run on machines with these processors. Users generally expect and demand higher performance from their computers regardless of the type of software being used. One performance issue can arise from the kinds of instructions and operations that are actually being performed within the processor. Certain types of operations require more time to complete based on the complexity of the operations and/or the type of circuitry needed. This provides an opportunity to optimize the way certain complex operations are executed inside the processor.
Communications applications have been driving microprocessor development for more than a decade. In fact, the line between computing and communication has become increasingly blurred due, in part, to the use of textual communication applications. Textual applications are pervasive within consumer segments and across numerous devices, from cell phones to personal computers, requiring faster and faster processing of text information. Textual communication capabilities continue to find their way into computing and communication devices in the form of applications such as Microsoft® Instant Messenger™, email applications such as Microsoft® Outlook™, and cell phone texting applications. As a result, tomorrow's personal computing and communications experience will be even richer in textual capability.
Accordingly, the processing or parsing of text information communicated between computing or communication devices has become increasingly important for current computing and communication devices. In particular, the interpretation of strings of text information by a communication or computing device includes some of the most important operations performed on text data.
The Boyer-Moore string search algorithm, developed by Robert S. Boyer and J Strother Moore in 1977, is an efficient string searching algorithm that serves as a standard benchmark for practical string searches. The algorithm preprocesses the string being searched for (the pattern), but not the string being searched in (the text). It is therefore well-suited for applications in which the text does not persist across multiple searches. The Boyer-Moore algorithm uses information gathered during the preprocessing step to skip sections of the text, resulting in a lower constant factor than many other string search algorithms. In general, the algorithm runs faster as the pattern length increases.
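As an illustration of how preprocessing the pattern lets the search skip sections of the text, the following is a minimal sketch in Python. It is a simplification and not the complete Boyer-Moore algorithm: it applies only the bad-character rule and omits the good-suffix rule. The function names `build_bad_char_table` and `boyer_moore_search` are hypothetical names chosen for this example.

```python
def build_bad_char_table(pattern):
    """Preprocess the pattern only: map each character to the index of its
    rightmost occurrence in the pattern."""
    return {ch: i for i, ch in enumerate(pattern)}


def boyer_moore_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1.

    Simplified sketch: bad-character rule only (no good-suffix rule).
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    last = build_bad_char_table(pattern)
    s = 0  # current alignment of the pattern against the text
    while s <= n - m:
        j = m - 1
        # Compare the pattern against the text from right to left.
        while j >= 0 and pattern[j] == text[s + j]:
            j -= 1
        if j < 0:
            return s  # every pattern character matched
        # Shift so that the rightmost occurrence of the mismatched text
        # character lines up with it; always advance by at least one.
        s += max(1, j - last.get(text[s + j], -1))
    return -1


print(boyer_moore_search("here is a simple example", "example"))
```

When the mismatched text character does not occur in the pattern at all, the shift moves the pattern past it entirely, which is why longer patterns tend to allow longer skips.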
Search operations on strings of text information may be computationally intensive, but they offer a high level of data parallelism that can be exploited through an efficient implementation using various data storage devices, such as, for example, single instruction multiple data (SIMD) registers. Vectorized searches have been implemented in various libraries, e.g., using SIMD instructions. For example, Streaming SIMD Extensions 4 (SSE4) for certain Intel® architecture processors, and particularly SSE4.2, includes SIMD instructions that perform character searches and comparisons on two operands of a particular number of bytes (e.g., sixteen) at a time. Some current architectures require multiple operations, instructions, or sub-instructions (often referred to as “micro-operations” or “uops”) to perform various logical and mathematical operations on a number of operands, thereby diminishing throughput and increasing the number of clock cycles required to perform the logical and mathematical operations.
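The data parallelism in such a search can be pictured with a scalar model written in Python. This is only a conceptual sketch and does not emulate any particular SSE4.2 instruction: each position of a sixteen-byte block is tested independently against a small set of characters, which is the work a SIMD comparison could perform for all positions at once. The names `BLOCK`, `match_mask`, and `find_first_of` are hypothetical.

```python
BLOCK = 16  # bytes per operand, following the sixteen-byte example in the text


def match_mask(block, charset):
    """Return a bitmask in which bit i is set when block[i] equals any byte
    in charset. Each position is independent of the others, so hardware
    could evaluate all of them in parallel; this model loops instead."""
    mask = 0
    for i, b in enumerate(block):
        if b in charset:
            mask |= 1 << i
    return mask


def find_first_of(text, charset):
    """Scan text one block at a time and return the index of the first byte
    that belongs to charset, or -1 if there is none."""
    for base in range(0, len(text), BLOCK):
        mask = match_mask(text[base:base + BLOCK], charset)
        if mask:
            # The lowest set bit marks the first matching position in the block.
            return base + (mask & -mask).bit_length() - 1
    return -1


print(find_first_of(b"hello, world", b",;"))
```

The per-block result is kept as a mask rather than a single index so that later steps could inspect every match in the block, not only the first one.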
For example, a sequence of several instructions may be required to perform one or more operations necessary to interpret particular words of a text string, including comparing two or more text words represented by various datatypes within a processing apparatus, system or computer program. However, such prior art techniques may require numerous processing cycles or extra instructions, and thus may cause a processor or system to consume unnecessary power and/or processing cycles in order to generate their results. Furthermore, some prior art techniques may require additional processing of data during the search in order to be useful in standard benchmarks for practical string searches, such as a Boyer-Moore string search.
To date, potential solutions to such performance-limiting and efficiency-limiting issues have not been adequately explored.