Computer systems are highly useful for compiling and processing large amounts of data. With the increase of storage capacity in modern microcomputers, it is possible to store large databases within a microcomputer system in a home or office environment. The introduction of laser discs for storing vast amounts of data has made it possible for a microcomputer to access databases including encyclopedias, textbooks or the like.
As more information is added to the microcomputer system, it is increasingly important to have efficient techniques for locating desired information. Many systems of the prior art use search techniques to locate user-specified text. The user enters the desired search request in a manner specified by the application program. Frequently, a search string will involve more than one word term and may be accompanied by logical expressions relating the word terms. For example, a user may wish to search an encyclopedia for a document containing information about computer memory devices, but may not want information about read-only memories (ROM). The search string may appear as "computer AND memory NOT ROM". Of course, many different formats of search strings may be used by different application programs, and the search string above is provided only as one example of the many possible search strings. For the sake of simplicity, word terms are referred to throughout this document by the letters A, B, C, D, etc. It is obvious to those skilled in the art that the word terms represented by these letters may be any user-selected words.
Computer text searchers must evaluate both the word terms selected by the user and the user-specified conditions that interrelate the word terms. In the example above, a text file must contain the word terms "computer" and "memory" and must not contain the word term "ROM". Thus, there are word terms that must be detected and logical terms interrelating the word terms that must be satisfied. Typical examples of logical terms that define the interrelationships of word terms include AND, OR, NOT, PHRASE, and NEAR. The AND term is an exclusive logical operator and will detect only text that includes all of the specified word terms. For example, A AND B AND C will only detect the occurrence of all three word terms A, B, and C. It will not detect occurrences of only A and B, nor will it detect occurrences of only A and C. The OR term is an inclusive logical operator and will detect any of the word terms interrelated by the logical OR operator. For example, A OR B OR C will detect all occurrences in a text file of A, B, or C. The NOT logical operator is also an exclusion operator and will not detect any occurrences in which the specified word term is included. For example, NOT A instructs the text searcher not to include any text file in which the term A appears. The PHRASE logical operator is similar to the AND logical operator except that it relates to sequences of multiple word terms which must appear in the specified sequence with no other word terms between them. For example, PHRASE ABC will only detect the occurrence of the three consecutive word terms ABC and will not detect AB or ABDC or CBA. The NEAR logical operator relates word terms by their locations within the text file. For example, A NEAR5 B will detect all occurrences of the word term A within five word terms of B. Similarly, A OR B NEAR10 C will detect all occurrences of either A or B that occur within ten word terms of C.
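The behavior of the five logical terms described above can be illustrated with a minimal Python sketch. It assumes a text file has already been split into a list of word terms; the function names (positions, match_and, match_or, match_not, match_phrase, match_near) are hypothetical and chosen only for this illustration, and the sketch is not the search method of any particular prior art system.

```python
def positions(tokens, term):
    """Return the 0-based locations at which `term` occurs in `tokens`."""
    return [i for i, tok in enumerate(tokens) if tok == term]


def match_and(tokens, *terms):
    """AND: every specified word term must be present."""
    return all(term in tokens for term in terms)


def match_or(tokens, *terms):
    """OR: at least one of the specified word terms must be present."""
    return any(term in tokens for term in terms)


def match_not(tokens, term):
    """NOT: the specified word term must be absent."""
    return term not in tokens


def match_phrase(tokens, *terms):
    """PHRASE: the word terms must appear consecutively, in the given order."""
    n = len(terms)
    return any(
        tuple(tokens[i:i + n]) == terms
        for i in range(len(tokens) - n + 1)
    )


def match_near(tokens, a, b, distance):
    """NEAR: some occurrence of `a` lies within `distance` word terms of `b`."""
    return any(
        abs(i - j) <= distance
        for i in positions(tokens, a)
        for j in positions(tokens, b)
    )
```

Under these assumptions, match_phrase("A B C".split(), "A", "B", "C") is true, while the same call on "A B D C".split() is false because D intervenes between B and C.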
Wild card designators, such as "*", are often used in combination with characters to provide a broad search. For example, a search for the text string "bake*" will locate any words that begin with "bake", such as baked, baker, bakery, etc.
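As an illustration of the wild card search just described, the following sketch expands a pattern against a list of known words. It assumes that only a single trailing "*" is supported and that matching is case-sensitive; the function name expand_wildcard and the sample vocabulary are hypothetical.

```python
def expand_wildcard(pattern, vocabulary):
    """Return the words in `vocabulary` matched by `pattern`.

    Assumption: only a single trailing "*" is treated as a wild card.
    A pattern without a trailing "*" must match a word exactly.
    """
    if pattern.endswith("*"):
        prefix = pattern[:-1]
        return [word for word in vocabulary if word.startswith(prefix)]
    return [word for word in vocabulary if word == pattern]


# Hypothetical vocabulary used only for this example.
vocabulary = ["baked", "baker", "bakery", "bacon", "bread"]
print(expand_wildcard("bake*", vocabulary))  # ['baked', 'baker', 'bakery']
```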
Often, text searchers use a binary search tree to evaluate logical operators one at a time. Each time a logical operator is evaluated, an intermediate list is created to store the results of the binary search. The text file must be examined again to evaluate the next logical operator and another intermediate list is created. Finally, the various intermediate lists are evaluated to determine the final result of the search. For example, the search string A AND B AND C using a binary search tree requires a search of the text file for the logical operator term A AND B. The results are stored in an intermediate file, which for the sake of convenience will be called X. Thus, X = A AND B. The text file must be searched a second time for the logical operator term X AND C. This search system requires multiple passes through the text file and also requires the storage of intermediate lists, which may occupy large amounts of memory space.
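The cost of this one-operator-at-a-time approach can be made concrete with a short sketch. It assumes the searched text is modeled as a dictionary mapping a record identifier to that record's list of word terms, and it keeps every intermediate list so that their number can be inspected; the name evaluate_and_chain and the sample records are hypothetical.

```python
def evaluate_and_chain(records, terms):
    """Evaluate `terms[0] AND terms[1] AND ...` one operator at a time.

    Assumes at least two terms. Returns the final list of matching record
    identifiers together with every intermediate list that was created.
    Each operator costs one full pass over `records`.
    """
    first, second, *rest = terms
    intermediates = []

    # First pass: X = A AND B.
    x = [rid for rid, tokens in records.items()
         if first in tokens and second in tokens]
    intermediates.append(x)

    # One additional full pass per remaining operator: X = X AND term.
    for term in rest:
        previous = set(x)
        x = [rid for rid, tokens in records.items()
             if rid in previous and term in tokens]
        intermediates.append(x)

    return x, intermediates


# Hypothetical sample data: record identifier -> word terms in that record.
records = {1: ["A", "B", "C"], 2: ["A", "B"], 3: ["A", "C"]}
result, intermediates = evaluate_and_chain(records, ["A", "B", "C"])
print(result)         # [1]
print(intermediates)  # [[1, 2], [1]]
```

For a search string with n word terms joined by AND, this sketch makes n - 1 passes over the records and creates n - 1 intermediate lists, which is the overhead the paragraph above describes.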
Binary search trees often encounter problems in situations that require simultaneous evaluation of multiple word terms. For example, the search string A NEAR3 B NEAR3 C cannot be simultaneously evaluated by a binary search tree. Instead, the approach taken by a binary search tree is to evaluate A NEAR3 B and store the results, which we shall call X, in an intermediate file. In a second pass, the binary search tree evaluates X NEAR3 C. A sample list of occurrences may include A at location 1, B at location 3 and C at location 5. The binary search tree first evaluates A NEAR3 B and produces location 1 as a result of the search. In the second pass of the binary search tree, X NEAR3 C will detect no occurrences since location 1 is not within 3 word terms of location 5, even though B is within 3 word terms of C. Prior art systems go to great lengths to avoid this type of problem, but the result is a complex search routine with slow execution times and even greater memory requirements.
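The sample list of occurrences above can be worked through in a short sketch. The function near_pairwise models the two-pass approach, in which only the locations of the left-hand term survive into the intermediate list X. The function near_chain_simultaneous is a brute-force illustration of examining all three terms together; it assumes that A NEAR3 B NEAR3 C is intended to mean that each term lies within three word terms of the next, and it is offered only as a sketch of that reading, not as the method of any particular system. Both function names are hypothetical.

```python
from itertools import product


def near_pairwise(left, right, distance):
    """Return the locations in `left` within `distance` of any location in `right`."""
    return [p for p in left if any(abs(p - q) <= distance for q in right)]


def near_chain_simultaneous(position_lists, distance):
    """Return every combination of locations, one per term, in which each
    term lies within `distance` of the next term in the chain.

    Brute force over all combinations; for illustration only.
    """
    return [
        combo
        for combo in product(*position_lists)
        if all(abs(x - y) <= distance for x, y in zip(combo, combo[1:]))
    ]


a, b, c = [1], [3], [5]  # A at location 1, B at location 3, C at location 5

x = near_pairwise(a, b, 3)                        # first pass: X = A NEAR3 B
print(x)                                          # [1]
print(near_pairwise(x, c, 3))                     # [] (location 1 is 4 terms from 5)
print(near_chain_simultaneous([a, b, c], 3))      # [(1, 3, 5)]
```

The second pass fails because the intermediate list records only where A occurred and discards the location of B, which is the information needed to relate the match to C.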
Many of these prior art systems are inefficient because they require several passes through the entire text file to locate the user-selected text which satisfies the user-specified conditions. Other techniques used by databases on large computer systems are not appropriate in a microcomputer environment because of the requirements for large amounts of memory. Therefore, it can be appreciated that there is a significant need for a system and method for simultaneously evaluating Boolean operators in a microcomputer environment.