This invention is directed to a method and apparatus for retrieving information from a computer database. More particularly, the invention provides a system for identifying records within a computer database which contain text matching, or similar to, an operator-designated input expression. The invention is applicable, for example, as a word look-up device in an electronic dictionary.
Since the emergence of relatively low-cost mass data storage devices, industry has relied on computers to store information for ready access and updating. Early systems for retrieving the stored information were relatively crude. Typically, an operator seeking a specific data record had to scan a lengthy data printout to obtain the record's key. The operator was then required to enter this key to cause the data record to be printed or displayed on the console.
Today, non-computer users are often called upon to perform their own computer searches to retrieve information stored in computer databases. Libraries, for example, store cataloging records on-line to facilitate access to library collections. Trademark search firms utilize computerized trademark databases to speed the identification of registered trademarks and pending trademark applications. Publishing companies provide computer-readable dictionaries for use with word processors to enhance the capabilities of automated office work stations.
This increased computer use has created a need for improved database search apparatus. Rather than relying upon obscure record keys, these systems must perform searches based on operator-designated words or expressions. At a minimum, a search apparatus must permit the identification of database records which contain text exactly matching an input expression. A more sophisticated apparatus must match not only exact expressions but also "wild card" expressions, i.e., those which include special characters, e.g., "#" or "*", which match any intermediate or subsequent characters. For example, the look-up expression "characteriz*" might be used to attempt to match database records having any of the strings "characterize", "characterizes", "characterizing", or "characterized".
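By way of illustration only, the wild-card matching described above can be sketched as follows. The sketch assumes Python's standard `fnmatch` module as a stand-in for a wild-card search apparatus; the record list and the function name `wildcard_search` are hypothetical and are not drawn from any particular prior system. Only the "*" character is demonstrated.

```python
from fnmatch import fnmatchcase

# Hypothetical record texts, chosen only to illustrate the matching behavior.
RECORDS = [
    "characterize",
    "characterizes",
    "characterizing",
    "characterized",
    "character",
    "charter",
]


def wildcard_search(pattern, records):
    """Return the records whose text matches the wild-card pattern.

    In fnmatch syntax, "*" matches any run of characters, including none.
    """
    return [text for text in records if fnmatchcase(text, pattern)]


matches = wildcard_search("characteriz*", RECORDS)
print(matches)
```

With this record list, the expression "characteriz*" selects the four verbal forms and rejects "character" and "charter", since neither begins with the literal portion "characteriz".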
The art currently provides a variety of wild-card database searching systems. While clearly an improvement over prior facilities, the wild-card systems suffer drawbacks. For one, the systems often match an over-inclusive set of database records. In the preceding example, if the operator wished to find all records containing forms of the verb "to characterize", he or she would enter the wild card expression "characteriz*". However, in addition to matching the verbal forms "characterize", "characterizes", "characterizing", and "characterized", the wild card expression might also match records having the nominal forms "characterizer", "characterizers", "characterizer's", "characterizers'", "characterization", "characterizations", "characterization's", and "characterizations'".
Moreover, the currently available systems do not recognize misspellings. If, for example, the operator misspells the literal portion of the search string, e.g., instead of "characteriz*" the operator enters "caracteriz*", the search apparatus will not find any matching records. This seriously impedes the utility of such a system for application as an electronic dictionary. Likewise, these systems are difficult to use in trademark search applications. There, the operator is required to enter all possible phonetic spellings of the mark of interest in order to find all trademarks and trade names which sound the same.
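The misspelling problem can be illustrated with a short sketch. It again assumes Python's standard `fnmatch` module as a stand-in for a wild-card system, and it uses `difflib.SequenceMatcher` merely to show that the misspelled and correct strings are textually close; that similarity measure is an illustrative assumption and is not the matching technique of the invention. The names `DATABASE`, `find_records`, and `similarity` are hypothetical.

```python
from difflib import SequenceMatcher
from fnmatch import fnmatchcase

# Hypothetical record texts, for illustration only.
DATABASE = ["characterize", "characterized", "carte blanche"]


def find_records(pattern, records):
    """Wild-card search: the literal portion must match exactly."""
    return [text for text in records if fnmatchcase(text, pattern)]


def similarity(a, b):
    """Rough textual similarity between 0.0 and 1.0 (illustrative measure only)."""
    return SequenceMatcher(None, a, b).ratio()


correct = find_records("characteriz*", DATABASE)
misspelled = find_records("caracteriz*", DATABASE)

print(correct)     # records found with the correctly spelled expression
print(misspelled)  # empty: one missing letter defeats the wild-card search
print(similarity("caracterize", "characterize"))
```

Although the two spellings differ by a single letter, the wild-card search returns nothing for the misspelled expression, which is the shortcoming noted above.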
An object of the invention, accordingly, is to provide an improved database search facility. More particularly, an object of the invention is to provide a search facility responsive to a textual input expression for locating database records having text which matches, or is similar to, the input expression.
Another object of the invention is to provide a database search facility which can match records in spite of operator spelling errors.
Still another object of the invention is to provide a database search facility which locates database records having inflectional forms which differ from that of the designated input expression.
Yet another object of the invention is to provide a database search apparatus which is user-friendly and which operates with sufficiently high speed for convenient on-line use and operator interaction.
Further, an object of the invention is to provide a database search apparatus which matches one-part expressions, e.g., the word "characterize", as well as multiple-part expressions, e.g., the phrase "carte blanche".
Other objects of the invention are evident throughout the description which follows.