Search engines, such as Internet search engines, have been in use for some time. Such search engines permit the user to form a search query using combinations of keywords to search through a web page database containing text indices associated with one or more distinct web pages. The search engine looks for matches between the search query and the text indices in the web page database, and then returns a number of hits, each of which corresponds to a URL pointer and a text excerpt from a web page that represents one of the closest matches.
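The keyword matching described above can be sketched as a small inverted index. This is a minimal illustration only, not the implementation of any particular search engine: the page data, the names `PAGES`, `build_index`, and `search`, and the ranking by count of matching keywords are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical web page database: URL -> page text (illustrative data only).
PAGES = {
    "http://example.com/a": "the bank raised interest rates",
    "http://example.com/b": "we walked along the river bank",
    "http://example.com/c": "interest in river rafting is growing",
}

def build_index(pages):
    """Build a text index mapping each keyword to the URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, pages, query, excerpt_len=30):
    """Return (url, excerpt) hits, ranked by how many query keywords match."""
    counts = defaultdict(int)
    for word in query.lower().split():
        for url in index.get(word, ()):
            counts[url] += 1
    # Assumed ranking: most matching keywords first, ties broken by URL.
    ranked = sorted(counts, key=lambda u: (-counts[u], u))
    return [(url, pages[url][:excerpt_len]) for url in ranked]

INDEX = build_index(PAGES)
HITS = search(INDEX, PAGES, "river bank")
```

Here the page containing both "river" and "bank" is returned first, followed by the pages that match only one of the two keywords.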
Some Internet search engines attempt to detect when a user has entered a query incorrectly. For example, the Google™ search engine employs a “Did you mean . . . ?” feature that essentially runs a spellchecker on user queries. The spellchecker attempts to detect when an entered word is misspelled by checking it against a database of common words and their misspellings. When a possible misspelling is detected, the search engine may prompt the user to invoke an alternative query in which the misspelled word is spelled correctly.
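A lookup against a table of known misspellings can be sketched as follows. This is a toy illustration under stated assumptions and does not represent how any actual search engine implements the feature: the table `MISSPELLINGS` and the function `suggest_query` are hypothetical names introduced for the example.

```python
# Hypothetical table of common misspellings -> correct spelling.
MISSPELLINGS = {
    "recieve": "receive",
    "teh": "the",
    "adress": "address",
}

def suggest_query(query):
    """Return an alternative query if any word is a known misspelling, else None."""
    words = query.split()
    # Replace each word found in the table; leave all other words untouched.
    corrected = [MISSPELLINGS.get(w.lower(), w) for w in words]
    return " ".join(corrected) if corrected != words else None
```

A caller would show the prompt only when `suggest_query` returns a value other than `None`, leaving the decision to run the alternative query with the user.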
Some search engines utilize natural language processing (NLP) techniques. Word sense disambiguation, the process of identifying which sense of a word is used in any given sentence, is a common challenge in any semantic NLP system.
Several NLP systems deal with disambiguation by consulting a comprehensive body of world knowledge. This knowledge is encoded in hierarchies or ontologies, together with many simple factual statements about the world. Entities are defined in relation to other entities, and semantic maps are created which assist in disambiguating words based on the context in which those words are used. The problem with this approach is that successful disambiguation requires very large ontologies and relational maps, which take a great deal of effort and time to assemble. Even the most successful efforts to date have fallen short of a human-like capacity to disambiguate based on context.
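The dependence of this approach on hand-built relational knowledge can be seen in a miniature sketch. The semantic map, the sense labels, and the function `disambiguate` below are hypothetical, and the overlap-counting heuristic is chosen only for illustration; it is not a method attributed to any of the systems discussed.

```python
# Hypothetical miniature semantic map: word -> sense -> related entities.
SEMANTIC_MAP = {
    "bank": {
        "financial_institution": {"money", "loan", "account", "deposit"},
        "river_edge": {"river", "water", "shore", "fishing"},
    },
}

def disambiguate(word, sentence):
    """Pick the sense whose related entities overlap most with the context."""
    context = set(sentence.lower().split()) - {word}
    senses = SEMANTIC_MAP.get(word, {})
    best, best_score = None, 0
    for sense, related in senses.items():
        score = len(context & related)
        if score > best_score:
            best, best_score = sense, score
    return best  # None when the map holds no evidence for any sense
```

A sentence such as "the bank was closed" yields `None`, because none of its context words appear in the map. Covering such cases requires adding ever more entities and relations, which is the scaling burden noted above.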