Interaction with automated programs, systems, and services has become a routine part of most people's lives, especially with the advent of the Internet. Web surfing or browsing, for instance, may even be the “new” national pastime for a certain segment of the population. Alongside such systems, applications such as word processors have helped many people become more efficient in their jobs and in their personal lives, for example when typing a letter or e-mail to a friend. Many automated features have been added to these applications, such as tools for formatting documents in substantially any desired font, color, shape, or form. One tool that has been well received by many users is the spell checker, which is either invoked by a user from the word processor to check all or portions of a document and/or run in the background to check spelling as the user types. Generally, in order to perform accurate spell checking, a dictionary of “valid strings” may be employed by the spell checking application. If a spell checker encounters a string not in the dictionary, it may hypothesize that the string is a spelling error and attempt to find the “closest” string in the dictionary for the misspelled string. Most spell checkers provide a list of possible matches to the user, whereby if the intended word is on the list, the user can select the correctly spelled word from the list. Other spell checking features may perform automatic corrections, if so configured by the user.
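As a rough illustration of the dictionary-based checking described above, the following Python sketch flags a string that is absent from a dictionary of valid strings and ranks the dictionary entries by closeness. The use of Levenshtein edit distance as the measure of “closest”, the function names edit_distance and suggest, and the small DICTIONARY set are all assumptions made for illustration; a given spell checker may measure closeness quite differently.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def suggest(word: str, dictionary: set[str], max_suggestions: int = 3) -> list[str]:
    """Return candidate corrections, or an empty list if the word is valid."""
    if word in dictionary:
        return []
    # Rank every entry by distance; ties are broken alphabetically.
    ranked = sorted(dictionary, key=lambda entry: (edit_distance(word, entry), entry))
    return ranked[:max_suggestions]


# Hypothetical dictionary of "valid strings" (illustration only).
DICTIONARY = {"letter", "latter", "better", "friend", "document"}

suggestions = suggest("leter", DICTIONARY)
```

Here the string "leter" is not in the dictionary, so the sketch treats it as a possible misspelling and returns a short list from which a user could select the intended word. A string already in the dictionary yields no suggestions.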
Spell checking for word processing, however, presents only a partial view of the areas in which users may be assisted when entering information into a file or document. For example, with all the web sites and services available, users often navigate between sites by explicitly typing in all or portions of the site name. As many have come to find out, if the site information is entered incorrectly, the cost in time to re-navigate can become quite high. Language processors employed in search engines or other applications often process user queries and may attempt to distinguish actual user commands from incorrectly entered information. As can be appreciated, however, the information entered as a query to a search engine may be quite different in structure or form from that typically entered in a word processing application. Thus, tools that check words on a somewhat individual and isolated basis in a word processor application may have little or no utility when applied to information generated from general query data.
Browser queries and other types of queries for information present a unique problem for spell checking applications, since the queries often consist of words, such as proper names, that may not be found in a standard spell-checking dictionary. Another problem is that a word in a query may have been entered incorrectly but not be spelled incorrectly; that is, the entered string may itself be a valid word, just not the one the user intended. In general, the manner in which people enter text into a type-in line, such as an input box to a search engine, is often very different from typing for word processing. Both what is entered and the types of errors people make with respect to query input are also quite different in nature. As such, a standard dictionary, while suitable for spell checking in the context of word processing, may not be appropriate for type-in-line spell checking.
A dictionary is an important component of any spell checker, since the information contained therein provides the foundation for determining incorrect spellings. However, for many applications where spell checking is desired (e.g., text input provided to input boxes), a standard dictionary is not optimal. For instance, to spell check text input to the input box of a search engine, a dictionary should include strings such as “pictures of the President”, “hanging chad”, and “Apolo Anton Ohno” in order to cover recent events or information that may be of interest. As can be appreciated, these and a plurality of other such strings would not appear in a standard dictionary. One possible approach to creating such a dictionary may be to derive a subset of potential entries from a log of what users are typing into a particular location such as a search engine or language processor. Unfortunately, a problem with this approach is that the query logs will generally also contain a large number of spelling errors, which is a major reason why spell checking is employed in the first place. Since the logs contain errors, a lexicon built from the logs cannot be utilized reliably for spell checking.
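The difficulty with a log-derived lexicon can be seen in a minimal Python sketch. The function name build_lexicon, the QUERY_LOG contents, and the two misspelled entries are hypothetical and chosen only to illustrate the point; the sketch simply treats every distinct logged query as a valid string.

```python
from collections import Counter


def build_lexicon(query_log: list[str]) -> set[str]:
    """Naively accept every distinct logged query as a valid string."""
    counts = Counter(q.strip().lower() for q in query_log if q.strip())
    return set(counts)


# Hypothetical query log; the two misspelled entries are invented for illustration.
QUERY_LOG = [
    "hanging chad",
    "hanging chad",
    "hanging chad",
    "hanging chadd",             # assumed typo
    "pictures of the president",
    "pictures of the presdent",  # assumed typo
]

lexicon = build_lexicon(QUERY_LOG)

# The misspelled query is now indistinguishable from a valid entry.
typo_accepted = "hanging chadd" in lexicon
```

Because the misspelled queries enter the lexicon alongside the correct ones, a spell checker consulting this lexicon would accept "hanging chadd" as valid and would never flag it.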