1. Field of the Invention
The invention relates to computer communications systems and more particularly to spell checking of resource identifications in a network environment.
2. Description of Related Art
In order to access specific World-Wide-Web (WWW) pages, users must often enter the Uniform Resource Locator (URL), which provides the address of the page on a remote server. However, as WWW browsers have evolved, the focus of the user interface has been to allow users to access remote pages by selecting hypertext links, thus often removing the need to manually enter URLs. Scant attention has been paid to the problems inherent in manual URL entry. Yet the explosive growth of the WWW has made it inconvenient to follow a long series of hypertext links to retrieve a page desired by the user. In fact, companies, organizations and individuals often provide their URLs in television advertisements, on printed materials, and verbally. This has led to a growing number of instances in which the user would prefer to enter the URL directly in the browser.
A major problem with the manual entry of URLs is the introduction of spelling errors, which are particularly common because of the characteristics of URL syntax and structure. URLs are often long and frequently include terms, such as "http", "com", "org", "gif", and "jpeg", that are not commonly known by users. URLs may also be in a foreign language, especially for users in non-English-speaking countries. Additionally, the URL may include odd special characters such as ~, , and @ that are difficult to type and hard to remember. The fact that URLs interpret upper and lower case letters differently is yet another source of user input error. Finally, the user is often relying on a quickly made note, or just on memory, of a URL that appeared briefly or was spoken in an advertisement. All of these factors taken together provide a rich basis for the introduction of spelling errors during manual entry of URLs.
In order to assist the user with manual URL entry, a spelling checker is needed. Spell checking in general is well established in the art, with numerous different implementation schemes. The central idea of a spelling checker is to take the word in question, compare it to a dictionary of legal spellings to find one or more words that are spelled roughly the same way, and then give the user the ability to choose the correct word from a list presented by the spelling checking program.
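The central idea described above can be illustrated with a minimal sketch. It is not drawn from any particular prior-art spelling checker: the function name `suggest`, the similarity cutoff, and the sample word list are assumptions chosen for illustration, and the similarity measure is the one provided by Python's standard `difflib` module.

```python
import difflib

def suggest(word, dictionary, max_suggestions=3, cutoff=0.7):
    """Return dictionary entries spelled roughly like `word`.

    `cutoff` is a similarity threshold between 0 and 1; the value 0.7 is
    an illustrative assumption, not a figure taken from the text.
    """
    return difflib.get_close_matches(
        word, dictionary, n=max_suggestions, cutoff=cutoff
    )

# Hypothetical dictionary of legal spellings.
legal_spellings = ["receive", "recipe", "relieve", "banana"]

# The candidates would be shown to the user, who picks the intended word.
candidates = suggest("recieve", legal_spellings)
print(candidates)
```

The sketch relies on the whole dictionary being available as a local list, which is the assumption a traditional spelling checker makes.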
However, traditional spelling checkers, using the prior art, are unsuitable for use in the WWW environment for several reasons. The dynamic nature of the WWW, where new URLs are constantly being created, precludes the use of a static dictionary. The sheer number of URLs precludes the use of a dynamic dictionary: as of April 1996 there were more than 30 million URLs on the WWW. Additionally, since the WWW operates in a client-server environment, only the server knows which URLs are valid for accessing WWW pages residing on that server. Servers often contain files (pages) that are not intended for general use, and the server administrators rely on the fact that only users who know the exact URLs can retrieve those files. The introduction of sophisticated spelling checkers for URLs must take this fact into account. Finally, the prior art provides no mechanism for utilizing knowledge obtained from other users' behavior.
As an example of the prior art, Netscape's Navigator WWW browser performs a simplistic spelling check on manually entered URLs. Specifically, the program tries to identify and correct problems with the protocol and server names. The program will try adding "http://" to the URL if no protocol is specified. It will also add "www." before and ".com" after the domain name if they are not present in the manually entered URL. These spelling check capabilities are simple and helpful, but they are not sufficiently robust or extensive to solve the general problem of spelling errors in manually entered URLs.
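The kind of completion described above might be sketched as follows. This is an illustrative approximation only and is not Netscape's actual code: the function name `complete_url` is hypothetical, and the rule that "www." and ".com" are added only when the server name contains no dot is an assumption made to keep the sketch simple. Server names that carry a port number are not handled.

```python
def complete_url(text):
    """Apply simple protocol and server-name completion to a typed URL."""
    text = text.strip()

    # Add "http" as the protocol if the user did not specify one.
    scheme, separator, rest = text.partition("://")
    if not separator:
        scheme, rest = "http", text

    # Split the server name from the remainder of the URL (if any).
    host, slash, path = rest.partition("/")

    # Assumption: a server name with no dot is treated as a bare name,
    # so "www." is added before it and ".com" after it.
    if "." not in host:
        host = "www." + host + ".com"

    return scheme + "://" + host + slash + path

print(complete_url("example"))
print(complete_url("http://www.example.org/index.html"))
```

Note that the sketch corrects only omissions in the protocol and server name; a misspelling anywhere else in the URL passes through unchanged, which is the limitation noted above.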