As portal Web sites, search engines providing links to Uniform Resource Locators (URLs), URL address windows, applications with links to URLs and the like have become more widespread, so have the chances for a number and variety of errors in URL input. For example, present browsers are capable of displaying and/or navigating to a URL when any one or more of the following occurs: (1) when a protocol or resource identification is prefixed to navigate to a URL, (2) when a user enters a URL and begins by inputting “go”, “find”, “?” or the like, (3) when the URL is requested incident to a standard file system, i.e., when a local computer resource, such as a file or desktop link, requests a lookup, (4) when a user requests one or more of his/her favorite URLs or favorite titles, (5) when a user requests one or more of his/her historical URLs, (6) as part of an intranet lookup and (7) when any other runtime service implicates URL navigation or searching. If any of these succeeds, the browser navigates to the corresponding URL. In brief, any time a user's or application's services implicate navigating to or searching on a URL, a URL is being specified, and that URL input may be mistaken.
As for the variety of errors: (1) an error may occur when inputting an intended URL and (2) a user may mistakenly input a first URL while a second URL is intended. Of the first kind, there are strictly protocol-type errors, wherein a user, for instance, fails to adhere to the “http://” prefix protocol common to all Web site URL inputs or the like by omitting periods, substituting semicolons, incorrectly positioning or spelling the prefix and/or the like. These types of errors occur in part because the common parts of URLs are difficult for some users to type and/or remember. It is easy to confuse a semicolon with a colon, for example, or to use a backslash instead of a forward slash.
In certain cases, present browsers already autocorrect certain types of mistakes. For example, with certain browsers, a rudimentary URL protocol check is performed after a user types a URL in the address bar. This “autocorrect” function, however, only fixes a protocol's name and punctuation. For example, with respect to mistakes in the prefix of a URL entry, there is a known syntax that can be autocorrected easily. For instance, “http;” can be autocorrected to “http:”, “htttp” can be autocorrected to “http” and “http:\\” can be autocorrected to “http://”. By providing a simple static database of known frequent URL typographical errors (typos), present browsers can correct some of the simple mistakes that many users make. Some of these error corrections can be applied before the URL is sent; others may apply only if the first attempt returns with a DNS failure.
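By way of illustration only, the static typo database described above might be sketched as a small lookup table applied to the start of the URL input. The table contents mirror the three examples given in the text; the names `PREFIX_FIXES` and `autocorrect_prefix`, and the host `example.com`, are assumptions introduced for this sketch and do not describe any particular browser's implementation.

```python
# Hypothetical static table of frequent prefix typos and their corrections.
# The entries are the three examples from the text; a real table would be larger.
PREFIX_FIXES = [
    ("http:\\\\", "http://"),  # two backslashes typed instead of two forward slashes
    ("htttp", "http"),         # doubled letter in the protocol name
    ("http;", "http:"),        # semicolon typed instead of a colon
]


def autocorrect_prefix(url: str) -> str:
    """Repeatedly replace a known typo at the start of the input until none applies."""
    changed = True
    while changed:
        changed = False
        for typo, fix in PREFIX_FIXES:
            if url.lower().startswith(typo):
                url = fix + url[len(typo):]
                changed = True
    return url


print(autocorrect_prefix("htttp;\\\\example.com"))
```

The loop reapplies the table so that an input containing several prefix typos at once is corrected in successive passes; no correction in the table produces another table entry, so the loop terminates.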
These autocorrections happen automatically without user intervention. Such presently existing autocorrect features may also check whether there is a pluggable protocol, such as “outlook”, which is not yet among the standard protocols, before correcting the URL protocol. The mechanism is reasonably simple. If there is no protocol tag, the browser tries the standard HyperText Transfer Protocol (HTTP). If there is a protocol resource tag, the browser does not send the URL to autosearch even if there are spaces in the URL input.
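The protocol tag rule just described can be sketched as follows. This is a minimal illustration under stated assumptions: the regular expression used to detect a protocol tag, the function names and the returned pair are all hypothetical. The simple pattern would also treat a host followed by a port number (for instance a name ending in “:8080”) as carrying a protocol tag, which a fuller implementation would have to rule out.

```python
import re

# Assumed definition of a "protocol tag": a letter followed by letters,
# digits, "+", "." or "-", then a colon, at the very start of the input.
PROTOCOL_TAG = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def has_protocol_tag(text: str) -> bool:
    return PROTOCOL_TAG.match(text) is not None


def prepare_input(text: str):
    """Return (url_to_try, may_go_to_autosearch) for a raw URL input."""
    if has_protocol_tag(text):
        # A tagged input is never handed to autosearch, even with spaces.
        return text, False
    # No tag: try the standard HTTP protocol first.
    return "http://" + text, True


print(prepare_input("www.example.com"))
print(prepare_input("outlook:inbox folder"))
```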
It is noted, however, that the autocorrect feature currently fixes only protocols. For example, if a colon or semicolon is detected, it may be assumed that the colon or semicolon is part of a potentially mistyped URL scheme. The portion adjacent to the colon or semicolon is compared with the known protocols to see whether it is a pluggable protocol before trying to correct it. Currently known protocols that are corrected in this manner include protocols relating to “http”, “ftp”, “file”, “gopher”, “mailto”, “news”, “nntp”, “telnet”, “wais”, “mk”, “https”, “local”, “shell”, “javascript”, “vbscript”, “about”, “snews” and “res”.
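The scheme check described above may be sketched as shown below. The list of known protocols is taken from the text. The set of pluggable protocols, the function name `correct_scheme` and, in particular, the use of fuzzy string matching from Python's `difflib` module to decide which known protocol a mistyped name was meant to be are assumptions made for this sketch; the text does not say how the comparison is performed.

```python
import difflib
import re

KNOWN_PROTOCOLS = [
    "http", "ftp", "file", "gopher", "mailto", "news", "nntp", "telnet", "wais",
    "mk", "https", "local", "shell", "javascript", "vbscript", "about", "snews", "res",
]
PLUGGABLE_PROTOCOLS = {"outlook"}  # hypothetical registry of pluggable protocols

# Letters followed by a colon or semicolon at the start of the input.
SCHEME_CANDIDATE = re.compile(r"^([A-Za-z]+)([:;])(.*)$", re.DOTALL)


def correct_scheme(text: str) -> str:
    match = SCHEME_CANDIDATE.match(text)
    if match is None:
        return text  # no colon or semicolon after a leading name: nothing to fix
    name, rest = match.group(1).lower(), match.group(3)
    if name in PLUGGABLE_PROTOCOLS:
        return text  # leave pluggable protocols untouched
    if name not in KNOWN_PROTOCOLS:
        # Assumption: pick the closest known protocol, if one is close enough.
        close = difflib.get_close_matches(name, KNOWN_PROTOCOLS, n=1, cutoff=0.8)
        if not close:
            return text
        name = close[0]
    return name + ":" + rest  # normalize the separator to a colon


print(correct_scheme("htttp;//example.com"))
```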
Present browsers can also use an autoscan function to try resolving the domain name service (DNS) name with a non-standard function. For instance, suffixes may be appended in the following order until an existing server is found: “.com”, “.org”, “.net” and “.edu”. It is noted that further top level domains might be added to such a scheme as well.
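The autoscan behavior can be illustrated with a short sketch that appends the suffixes in the stated order and stops at the first name for which a server is found. The function name `autoscan` and the `resolves` callback are assumptions; the callback stands in for whatever DNS lookup the browser performs, so that the sketch can be exercised without network access.

```python
from typing import Callable, Optional

# Suffix order as given in the text.
SUFFIXES = [".com", ".org", ".net", ".edu"]


def autoscan(name: str, resolves: Callable[[str], bool]) -> Optional[str]:
    """Return the first name + suffix for which `resolves` is true, else None."""
    for suffix in SUFFIXES:
        candidate = name + suffix
        if resolves(candidate):
            return candidate
    return None


# Usage with a stand-in resolver backed by a fixed set of existing servers.
existing = {"example.org", "example.net"}
print(autoscan("example", lambda host: host in existing))
```

Because the suffixes are tried in a fixed order, a name that exists under more than one top level domain always resolves to the earliest one in the list.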
Currently, if a user types a correct DNS name, but a wrong subdirectory or page path, e.g. www.microsoft.com/wrongpage, the user will receive one of multiple HTTP error codes. If a server, such as the microsoft.com server, is present to field the input, the server handles the wrong subdirectory error. If the server does not have an error handling page, however, which may be judged by the attached HyperText Markup Language (HTML) page size, current browsers generally bring up a static error page, such as HTML error resource page 15 of FIG. 1.
If the autocorrect and autoscan functions are traversed and the browser still does not have a resolution, the browser passes control and the URL to autosearch. The autosearch function first checks for the provider of autosearch, which can be customized by the end user in a customization page. If the default search provider is changed to a non-default provider, control is passed to the non-default provider. If the default autosearch provider has control, the provider checks which language is being utilized.
Assuming English, for instance, the provider then checks if the URL input has a period in the raw string without a space ‘ ’ in the raw string; if true, the default autosearch provider redirects to a name resolution provider (herein referred to as “NRP”), such as REALNAMES®, for resolution. If the name resolution provider can resolve the URL, navigation takes place directly to the site. If the name resolution provider cannot resolve the URL, this is the end of the process, i.e., the browser decides to call its built-in DNS error resource page, such as page 15 of FIG. 1. The NRP step is introduced to attempt multilingual domain name resolution. If the URL input does not meet the conditions of having a period without a space, the URL input is redirected to the default search engine, and a usual search on the URL input is conducted.
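The routing decision made by autosearch for English input, as described in the two preceding paragraphs, reduces to a provider check followed by a test on the raw string. The sketch below is illustrative only; the provider identifier and the returned labels are hypothetical names chosen for this example.

```python
DEFAULT_PROVIDER = "default"  # hypothetical identifier for the default autosearch provider


def autosearch_target(raw: str, provider: str = DEFAULT_PROVIDER) -> str:
    """Decide where autosearch sends a raw URL input (English case only)."""
    if provider != DEFAULT_PROVIDER:
        return "custom-provider"  # the user selected a non-default provider
    if "." in raw and " " not in raw:
        return "nrp"              # period without a space: try name resolution
    return "default-search"       # otherwise run an ordinary search


print(autosearch_target("wwww.example.com"))
print(autosearch_target("cheap flights"))
```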
FIG. 2 illustrates these presently implemented techniques for URL input handling in more detail. At 200, if it is determined there is a connection problem, then an error page, such as error page 15, is invoked and displayed at 205. If the connection is valid, then at 210, it is determined whether there is a valid DNS name for the URL input, a valid intranet location for the URL input, etc. If so, then at 215, navigation to, or searching on, the URL input is performed. If not, however, control is transferred to the default search provider, if the default search provider is specified as explained above. At 220, it is determined whether the URL input has a period without a space. If so, then the URL input is redirected to NRP 240 in order to check against known names. If not, then at 225, a switch is made to the international (cultural, geographical, language, etc.) market and at 230, it is determined whether the URL input is a name for the NRP vis-à-vis the international market. If so, then the URL input is redirected to NRP 240. If not, then at 235, a default search is performed by the default search provider.
On the NRP side, once the URL input is redirected to the NRP 240, a determination is made whether the name can be resolved at 245. If not, then the browser displays the DNS error page 15 at 205. If so, then there is valid URL input and at 250, the flow is redirected to the site directly.
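The flow of FIG. 2 may be summarized in the following sketch, in which each branch is annotated with the corresponding reference numeral. The `Environment` class, its callbacks and the returned labels are assumptions introduced so that the flow can be run in isolation; they stand in for the connection check, the DNS or intranet lookup, the international market check and the NRP lookup, none of which is specified in detail by the text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Environment:
    connected: bool
    dns_resolves: Callable[[str], bool]           # valid DNS name or intranet location
    is_international_name: Callable[[str], bool]  # name for the NRP in the international market
    nrp_resolves: Callable[[str], bool]           # NRP can resolve the name


def handle_url_input(raw: str, env: Environment) -> str:
    if not env.connected:                  # 200: connection problem
        return "error_page"                # 205
    if env.dns_resolves(raw):              # 210
        return "navigate"                  # 215
    if "." in raw and " " not in raw:      # 220: period without a space
        to_nrp = True
    else:
        to_nrp = env.is_international_name(raw)  # 225, 230
    if not to_nrp:
        return "default_search"            # 235
    if env.nrp_resolves(raw):              # 240, 245
        return "redirect_to_site"          # 250
    return "error_page"                    # 205


env = Environment(True, lambda u: False, lambda u: False, lambda u: False)
print(handle_url_input("wwww.example.com", env))
```

With the stand-in environment shown, an unresolvable input containing a period and no space travels to the NRP, fails to resolve there and ends at the error page.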
The current design, as described above, sends any URL with a period and without spaces to NRP 240 because NRP 240 can handle multilingual domain names, such as Chinese characters, in the domain name. However, it would be desirable to minimize the sending of names to NRP 240 that NRP 240 is unlikely to resolve because there is a roundtrip performance degradation associated therewith and NRP 240 is not always operationally reliable. Since the main value that NRP 240 adds is multilingual domain name resolution, it would be desirable to add a step of determining whether or not multilingual domain name resolution issues are present.
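One simple way such a determination could be made, offered purely as an assumption and not as the approach of any existing browser, is to test whether the URL input contains any character outside the ASCII range before sending it to the NRP. The function name below is hypothetical.

```python
def needs_multilingual_resolution(raw: str) -> bool:
    """Assumed heuristic: any non-ASCII character suggests a multilingual name."""
    return not raw.isascii()


print(needs_multilingual_resolution("example.com"))
print(needs_multilingual_resolution("例え.jp"))
```

Under this heuristic, a plain ASCII input with a period and no space would skip the NRP roundtrip altogether.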
It would be further desirable to add additional intelligence on the client side for URL correction beyond mere protocol correction, and to leverage one or more databases including dynamic database(s) of current information about existing URLs.
There is thus a need for a mechanism that may be used in connection with a URL input operation to decipher when an input error has occurred, beyond a mere protocol error. There is a further need for a mechanism that may be used in connection with a URL input operation to decipher when a URL has an error with a high degree of confidence. There is still further a need for a mechanism that navigates to the correct URL as if the URL input error had not occurred. There is still further a need for a mechanism that determines whether a URL input error has occurred in a URL input operation vis-à-vis a plurality of dynamic data stores or sources, such as a dynamically updated Web-oriented dictionary combined with a static URL dictionary store.