The Internet, and more particularly the World Wide Web as it is known today, was reportedly first created by Timothy Berners-Lee at the European Laboratory for Particle Physics. There, in 1989, he developed a precursor to the Mosaic browser and set forth the protocols and software that present distributed information in a hypertext format.
The new hypertext format permitted documents to reference other documents using links. The links permitted users to navigate through non-sequential information. The hypertext format was well received by the general population, in part because it simplified use of the Internet: users no longer needed to know arcane command-line protocols, such as File Transfer Protocol (FTP).
The World Wide Web, or simply the Web, was designed to consolidate many different types of information. Thus, the Web uses existing protocols such as Multipurpose Internet Mail Extensions (MIME) and Transmission Control Protocol/Internet Protocol (TCP/IP) to transfer data. To integrate this disparate data, the Web uses a combination of Uniform Resource Locators (URLs), the Hypertext Transfer Protocol (HTTP), the Hypertext Markup Language (HTML) and the Common Gateway Interface (CGI).
People commonly think of the Web in terms of URLs. A URL is essentially a filename that includes a server name. The URL can also include user name information in alphanumeric form and protocol-specific arguments and options. A URL can be broken down into the following parts:
<scheme>:<scheme-specific name>
where the <scheme> is the communications protocol or scheme being used (e.g., HTTP). The <scheme-specific name> varies in format depending on the scheme, but is commonly a word or trade name. For example, a URL for a corporation may read:
http://www.affinitypartners.com/rxdrugstores/help.html
The “http” denotes use of the HTTP protocol. The “www.affinitypartners.com” is the name of the server that houses the corporate information. The “/rxdrugstores/help.html” refers to a particular subdirectory and filename located, physically or logically, on the server. As the HTTP protocol has become the Web's default standard, most web browsers will assume that the HTTP protocol is being used unless an alternative protocol is explicitly specified. One version of HTTP is version 1.1, which is supported by all major browsers and web servers. A full description of HTTP is provided in IETF's RFC 2068.
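The breakdown of the example URL described above can be illustrated with a minimal sketch. It assumes Python and its standard urllib.parse module, neither of which is part of the original disclosure, and the variable names are chosen only for illustration.

```python
from urllib.parse import urlsplit

url = "http://www.affinitypartners.com/rxdrugstores/help.html"

# Generic form: <scheme>:<scheme-specific name>
scheme, _, scheme_specific = url.partition(":")

# For the HTTP scheme, the scheme-specific part splits further into
# a server name and a path (subdirectory and filename) on that server.
parts = urlsplit(url)
server = parts.hostname
path = parts.path

print(scheme)           # the protocol in use
print(scheme_specific)  # everything after the first colon
print(server)           # the server that houses the information
print(path)             # subdirectory and filename on the server
```

Here the scheme is "http", the server is "www.affinitypartners.com", and the path is "/rxdrugstores/help.html", matching the parts identified in the text.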
In general, only alphanumeric characters are used in URL addresses. Characters that have been reserved for other purposes or are unsafe to use directly include the following:
Reserved:                ; / ? : @ = &
Unsafe:                < > " # % { } | \ ^ ~ [ ] `
If unsafe or reserved characters are used in a URL, URL encoding is generally performed. URL encoding typically comprises the replacement of the offending character with a three-character sequence that comprises the “%” sign followed by the two hexadecimal digits that represent the character's code.
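The encoding step just described can be sketched as follows. This is an illustrative example only: it assumes Python, the function names percent_encode_char and percent_encode are hypothetical, and the choice to leave "-", "_" and "." unencoded alongside alphanumerics is an assumption made for the example. The standard library function urllib.parse.quote is used as a cross-check.

```python
import string
from urllib.parse import quote

# Assumption for this sketch: alphanumerics plus "-", "_" and "." pass through.
KEEP = set(string.ascii_letters + string.digits + "-_.")


def percent_encode_char(ch: str) -> str:
    """Return '%' plus two uppercase hex digits for each UTF-8 byte of ch."""
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))


def percent_encode(text: str) -> str:
    """Replace every character outside KEEP with its percent-encoded form."""
    return "".join(c if c in KEEP else percent_encode_char(c) for c in text)


filename = "help file#1.html"
encoded = percent_encode(filename)
print(encoded)

# Cross-check against the standard library for this particular input.
print(quote(filename, safe=""))
```

For the input shown, the space becomes "%20" and the "#" becomes "%23", so both calls print "help%20file%231.html".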
Regardless of the HTTP version being used, whether it is 1.1 or later, there is an inherent limitation on the number of usable scheme-specific names. The limitation on usable characters combined with the limited number of names that are easily remembered by the user has placed an artificial limitation on expansion of the Web.
The present invention recognizes this limitation and proposes a solution that is simple, is easily implemented, and permits the use of all possible scheme-specific name combinations.