An on-line information system typically includes one computer system (a server) that makes information available so that other computer systems (clients) can access the information. The server manages access to the information, which can be structured as a set of independent on-line services. The server and client communicate via messages conforming to a communication protocol and sent over a communication channel such as a computer network or through a dial-up connection.
Information sources managed by the server may include files, databases, and applications on the server system or on an external computer system. The information that the server provides may simply be stored on the server, may be converted from other formats manually or automatically, may be computed on the server in response to a client request, may be derived from data and applications on the server or other machines, or may be derived by any combination of these techniques.
The user of an on-line service typically uses a browser program executed on the client system to access the information managed by the on-line service. The browser enables the user to selectively view, search, download, print, edit, and/or file the information managed by the server. On-line services are available on the World Wide Web (WWW), which operates over the global Internet. The Internet interconnects a large number of otherwise unrelated computers or sites. Similar services are available on private networks called Intranets that may not be connected to the Internet, and through local area networks (LANs). The WWW and similar private architectures provide a "web" of interconnected document objects. On the WWW, these document objects are located at various sites on the global Internet. A more complete description of the WWW is provided in "The World-Wide Web," by T. Berners-Lee, R. Cailliau, A. Luotonen, H. F. Nielsen, and A. Secret, Communications of the ACM, 37 (8), pp. 76-82, August 1994, and in "World Wide Web: The Information Universe," by Berners-Lee, T., et al., in Electronic Networking: Research, Applications and Policy, Vol. 1, No. 2, Meckler, Westport, Conn., Spring 1992.
Among the types of document objects in an on-line service are documents and scripts. Documents that are published on the WWW are written in the Hypertext Markup Language (HTML). This language is described in HyperText Markup Language Specification--2.0, by T. Berners-Lee and D. Connolly, RFC 1866, proposed standard, November 1995, and in "World Wide Web & HTML," by Douglas C. McArthur, in Dr. Dobb's Journal, December 1994, pp. 18-20, 22, 24, 26 and 86. Many companies are also developing enhancements to HTML. HTML documents can be created using programs specifically designed for that purpose or by executing script files.
The HTML language is used for writing hypertext documents, which are more formally referred to as Standard Generalized Markup Language (SGML) documents that conform to a particular Document Type Definition (DTD). An HTML document includes a hierarchical set of markup elements; most elements have a start tag, followed by content, followed by an end tag. The content is a combination of text and nested markup elements. Tags, which are enclosed in angle brackets (`<` and `>`), indicate how the document is structured and how to display the document, as well as destinations and labels for hypertext links. There are tags for markup elements such as titles and headers, text attributes such as bold and italic, lists, paragraph boundaries, links to other documents or other parts of the same document, in-line graphic images, and many other features.
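As a minimal sketch of the element structure just described, the following Python uses the standard-library html.parser module to report the start tags, end tags, and text found in a small piece of markup. The class name TagLister, the helper list_events, and the sample string are illustrative assumptions and do not come from the original text. Note that html.parser reports tag and attribute names in lower case.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Collects (kind, value, attributes) tuples for tags and text (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag, {}))

    def handle_data(self, data):
        # Ignore whitespace-only runs between tags
        if data.strip():
            self.events.append(("text", data.strip(), {}))


def list_events(markup):
    """Parse markup and return the recorded events in document order."""
    parser = TagLister()
    parser.feed(markup)
    parser.close()
    return parser.events


# Illustrative markup: bold and italic text, a paragraph break, and a link
SAMPLE = (
    'Some words are <B>bold</B>, others are <I>italic</I>.'
    '<P>A link to the <A HREF="http://www.microsoft.com">Microsoft Corporation</A> home page.'
)

if __name__ == "__main__":
    for kind, value, attributes in list_events(SAMPLE):
        print(kind, value, attributes)
```

In this sample the P element is reported as a start tag with no matching end tag, which is consistent with the statement that most, but not all, elements have both a start tag and an end tag.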
The following lines of exemplary HTML briefly illustrate how the language is used:
Some words are <B>bold</B>, others are <I>italic</I>. Here we start a new paragraph.<P>Here's a link to the <A HREF="http://www.microsoft.com">Microsoft Corporation</A> home page.
This sample document is a hypertext document because it contains a hypertext link to another document in the element that includes "HREF=". The format of this link is described below. A hypertext document may also have a link to other parts of the same document. Linked documents may generally be located anywhere on the Internet. When a user is viewing the document using a web browser, the links are displayed as highlighted words or phrases. For example, using a web browser, the sample document above might be displayed on the user's screen as follows:
Some words are bold, others are italic. Here we start a new paragraph.
Here's a link to the Microsoft Corporation home page.
In a web browser, the link may be selected, for example, by clicking on the highlighted area with a mouse. Typically, the screen cursor changes when positioned on a hypertext link. Selecting a link will cause the associated document to be displayed. Thus, clicking on the highlighted text "Microsoft Corporation" would fetch and display the associated home page for that entity.
Similarly, the HTML language also provides a mechanism (the image or "IMG" element) that enables an HTML document to include an image that is stored as a separate file. When the end user views the HTML document, the included image is displayed as part of the document, at the point where the image element was positioned in the document. The following line of HTML briefly illustrates how the language is used to incorporate an image into an HTML document:
<IMG SRC="bigsailboat.gif">
The following line of HTML shows how the language provides a hyperlink from a displayed thumbnail image (littlesailboat.gif) to the original (full size) image (bigsailboat.gif):
<A HREF="bigsailboat.gif"><IMG SRC="littlesailboat.gif"></A>
When the user is viewing the Web page that includes the displayed thumbnail image using a web browser, the hyperlink connection from the thumbnail image to the original image is activated by the selection of the displayed thumbnail image. In the prior art, it has been necessary to manually create the hyperlink from a thumbnail image to the original full size image, so that the original image is retrieved and displayed when the user selects the thumbnail image.
Each document object in a web has an identifier called a Universal Resource Identifier (URI). These identifiers are described in more detail in T. Berners-Lee, "Universal Resource Identifiers in WWW: A Unifying Syntax for the Expression of Names and Addresses of Objects on the Network as used in the World-Wide Web," RFC 1630, CERN, June 1994; and T. Berners-Lee, L. Masinter, and M. McCahill, "Uniform Resource Locators (URL)," RFC 1738, CERN, Xerox PARC, University of Minnesota, December 1994. A URI allows any object on the Internet to be referred to by name or address, such as in a link in an HTML document as shown above. There are two types of URIs: a Uniform Resource Name (URN), and a Uniform Resource Locator (URL). A URN references an object by name within a given name space. The Internet community has not yet defined the syntax of URNs. A URL references an object by defining an access algorithm using network protocols. An example of a URL is "http://www.microsoft.com". A URL has the syntax "scheme://host:port/path/search" where
"scheme" identifies the access protocol (such as HTTP, FTP or GOPHER);
"host" is the Internet domain name of the machine that supports the protocol;
"port" is the transmission control protocol (TCP) port number of the appropriate server (if different from the default);
"path" is a scheme-specific identification of the object; and
"search" contains optional parameters for querying the content of the object.
URLs are also used by web servers and browsers on private computer systems, Intranets, or networks, and not just for the WWW.
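The "scheme://host:port/path/search" syntax can be illustrated with a short Python sketch based on the standard-library urllib.parse module. The function name url_components and the example.com URL are hypothetical, and the sketch assumes that the "search" part corresponds to the query string that follows a "?" in the URL.

```python
from urllib.parse import urlsplit


def url_components(url):
    """Split a URL into the five parts named in the text (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,    # access protocol, e.g. "http"
        "host": parts.hostname,    # Internet domain name of the machine
        "port": parts.port,        # None when the default port is implied
        "path": parts.path,        # scheme-specific identification of the object
        "search": parts.query,     # optional query parameters (assumed mapping)
    }


if __name__ == "__main__":
    print(url_components("http://www.microsoft.com"))
    print(url_components("http://www.example.com:8080/docs/boats.html?size=big"))
```

For "http://www.microsoft.com" the port is reported as None and the path and search parts are empty, because the URL names only a scheme and a host.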
There are generally two types of URLs that may be used in the hypertext link: absolute URLs, and relative URLs. An absolute URL includes a protocol identifier, a machine name, and an optional HTTP port number. A relative URL does not include a protocol identifier, machine name or port, and must be interpreted relative to some known absolute URL called the base URL. The base URL is used to determine the protocol identifier, machine name, optional port and base directory for a relative URL. For further discussion of URL format and usage, see the document "Uniform Resource Locators," Internet Request for Comments (RFC) 1738, by T. Berners-Lee, L. Masinter, M. McCahill, University of Minnesota, December 1994. For further discussions of relative URL format and usage, see "Relative Uniform Resource Locators," RFC 1808, by R. Fielding, University of California, Irvine, June 1995.
A hypertext link to an electronic document is specified by one of several HTML elements. One of the parameters of an HTML element for a hypertext link is the Uniform Resource Locator (URL) that serves as the identifier for the target of the link. An HTML document may have a base element defining an absolute URL that specifies the base URL for that document. If the document has no base element, then the absolute URL of the document is used as the base URL. The base element provides a base address for interpreting relative URLs when the document is read out of context.
For example, FIG. 9 shows text with a document URL 200, a base element 202, a hypertext link with an absolute URL 204, and a hypertext link with a relative URL 206, which is evaluated with respect to base URL 202 to produce a resulting URL 208. As an additional example, FIG. 10 shows text with a document URL 210, no base element, a hypertext link with an absolute URL 212, and a hypertext link with a relative URL 214, which is evaluated with respect to document URL 210 to produce a resulting URL 216.
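The two cases (a document with a base element and a document without one) can be sketched in Python with urllib.parse.urljoin from the standard library. The function name resolve_link and all example.com and example.org URLs are hypothetical; they are not the values shown in FIG. 9 or FIG. 10, which are not reproduced here. urljoin follows current URL resolution rules, which are assumed to agree with the behavior described in the text for simple cases such as these.

```python
from urllib.parse import urljoin


def resolve_link(document_url, link, base_href=None):
    """Resolve a hypertext link against the base URL of its document.

    If the document declares a base element, base_href is used as the base URL;
    otherwise the document's own absolute URL is used (hypothetical helper).
    """
    base_url = base_href if base_href is not None else document_url
    return urljoin(base_url, link)


if __name__ == "__main__":
    doc = "http://www.example.com/docs/index.html"
    # No base element: the relative URL is resolved against the document URL
    print(resolve_link(doc, "images/boat.gif"))
    # Base element present: the relative URL is resolved against the base element
    print(resolve_link(doc, "images/boat.gif", base_href="http://mirror.example.org/site/"))
    # Absolute URLs are returned unchanged
    print(resolve_link(doc, "http://www.microsoft.com"))
```

An absolute link is unaffected by the base URL, while a relative link inherits the protocol identifier, machine name, and base directory from whichever base applies.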
A site at which documents are made available to network users through a web server is called a Web site. A web server is a computer program that allows a computer on the network to make documents available to the rest of the WWW or to other computers on a private network. The documents are often hypertext documents in the HTML language, but may be other types of document objects as well, and may include images, audio, and/or video information. The information that is managed by the web server includes hypertext documents that are stored on the server (or on other computers) or are dynamically generated by scripts on the web server. Web servers have been implemented for several different computer platforms, including Sun Corporation's SPARC II workstation running the UNIX operating system, and personal computers with Intel Corporation's PENTIUM processor running Microsoft Corporation's MS-DOS operating system and/or the WINDOWS graphic operating environment.
A web server program typically maps document object names that are known to the client to file names on the server file system. This mapping may be arbitrarily complex, and any author or program that tries to access documents on the web server directly would need to understand this name mapping. A user (typically using a computer other than that used to execute the web server program) who wishes to access documents available on the network at a Web site runs a web browser. The combination of the web server and web browser communicating using an HTTP protocol over a computer network is referred to as a web architecture. The web browser program allows the user to retrieve and display documents from web servers. Some of the popular web browser programs are: NAVIGATOR browser from the Netscape Communications Corporation, of Mountain View, Calif.; MOSAIC browser from the National Center for Supercomputing Applications (NCSA); WINweb browser, from Microelectronics And Computer Technology Corporation of Austin, Tex.; and INTERNET EXPLORER from Microsoft Corporation of Redmond, Wash. Web browsers have been developed to run on different platforms, including personal computers with the Intel Corporation's PENTIUM processor running Microsoft Corporation's WINDOWS graphics environment, and Apple Corporation's MACINTOSH personal computers.
The web server and the web browser typically communicate using the Hypertext Transfer Protocol (HTTP) message protocol and the underlying transmission control protocol/Internet protocol (TCP/IP) data transport protocol of the Internet. HTTP is described in Hypertext Transfer Protocol--HTTP/1.0, by T. Berners-Lee, R. T. Fielding, H. Frystyk Nielsen, Internet Draft Document, Oct. 14, 1995, and is currently in the standardization process. In HTTP, the web browser establishes a connection to a web server and sends an HTTP request message to the server. In response to an HTTP request message, the web server checks for authorization, performs any requested action, and returns an HTTP response message containing an HTML document in accord with the requested action, or an error message. The returned HTML document may simply be a file stored on the web server, or may be created dynamically using a script called in response to the HTTP request message. For instance, to retrieve a document, a web browser may send an HTTP request message to the indicated web server, requesting a document by reference to the URL of the document. The web server then retrieves the document and returns it in an HTTP response message to the web browser. If the document has hypertext links, then the user may again select one of the links to request that a new document or object be retrieved and displayed.
Request messages in HTTP contain a "method name" indicating the type of action to be performed by the server, a URL indicating a target object (e.g., a document or script) on the web server, and other control information. Response messages contain a status line, server information, and possible data content.
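A rough sketch of these message parts in Python follows. It builds a bare HTTP/1.0 request line from a method name and a URL, and splits a response status line into its fields, without opening any network connection. The function names build_request and parse_status_line are hypothetical, and the sketch assumes the common HTTP/1.0 form in which the request line carries the path portion of the URL and no additional header fields are sent.

```python
from urllib.parse import urlsplit


def build_request(method, url):
    """Return a minimal HTTP/1.0 request message for the given URL (no headers)."""
    parts = urlsplit(url)
    target = parts.path or "/"          # an empty path means the root object
    if parts.query:
        target += "?" + parts.query
    # Request line, then a blank line that ends the (empty) header section
    return f"{method} {target} HTTP/1.0\r\n\r\n"


def parse_status_line(line):
    """Split a status line such as 'HTTP/1.0 200 OK' into version, code, and reason."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    print(repr(build_request("GET", "http://www.microsoft.com")))
    print(parse_status_line("HTTP/1.0 200 OK\r\n"))
```

A real browser would also send control information in header fields and would read the headers and data content that follow the status line; those parts are omitted here for brevity.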
Historically, Web pages have been primarily text based and images having a relatively large size were infrequently included in the pages. Also, the relatively high cost of a fast connection to a network, such as provided by a T-1 line and a router, has caused many users to employ relatively slow network connections for viewing Web pages. Until recently, most users employing relatively slow network connections, such as a modem over a Plain Old Telephone Service (POTS) line, have been able to download and view most Web pages within a reasonable period of time (less than a minute). However, an increasing number of Web pages include at least one image. Since images are a popular medium of expression in Web pages, the relatively large size of images as compared to text has resulted in an increase in the average download time for users that employ relatively slow network connections. Thus, a need has arisen for a solution that enables the user employing a relatively slow network connection to download primarily image-based Web pages within a reasonable period of time.
One solution to shortening the download time of a large Web page has been to reduce the amount of data employed to display each image in the page. It is well known in the art that the amount of data comprising an uncompressed image is proportional to the number of pixels in the displayed image and increases with the number of colors employed to render the image. Any reduction in the size of an image included in a Web page tends to cause a corresponding decrease in the amount of data that must be downloaded by the user for viewing the image. A thumbnail image created from an original (full size) image typically conveys sufficient information so that a person viewing the thumbnail image is aware of the content of the original image. Thus, Web pages that display thumbnail images instead of full size images download more quickly and still communicate the intended expression to the user.
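As a rough illustration of why fewer pixels mean less data, the following Python sketch estimates the uncompressed size of an image and the time needed to transfer it. The image dimensions, the 8 bits per pixel, and the 28,800 bit-per-second modem speed are assumptions chosen for the example. Image files on the Web are normally compressed, so actual sizes would differ.

```python
def uncompressed_bytes(width, height, bits_per_pixel):
    """Size of an uncompressed image: one fixed-size color value per pixel."""
    return width * height * bits_per_pixel // 8


def download_seconds(num_bytes, bits_per_second):
    """Idealized transfer time, ignoring protocol overhead and latency."""
    return num_bytes * 8 / bits_per_second


if __name__ == "__main__":
    full = uncompressed_bytes(640, 480, 8)     # assumed full size image
    thumb = uncompressed_bytes(160, 120, 8)    # assumed thumbnail, one quarter the width and height
    print(full, thumb, full // thumb)          # 307200 19200 16
    print(round(download_seconds(full, 28800), 1))   # 85.3
    print(round(download_seconds(thumb, 28800), 1))  # 5.3
```

Under these assumptions, reducing the width and height by a factor of four reduces the data by a factor of sixteen.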
As noted above, if a user is creating a new Web page or editing an existing Web page and wants to insert a thumbnail image to represent a larger image, it is necessary to first produce the thumbnail image, insert the thumbnail image into the Web page, and then manually create a hyperlink from the thumbnail image back to the original larger image, so that when the thumbnail image is selected by someone viewing the Web page, the full size original image will be retrieved and displayed. Clearly, it would be desirable to automatically produce a thumbnail image in a Web page from an original image, which is either previously disposed in an existing Web page or is intended for positioning in a new or existing Web page. Also, a hyperlink to the original image should automatically be created and associated with the thumbnail image in the Web page, so that when the user selects the thumbnail image, the hyperlink is activated and the original image is displayed.
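The manual steps described above can be pictured with a small Python sketch that computes thumbnail dimensions and emits the linking markup. This is only an illustration under stated assumptions and is not the method of the source: the function names thumbnail_size and thumbnail_link, and the 100-pixel limit, are hypothetical, and the sketch does not perform the image scaling itself.

```python
from html import escape


def thumbnail_size(width, height, max_side):
    """Scale (width, height) so the longer side is at most max_side, keeping the aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height            # already small enough
    return (
        max(1, round(width * max_side / longest)),
        max(1, round(height * max_side / longest)),
    )


def thumbnail_link(original_file, thumbnail_file):
    """Return markup that shows the thumbnail and links it to the original image."""
    return (
        f'<A HREF="{escape(original_file, quote=True)}">'
        f'<IMG SRC="{escape(thumbnail_file, quote=True)}"></A>'
    )


if __name__ == "__main__":
    print(thumbnail_size(640, 480, 100))   # (100, 75)
    print(thumbnail_link("bigsailboat.gif", "littlesailboat.gif"))
```

The generated markup has the same form as the thumbnail hyperlink example given earlier in the text, with the thumbnail image nested inside the anchor that targets the full size image.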