The Internet is a wide area network that connects computer systems of local area networks and intranets all over the world. Some of the systems can generally be classified as server computers and client computers. The clients are mostly operated by end-users, and the servers provide various types of network services to the clients.
One type of service provided by server computers is the delivery of Web pages. Web pages are multimedia documents that can include textual, graphic, video, and audio content. Most Web pages are generated using the HyperText Mark-up Language (HTML), although the pages can include data encoded according to other formats, e.g., MPEG, JPEG, GIF, and so forth. The Web pages can be simple, that is, only black and white text, or the pages can be ornate with color, video, and synchronous audio, etc.
The most common way to access a Web page is by using a Web browser, for example, Netscape Navigator™ or Microsoft Internet Explorer™, or through some Internet service such as AOL. The Web pages are located by specifying their addresses. A Web page address is indicated by a Uniform Resource Locator (URL). The URL can either be specified directly, or by "clicking" on a "hot-link" in a previously retrieved page.
Typically, the pages are transferred from the servers to the clients using the HyperText Transfer Protocol (HTTP). HTTP is an application level protocol that is layered on top of the Internet protocol. In a TCP implementation, the Internet protocol is defined by the layers of the TCP/IP "stack."
Both in the Internet and in the intranets, the "effective" bandwidths of communication paths between servers and clients can vary greatly. The effective bandwidth depends on transmission rates, number of "hops," error rates, latencies, and so forth. Since servers and clients can be connected via a wide range of network technologies, the effective bandwidth can span at least six orders of magnitude. This means that a Web page that includes both text and graphic images designed for a high bandwidth path will be inappropriate for use by client computers connecting to servers over paths with much lower bandwidths.
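The effect of a six-order-of-magnitude bandwidth range can be illustrated with a small calculation. The following Python sketch is illustrative only: the page size, the two link rates, and the function name `transfer_seconds` are assumptions chosen for the example, and the model deliberately ignores hops, error rates, and protocol overhead.

```python
import math

def transfer_seconds(size_bytes, bandwidth_bps, latency_s=0.0):
    """Idealized transfer time: one latency term plus serialization time.

    A deliberate simplification; real paths also suffer from
    retransmissions, multiple hops, and protocol overhead.
    """
    return latency_s + (size_bytes * 8) / bandwidth_bps

# Hypothetical values, not taken from any measurement.
PAGE_BYTES = 500_000                      # a page with text and images
LINKS_BPS = {
    "slow path": 10_000,                  # 10 kbit/s
    "fast path": 10_000_000_000,          # 10 Gbit/s
}

# How many orders of magnitude separate the two paths.
orders_of_magnitude = math.log10(
    max(LINKS_BPS.values()) / min(LINKS_BPS.values())
)

for name, bps in LINKS_BPS.items():
    print(f"{name}: {transfer_seconds(PAGE_BYTES, bps):.4f} s")
print(f"bandwidth range: about {orders_of_magnitude:.0f} orders of magnitude")
```

Under these assumed numbers the same page takes minutes on the slow path and well under a millisecond on the fast one, which is why a single fixed page design cannot suit both.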
It is possible to manually design a simplified Web page for use by clients using low capacity communication paths, but such a page would be boring for users of clients connected via high bandwidth paths. For example, a content-rich Web page can include a "hot-link" to a less ornate "mirrored" page. The user can then decide, depending on the bandwidth of the network connection, which page to view. However, this requires the user to make an all-or-nothing decision. The user either sees a boring page, or a very complex one, rather than a page that is automatically optimized for whatever effective bandwidth is available.
In the prior art, several methods are known for statically adjusting the content of Web pages. The Netscape Navigator™ browser supports a special feature called the "lowsrc" tag. The "lowsrc" tag allows an HTML-coded Web page to specify the use of two separate codings for a given image. The browser initially loads a low-resolution version of the image, then automatically loads a high-resolution version to replace the low-resolution image. This means a low-resolution image is produced fairly quickly, assuming that the user doesn't stop the download or shift to another page. If the user waits long enough, then the high-resolution image is generated. As stated in "http://www.netscape.com/assist/net_sites/impact_docs/creating-high-impact-docs.html", this is done by utilizing the LOWSRC extension that is part of IMG: Netscape Navigator will load the image called "lowres.jpg" on its first layout pass through the document. Then, when the document and all of its images are fully loaded, Netscape Navigator will do a second pass through and load the image called "highres.gif" in place. This means that you can have a very low-resolution version of an image loaded initially; if the user stays on the page after the initial layout phase, a higher-resolution (and presumably bigger) version of the same image can "fade in" and replace it.
Using the "lowsrc" tag does not automatically avoid the time required to load a high-resolution image. In fact, it increases the total time because the client must first load a low-resolution image that is subsequently overwritten. Also, this method has no way to adapt other aspects of the page, or to adapt the page to anything other than either a low bandwidth path or a high bandwidth path.
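The cost of loading both versions can be made concrete with a rough calculation. The Python sketch below is only an illustration: the image sizes, the link rate, and the function name `lowsrc_timings` are hypothetical, and the model assumes the two images are fetched one after the other over a path of constant bandwidth.

```python
def lowsrc_timings(low_bytes, high_bytes, bandwidth_bps):
    """Compare loading only the high-resolution image with loading
    the low-resolution image first and the high-resolution one second.

    Assumes sequential transfers at a constant rate, with no latency
    or protocol overhead (a simplification for illustration).
    """
    low_s = low_bytes * 8 / bandwidth_bps
    high_s = high_bytes * 8 / bandwidth_bps
    return {
        "first_image_visible_s": low_s,       # when the user first sees something
        "high_only_total_s": high_s,          # total without the low-res pass
        "two_pass_total_s": low_s + high_s,   # total with both passes
    }

# Hypothetical example: 5 KB preview, 100 KB full image, 28.8 kbit/s path.
timings = lowsrc_timings(low_bytes=5_000, high_bytes=100_000,
                         bandwidth_bps=28_800)
for label, seconds in timings.items():
    print(f"{label}: {seconds:.1f}")
```

In this model the two-pass total always exceeds the high-resolution-only total by exactly the time spent on the low-resolution image; the only gain is that something becomes visible sooner.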
In another method, as stated in "http://hawk.fab2.albany.edu/delong/shadow/shadow.htm," the Netscape™ "lowsrc" tag is combined with a "shadow" page. In this method, a low-resolution file is displayed initially, then the high-resolution file is gradually painted over the top of it, enabling users on slow connections to see the basic image quickly, or wait and see the full image. The user can interrupt the downloading of a "pure" page to switch to downloading the shadow page. This is only a minor improvement over the original Netscape™ "lowsrc" tag, and generally requires an educated and somewhat agile user.
In another method, a proxy server is used. A proxy server is a relay computer system that is located somewhere on the network path between the server and client computers. Normally, proxy computers have high bandwidth connections to servers and low bandwidth connections to clients. The proxy converts high-resolution images to low-resolution images while the Web page is relayed from a server to a client computer. Because the low bandwidth path is the one between the proxy and the ultimate client, this can improve performance.
However, this method is not automatic. The user must explicitly notify the proxy whether to receive a low or high resolution image. Presumably, the user bases this decision on past experience. In addition, the method applies to intermediate systems in the Web rather than directly to the sources of the content, i.e., a server. Consequently, any transformation cannot be based on a semantic understanding of the content. That is, the method does not provide an optimal choice of source material, but only a simplistic degradation of the content. Furthermore, the benefits of the method are lost when the bandwidth of the network path between a server and the proxy is low or variable.
None of the above methods use information about network characteristics, such as bandwidth, error rates, and latencies, to make automatic coding or content decisions at the source; they all depend on some explicit user input.
Steven McCanne, in his Ph.D. dissertation (McCanne, S., Scalable Compression and Transmission of Internet Multicast Video, Ph.D. thesis, University of California Berkeley, UCB/CSD-96-928, December 1996), addresses the problem of adapting real-time video transmissions in a multicast network. Multicast is a totally different environment from the point-to-point connections of the World Wide Web. The "Related Work" section of McCanne's dissertation describes a number of previous approaches to the problem of real-time adjustment of audio and video transmissions to variable network conditions.
While some of the prior art methods use some information about network conditions, such as measured bandwidth, queue lengths, or packet loss rates, to adjust the nature of a real-time data stream, none of those methods contemplate adjusting the content at the source.
Mogul et al., in "Potential benefits of delta encoding and data compression for HTTP," in Proc. SIGCOMM '97 Conference, pages 181-194, ACM SIGCOMM, Cannes, France, September, 1997, discuss the notion that a Web server can choose to compress certain Web data based on bandwidth. However, changing the method of transmission of data (compressed or uncompressed) does not change the actual content received at the destination in response to actual network conditions.
In another paper by Mogul, "Operating Systems Support for Busy Internet Servers," Proc. Fifth Workshop on Hot Topics in Operating Systems (HotOS-V), pages addendum, IEEE TCOS, Orcas Island, Wash., May, 1995, the notion is discussed that a server can include a "hint" in a Web page sent to a client only when the bandwidth is sufficient. The hint can be useful-but-not-necessary meta-information.
More recent prior art is disclosed by Seshan et al. in "SPAND: Shared Passive Network Performance Discovery," Proc. USENIX Symposium on Internet Technologies and Systems, Monterey, Calif., December, 1997, pages 135-146.
In SPAND, a group of geographically co-located client computers measure the cost of retrieving information from remote servers. The cost measurements are reported to a shared database local to the clients. When one of the clients next decides to retrieve something from a remote server, a query is made of the shared database to obtain an estimate of the cost of retrieval. The client uses the information in the database to adjust its retrieval request. The SPAND system can also use a passive network monitor, co-located with the group of clients, to make network performance observations and update the local database.
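The report-then-query pattern described above can be sketched in a few lines of Python. This is not SPAND's actual design or interface: the class `SharedPerformanceDB`, its method names, and the use of a simple mean of observed throughput are all assumptions made for illustration.

```python
from collections import defaultdict

class SharedPerformanceDB:
    """Hypothetical stand-in for a database shared by a group of clients."""

    def __init__(self):
        # server name -> list of observed throughputs in bits per second
        self._observations = defaultdict(list)

    def report(self, server, size_bytes, seconds):
        """Record the observed cost of one completed retrieval."""
        self._observations[server].append(size_bytes * 8 / seconds)

    def estimate_bps(self, server):
        """Return the mean observed throughput, or None if no client
        in the group has retrieved anything from this server yet."""
        samples = self._observations.get(server)
        if not samples:
            return None
        return sum(samples) / len(samples)

    def estimate_seconds(self, server, size_bytes):
        """Estimate the retrieval time for an object of the given size."""
        bps = self.estimate_bps(server)
        return None if bps is None else size_bytes * 8 / bps

db = SharedPerformanceDB()
print(db.estimate_bps("server.example"))          # no observations yet
db.report("server.example", size_bytes=1_000, seconds=1.0)
db.report("server.example", size_bytes=3_000, seconds=1.0)
print(db.estimate_seconds("server.example", size_bytes=2_000))
```

Note that the first query returns no estimate at all: in a scheme of this kind, some client must complete a retrieval before any prediction is possible.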
With SPAND there is no automatic adaptation to network conditions, and SPAND does not envision a choice of source material. In SPAND, all decisions are made at the client. Clients are usually not aware of the full range of available source material at a server, and usually do not have full control over the generation of the source material. As a consequence, SPAND has minimal ability to adapt the source material. In addition, SPAND requires at least one client in a group to make at least one full retrieval before any predictions about network bandwidths can be made.
In addition, SPAND requires a shared database, local to each group of clients. This shared database is problematic for a number of reasons. The shared database requires installation and administration. The clients have to be configured to access the database. The need to contact the shared database increases network costs and latencies experienced by the clients. It is unclear whether the shared database server will scale up to large numbers of clients, and whether it would represent a potential availability problem.