This invention relates to communications networks, generally, and more particularly to a method of communication within such communication networks and apparatuses for practicing the method.
Many emerging environments in the field of computer technology are increasingly facing a problem in that the data and computational requirements of software applications easily outstrip the system resources. The problem is particularly acute in multimedia environments such as intranets, the Internet (a communications network) or the World Wide Web (Web) as well as in many data intensive applications such as online analytical processing (OLAP) and multimedia databases.
The Web is essentially a distributed depository of files. The files are stored on Web servers connected by the Internet. Users of the Web may request transfer of files to their own computers, i.e., Web clients, for viewing, storing or printing. Each server stores files identified by a unique electronic address known as a uniform resource locator (URL). A URL points to a particular server and identifies the location of a file on that server. Many of the files stored on Web servers are documents written in a standard markup language known as hypertext mark-up language (HTML). HTML files are translated for viewing, printing or storing by a Web browser (a computer program designed to display HTML files and communicate with Web servers). Using HTML, an author of such a file (Web page) can associate a hyperlink with a specific word, phrase or image in a document. While some files are stored on Web servers in HTML format, other files are available in non-standard formats which may not be translated for viewing, printing or storing by a standard Web browser. While some Web clients may be capable of translating such non-standard formats, others may not be so capable. Due to the vast heterogeneity of client resources, many files are available in formats that can only be translated by a subset of all clients while other files are stored in a plurality of formats to ensure translatability by most clients.
Users may access the World Wide Web in a variety of ways. Some users are fortunate enough to connect to the Internet with dedicated high-speed, high-bandwidth connections (e.g., T1, T3 or ISDN lines). However, many users, particularly users accessing the Internet from their homes, have only dial-up access to the Internet. In a dial-up arrangement, the user's computer typically has a modem for dialing and communicating with an Internet Service Provider to which the user subscribes. The Internet Service Provider typically maintains a proxy computer which is attached to the Internet via a dedicated communications line. The proxy intervenes between the Web server and Web client as described below.
Requests for file transfer, usually in the form of a GET URL HTTP request including a URL in the World Wide Web context, originate with the client and are forwarded to the proxy via the user's dial-up connection. The proxy then relays the request over the Internet to the appropriate Web server. The Web server responds by transmitting the requested file to the proxy. The proxy then relays that file to the client.
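By way of illustration only, the request that a client forwards to a proxy may be sketched as follows. The function name build_proxy_get and the example address are hypothetical and do not appear in the foregoing description; the sketch assumes an HTTP/1.0 request whose request line carries the full URL, as is customary when a request is addressed to a proxy rather than directly to a Web server.

```python
from urllib.parse import urlsplit


def build_proxy_get(url: str) -> bytes:
    """Build the bytes of a GET request as a client might send it to a proxy.

    Hypothetical helper for illustration; it only formats the request and
    performs no network communication.
    """
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.netloc:
        raise ValueError("expected an absolute http URL")
    # The request line names the complete URL so that the proxy can tell
    # which Web server the request must be relayed to.
    request = f"GET {url} HTTP/1.0\r\nHost: {parts.netloc}\r\n\r\n"
    return request.encode("ascii")


if __name__ == "__main__":
    print(build_proxy_get("http://www.example.com/index.html").decode("ascii"))
```

The proxy would parse the URL from the request line, relay the request to the named server, and return the server's reply to the client over the dial-up connection.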
The user typically accesses files stored on the Web using Web browser software running on a Web client connected to the Internet. Typically, this is achieved by the user's selection of a hyperlink (typically displayed by the Web browser as an image, or a bold word or phrase) within a document being viewed with the Web browser. Each hyperlink is associated with an electronic address which uniquely identifies the file associated with the hyperlink indicating the file's location on a Web server. The electronic address is in the form of a URL. A user's selection of a hyperlink acts as a user's request for transmission of the file associated with the hyperlink to the client. The Web browser then issues a hypertext transfer protocol (HTTP) request for the requested file to the Web server identified by the requested file's URL. In response, the designated Web server returns the requested document to the Web browser, also using HTTP, provided that the file identified in the URL is present at the server identified in the URL at the location identified in the URL.
The standard HTML syntax of Web pages and the standard HTTP communications protocol supported by the Web guarantee that a Web browser can communicate with any Web server. The Java programming language and Java applets provide platform independent application programs over the Internet and the World Wide Web which can be run on any Web client.
Web pages typically are predominantly graphical in nature. The graphical images comprising each Web page are generally much larger in size (bytes) than even lengthy simple text documents. Such large graphics files slow the response time for users of the Web. The delay is referred to herein as latency. Latency is primarily a function of the size of the file transmitted and the bandwidth of the connection over which the file is transmitted. As an ever increasing number of users, both individual and corporate in nature, use the Internet and the Web, response times have even further slowed.
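The dependence of latency on file size and bandwidth may be illustrated by the following minimal sketch. It assumes an idealized link in which transfer time is simply the file size divided by the bandwidth, ignoring propagation delay, protocol overhead and congestion. The function name and the figures used (a 100,000-byte image, a 28,800 bit-per-second modem) are illustrative assumptions and not measurements; the T1 rate of 1,544,000 bits per second is the nominal rate of such a line.

```python
def transfer_time_seconds(size_bytes: int, bandwidth_bits_per_second: float) -> float:
    """Idealized time to move a file across a single link (hypothetical helper)."""
    if bandwidth_bits_per_second <= 0:
        raise ValueError("bandwidth must be positive")
    return size_bytes * 8 / bandwidth_bits_per_second


# Illustrative figures only.
IMAGE_BYTES = 100_000
modem_seconds = transfer_time_seconds(IMAGE_BYTES, 28_800)
t1_seconds = transfer_time_seconds(IMAGE_BYTES, 1_544_000)

if __name__ == "__main__":
    print(f"dial-up modem: {modem_seconds:.1f} s, T1 line: {t1_seconds:.2f} s")
```

Under these assumptions the same image takes roughly fifty times longer to traverse the dial-up link than the dedicated line, which is why the size of the transmitted file matters most on the slowest link.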
FIG. 1 is a symbolic diagram of a simplified Web topology of the prior art. In the example of FIG. 1, the Web client 6 is the user's computer. The client may connect to the Internet Service Provider (not shown) over a communications line 10, using a modem in the client 6. The Internet Service Provider typically controls the proxy 16 which has a dedicated connection over a transmission link 20 to the Internet 26. The Internet 26 essentially is a sub-network of switching nodes and transmission links. A Web server 36 is connected to the Internet 26 by transmission link 30. In actuality, the Internet 26 is comprised of numerous servers, clients, proxies, transmission links, etc. between the proxy 16 and the Web server 36.
Using Web browser software running on the Web client 6, the user requests an image, document, multimedia or other file (herein referred to collectively as "file") by submitting a request in the form of a URL. In a typical Web topology, the URL is transmitted to the proxy 16, which then forwards it over transmission link 20 to the Web server 36 via the Internet 26 and transmission link 30. The Web server 36 responds to the request by transmitting the file via the Internet 26 to the proxy 16 which then forwards the file to the client 6 for viewing, storing or printing. In such a Web architecture, the link between the client and the proxy is typically the critical bottleneck, i.e., a low bandwidth connection relative to the bandwidth of network connections between the proxy and the server.
In an effort to reduce latency, Internet Service Providers frequently provide a memory cache on their proxy computers. Generally, the cache is capable of storing a file so that subsequent requests from the same or a different client for the same file may be fulfilled by the proxy without having to wait for transmission of the file from the server. A proxy 16 having a memory cache 18 is shown in FIG. 1. The cache stores a copy of a file requested by a client 6 and returned by the Web server 36, and the proxy also forwards a copy of the file to the client 6. As referred to above, this has helped in decreasing latencies, from the client's standpoint, between requesting a file and receiving the requested file. This is achieved by checking the cache 18 for the requested file at the proxy 16 each time a URL request is received from a client 6. In a typical arrangement, if a copy of the file is resident in the cache 18, the file is forwarded to the client 6 provided that the file in the cache 18 is not stale, i.e., the copy in the cache is current because the file has not been updated on the server. Methods for determining whether a cached file is current are well-known in the art. This is determined, in some cases, by comparing the date of the request to an expiration date associated with the file resident in the cache and presuming that the cached file is current if the date of the request is prior to the expiration date. Alternatively, the proxy 16 may query the server to determine whether the copy of the cached file is current, the server 36 responding with a brief message if the cache-resident file is current and with a current copy of the file if the copy in the cache is not current. If the requested file is not in the cache 18, the proxy 16 forwards the URL request to the server 36 over the Internet 26. When the proxy receives the file from the Web server, a copy of the file is stored in the cache 18.
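The cache check described above may be sketched, by way of example only, as follows. The class and callback names (CachedFile, CachingProxy, fetch, revalidate) are hypothetical; the two callbacks stand in for communication with the Web server 36. The sketch assumes one possible ordering of the two currency tests described above, namely that the expiration date is consulted first and the server is queried only when that test is inconclusive.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Optional


@dataclass
class CachedFile:
    body: bytes
    expires: Optional[datetime] = None  # None: no expiration date is known


class CachingProxy:
    """Hypothetical model of proxy 16 with cache 18."""

    def __init__(
        self,
        fetch: Callable[[str], CachedFile],
        revalidate: Callable[[str, CachedFile], Optional[CachedFile]],
    ) -> None:
        self.cache: Dict[str, CachedFile] = {}
        self.fetch = fetch            # retrieves a full copy from the server
        self.revalidate = revalidate  # returns None if the cached copy is current

    def handle_request(self, url: str, now: datetime) -> bytes:
        entry = self.cache.get(url)
        if entry is None:
            # Not in the cache: forward the request and keep a copy of the reply.
            entry = self.fetch(url)
            self.cache[url] = entry
        elif entry.expires is not None and now < entry.expires:
            # The request predates the expiration date: presume the copy is current.
            pass
        else:
            # Query the server; it supplies a replacement only if the copy is stale.
            newer = self.revalidate(url, entry)
            if newer is not None:
                entry = newer
                self.cache[url] = entry
        return entry.body
```

In this sketch a request that arrives before the expiration date never reaches the server, so its latency is governed solely by the link between the proxy and the client.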
Various caching algorithms are known in the prior art to determine which file in the cache to discard to make space for a recently requested file. Other caching algorithms are well-known in the art also.
Generally, a caching algorithm uses a metric representing the utility of caching a file to make such a determination. Typically, the utility metric is initialized when the file is retrieved for the first time and thereafter rises if the file is accessed. Otherwise, the metric falls with the passage of time. The cache-resident file with the least utility is usually discarded first to make space in the cache for a recently retrieved file. The difference between various caching algorithms is in their metrics for utility. In a well-known Least Recently Used ("LRU") algorithm, the utility of a file is proportional to the recency of its last usage. In a more recently proposed GreedyDual-Size algorithm, the utility of a file also depends upon its size (in bytes). See Cao and Irani, "Cost-Aware WWW Proxy Caching Algorithms", Technical Report CS-TR-97-1343, University of Wisconsin, Madison, May 1997. However, prior art caching methods have been only minimally helpful in abating the latency problem since one of the causes of latency is the low bandwidth link between the proxy and the client which cached files must traverse.
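A utility-based discard policy of this kind may be sketched as follows. The sketch follows the commonly described form of GreedyDual-Size, in which the utility of a file is an inflation value plus the file's retrieval cost divided by its size, and in which the inflation value is raised to the utility of each discarded file so that files not accessed again lose standing relative to newer ones. The class name, the method names and the default cost of 1.0 are assumptions made for illustration and are not taken from the cited report.

```python
from typing import Dict


class GreedyDualSizeCache:
    """Illustrative size-aware cache; tracks metadata only, not file bodies."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity = capacity_bytes
        self.used = 0
        self.inflation = 0.0  # utility of the most recently discarded file
        self.sizes: Dict[str, int] = {}
        self.utility: Dict[str, float] = {}

    def _score(self, size: int, cost: float) -> float:
        return self.inflation + cost / size

    def access(self, url: str, size: int, cost: float = 1.0) -> bool:
        """Record a request for a file; return True on a hit, False on a miss."""
        if size <= 0:
            raise ValueError("size must be positive")
        if url in self.sizes:
            # Hit: the utility rises back to the current inflation level plus cost/size.
            self.utility[url] = self._score(self.sizes[url], cost)
            return True
        if size > self.capacity:
            return False  # too large to be cached at all
        while self.used + size > self.capacity:
            # Discard the file with the least utility and age the rest.
            victim = min(self.utility, key=self.utility.get)
            self.inflation = self.utility.pop(victim)
            self.used -= self.sizes.pop(victim)
        self.sizes[url] = size
        self.utility[url] = self._score(size, cost)
        self.used += size
        return False
```

For example, with a 100-byte capacity and requests for a 10-byte file, an 80-byte file and then a 30-byte file, this sketch discards the 80-byte file, whereas an LRU policy would discard the 10-byte file because it was used least recently.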
As a result of latency, users are likely to welcome an option to retrieve a lower quality, smaller version of a file, provided that the file can be provided relatively quickly, i.e., with savings in latency. A current approach employed by many Web browsers enables a user to select, prior to a request for any particular file, to (1) receive all images at full resolution; or (2) reject all images and receive only text. In U.S. Pat. No. 5,764,235 to Hunt et al., a method is provided for receiving graphical images at a predetermined resolution as selected and preset by the user. This method requires selection of a resolution or version prior to a request for any particular file.
Many users would likely choose to reject some image or document files, receive a lower resolution version of others fairly quickly, and would be willing to wait longer for a higher resolution version of others. In addition, apart from latency issues, some users requesting a file would like to select a version of the file in a format that is translatable by their Web clients, due to the resources available to the user's Web clients. Since the various versions of a file are related in that they share some or similar content, it would be desirable to provide a single, logical point of access to a plurality of versions of a file.
Accordingly, it is an object of the present invention to provide a method of multiresolution allowing a user to receive a user-selected version of a file.
It is another object of the present invention to allow the user to select a version on a per-file basis for each file requested.
It is a further object of the present invention to provide a multiresolution engine that, when the user-selected version of the file is not readily accessible, derives that version from another readily accessible version whenever such derivation is possible.
It is yet another object of the present invention to provide a method allowing access to a plurality of versions of a file from a single logical point of access.
It is yet a further object of the present invention to provide a deriving computer having a cache which is multiresolution-aware.
It is yet a further object of the present invention to provide a method for systematic multiresolution which requires no modifications to standard Web browser software, Web servers, or common communications protocols.
It is yet a further object of the present invention to provide a communications network in which the server and proxy are multiresolution aware.
These and other objects are realized by the provision of an apparatus and method by which a user of a communications network may request and receive files over the network in various user-selectable formats and/or resolutions; i.e., versions. A user working at a client computer (e.g., a computer running client software) on the network can request the transmission of a file such as through the selection of a hyperlink on a web page currently being viewed. A user first determines information content the user wishes to receive. In accordance with the present invention, the user selects a logical link to such content and is presented with a menu of versions of files containing that information content. The user selects a version from the menu. The desired version is transmitted to the client in the usual fashion if the version is materialized (resident) on a server computer or intermediate proxy computer. If the desired version is not materialized, the desired version is automatically derived from an appropriate materialized version.
In a preferred embodiment, the client generates a menu of versions of the requested file which the user may choose to receive upon selection of a hyperlink serving as a single logical access point to all versions appearing as options on the menu. The available versions comprise all versions of the file that are resident at the server corresponding to the requested file or an intermediate proxy as well as versions of the file that can be automatically derived from the resident version(s).
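One way of assembling such a menu may be sketched as follows. The function name version_menu, the mapping derivable_from and the version labels are hypothetical and used for illustration only; the sketch assumes that derivability is expressed as a mapping from each version to the versions that can be derived directly from it, and that a version derived from a derived version is also offered.

```python
from typing import Dict, Iterable, List, Set


def version_menu(
    resident: Iterable[str],
    derivable_from: Dict[str, Set[str]],
) -> List[str]:
    """Return every resident version plus every version derivable from one."""
    available: Set[str] = set()
    frontier = list(resident)
    while frontier:
        version = frontier.pop()
        if version in available:
            continue  # already on the menu
        available.add(version)
        # Anything derivable from an available version is itself available.
        frontier.extend(derivable_from.get(version, ()))
    return sorted(available)


# Hypothetical derivation rules: each version maps to versions derivable from it.
RULES = {
    "gif-full": {"gif-half"},
    "gif-half": {"gif-quarter"},
    "postscript": {"text"},
}

if __name__ == "__main__":
    print(version_menu(["gif-full"], RULES))
```

With only the full-resolution image resident, the menu in this sketch lists the full, half and quarter resolution versions, while the text version is omitted because no resident version yields it.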
The multiresolution engine for selectively providing a user-selected version may be resident on a proxy rather than on the servers or clients. This embodiment minimizes the number of computers that require the specialized apparatus for providing multiresolution capabilities in accordance with the present invention.
Further, the proxy may comprise a cache for storing one or more versions of a file. The proxy also may comprise programs for determining which versions are derivable from versions stored in its cache so that it need not use up cache space storing versions that can be derived. If a requested version of a file is neither available nor derivable from a version available in the cache, then the proxy will retrieve an available version of the file from the corresponding server and either forward it to the client or derive the requested version from the retrieved version and forward the derived version to the client.
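The handling of a request by such a proxy may be sketched as follows. The class MultiresolutionProxy and the callbacks fetch and derive are hypothetical stand-ins for the server connection and the multiresolution engine respectively. The sketch assumes that derive returns None when the requested version cannot be derived from the given version, and it raises an error when no route to the requested version exists, a case the description above does not address.

```python
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[str, str]  # (URL, version label)


class MultiresolutionProxy:
    """Illustrative proxy whose cache holds only versions that were retrieved."""

    def __init__(
        self,
        fetch: Callable[[str], Tuple[str, bytes]],
        derive: Callable[[str, bytes, str], Optional[bytes]],
    ) -> None:
        self.cache: Dict[Key, bytes] = {}
        self.fetch = fetch    # returns (version label, body) available at the server
        self.derive = derive  # (source version, body, wanted version) -> body or None

    def handle_request(self, url: str, wanted: str) -> bytes:
        # 1. The requested version is already in the cache.
        body = self.cache.get((url, wanted))
        if body is not None:
            return body
        # 2. The requested version is derivable from a cached version.
        for (cached_url, version), cached_body in list(self.cache.items()):
            if cached_url == url:
                derived = self.derive(version, cached_body, wanted)
                if derived is not None:
                    return derived  # not stored: it can be derived again
        # 3. Retrieve an available version from the server.
        version, body = self.fetch(url)
        self.cache[(url, version)] = body
        if version == wanted:
            return body
        derived = self.derive(version, body, wanted)
        if derived is None:
            raise LookupError(f"{wanted!r} is neither available nor derivable")
        return derived
```

Because derived versions are never written to the cache in this sketch, cache space is spent only on versions that cannot be regenerated locally, at the expense of repeating the derivation on each request.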
In an advantageous embodiment of the invention, the proxy also may include programming for (1) determining whether the version(s) of a file resident in the cache are current, (2) running a caching algorithm for determining which files to delete from the cache when new files must be added and (3) transmitting to clients the program for generating the version selection menu.