Web browsers are software applications that enable users to download and view files from the many servers that make up the distributed network known as the World Wide Web. Generally, files are requested by specifying a Uniform Resource Locator (URL), which includes a particular file reference and a particular location (domain and path) on the network.
State-of-the-art browsers provide file caching features that store recently downloaded files locally. As a user visits many different web pages or “surfs” the Web, hundreds of files may be cached. Some will be dynamic files containing Hypertext Markup Language (HTML) that represents a frequently updated web page. The majority of files on the Web, however, are static files, whose content does not change. Examples include files representing graphic images that may be referenced by HTML files, such as the banner advertisements that appear on various web pages. Cached files are stored locally under random filenames assigned by the browser, which maintains a mapping file to map each filename to a corresponding URL. If a URL that corresponds to a cached file is requested again by the browser, the file can be retrieved much faster from local storage without resorting to the network.
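The cache organization described above can be sketched as follows. This is a minimal illustration only, not the implementation of any particular browser: the class name `UrlCache`, its methods, and the use of an in-memory dictionary in place of the mapping file are all assumptions made for the example.

```python
import os
import tempfile
import uuid


class UrlCache:
    """Toy cache: files stored under random names, with a URL -> filename map."""

    def __init__(self, cache_dir=None):
        # Hypothetical layout: one flat directory of randomly named files.
        self.cache_dir = cache_dir or tempfile.mkdtemp()
        # Stands in for the browser's mapping file (kept in memory here).
        self.index = {}

    def store(self, url, content):
        # The filename is random: it says nothing about the URL or the content.
        filename = uuid.uuid4().hex
        with open(os.path.join(self.cache_dir, filename), "wb") as f:
            f.write(content)
        self.index[url] = filename
        return filename

    def lookup(self, url):
        filename = self.index.get(url)
        if filename is None:
            return None  # cache miss: the caller would go to the network
        with open(os.path.join(self.cache_dir, filename), "rb") as f:
            return f.read()
```

Because the cache is keyed by URL alone, identical content downloaded from two different URLs is stored twice under two unrelated filenames.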
URLs are an example of location-specific file referencing techniques, which are used in virtually all known file retrieval systems. Location-specific file referencing systems specify files by their location, rather than by their contents. Since location-specific file references provide no specific indication of file contents, file referencing techniques that employ them must verify file contents by inference. This imposes limitations on the efficiency and dependability of file retrieval systems.
In browser caching systems, for example, since files are specified by their location, a browser is incapable of determining whether a cached file is current without comparing the cached file, or at least the attributes of the cached file, to the network copy of the file. To ensure that a file is current, browsers typically perform some type of verification that the contents of a particular cached file are identical to the contents of the file currently existing at a particular location on the network. In known caching systems, verification is usually done by comparing the time stamp of a cached file with the time stamp of the server copy of the file. Because this comparison requires contacting the server even when the cached copy turns out to be current, it increases response times and network load in web browsers.
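The cost of time stamp verification can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a description of any real browser or HTTP library: `FakeServer`, `Entry`, and `fetch` are hypothetical names, and the server is modeled as an in-memory object that simply counts requests.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    content: bytes
    timestamp: float


class FakeServer:
    """Hypothetical stand-in for a web server; counts every request it receives."""

    def __init__(self):
        self.files = {}
        self.requests = 0

    def head(self, url):
        # Returns only the time stamp (an attribute), not the content.
        self.requests += 1
        return self.files[url].timestamp

    def get(self, url):
        self.requests += 1
        return self.files[url]


def fetch(url, cache, server):
    cached = cache.get(url)
    # Even a cache hit costs one request, because the cached copy can only be
    # trusted after its time stamp is compared with the server's.
    if cached is not None and cached.timestamp == server.head(url):
        return cached.content
    fresh = server.get(url)
    cache[url] = Entry(fresh.content, fresh.timestamp)
    return fresh.content
```

In this model a static file that never changes still generates one server request on every access, which is the overhead the paragraph above describes.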
Some efforts have been made to improve efficiencies of file caching systems, especially with regard to Internet browser applications. These efforts are exemplified in U.S. Pat. No. 5,864,837 to Maimone and U.S. Pat. No. 5,864,852 to Luotenen.
Maimone discloses methods and apparatus for verifying that cached copies of requested data objects are up-to-date using content-based signatures associated with cached and latest version (server) copies of requested data objects. The respective signatures are compared to determine whether the content of a cached copy of a data object is the same as the content of a server copy of that data object. Maimone suggests using checksums, message digests or hash functions to generate relatively unique numbers that define these signatures. Maimone's method of file verification requires the additional steps of generating content-based signatures at both the client and the server. Thus, the operation of both the server and the client must be modified to incorporate Maimone's technique.
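The signature comparison can be sketched in a few lines. This is an illustrative assumption, not Maimone's disclosed implementation: SHA-256 is chosen here simply as one example of a hash function, and the function names are hypothetical.

```python
import hashlib


def signature(data: bytes) -> str:
    # Content-based signature; SHA-256 is an assumed choice of hash function.
    return hashlib.sha256(data).hexdigest()


def cached_copy_is_current(cached: bytes, server_signature: str) -> bool:
    # The client computes a signature over its cached copy and compares it with
    # the signature the server computed over the latest version.
    return signature(cached) == server_signature
```

Both sides must compute a signature for the comparison to be possible, which is why the technique requires changes to the client and the server alike.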
Luotenen describes a proxy server caching mechanism that generates a fingerprint based on an input URL. The disclosed system provides a way of organizing a cache using small entries that contain enough information about the URL associated with each cache file that the actual cache files need not be opened to accomplish discriminatory cache cleanup.
Mogul and van Hoff, in their publication entitled “Duplicate Suppression in HTTP,” describe a technique for reducing the duplication of content (e.g., logos, backgrounds, bars, buttons, etc.) in retrieved HTML and text documents on the Web. Using the technique of Mogul and van Hoff, any response whose message digest is equivalent to the message digest of the requested resource may be substituted. A proxy may check its cache to see if a cached instance of the resource has the identified message digest and, if it does, return the cached resource to the client. To accomplish duplicate suppression, Mogul and van Hoff introduce an entity tag, “SubOK,” to be used in an HTTP “GET” request. The “SubOK” tag modifies a standard “GET” request such that a response whose message digest is the same as that specified in the “SubOK” field may be substituted for the resource specified by the “GET” command. Thus, the technique proposed by Mogul and van Hoff requires a transfer protocol that is a modified extension of a standard protocol and that must reside on both the client and the server. Moreover, their method requires the additional step of determining the message digest of the requested resource before substitution can occur. This additional step prevents the technique of Mogul and van Hoff from being backward compatible with existing software.
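The substitution behavior at a proxy can be sketched as follows. This is a simplified model only: it does not reproduce the wire format of the “SubOK” field, the `Proxy` class and its `sub_ok` parameter are hypothetical, the origin servers are modeled as a dictionary, and SHA-256 is an assumed digest algorithm.

```python
import hashlib


def digest(data: bytes) -> str:
    # Assumed digest algorithm for illustration.
    return hashlib.sha256(data).hexdigest()


class Proxy:
    def __init__(self, origin):
        self.origin = origin      # URL -> bytes; stands in for the network
        self.by_digest = {}       # digest -> cached response body
        self.origin_fetches = 0

    def get(self, url, sub_ok=None):
        # If the request names an acceptable digest and a cached body with that
        # digest exists, substitute it regardless of which URL it came from.
        if sub_ok is not None and sub_ok in self.by_digest:
            return self.by_digest[sub_ok]
        self.origin_fetches += 1
        body = self.origin[url]
        self.by_digest[digest(body)] = body
        return body
```

Note that the requester must already know the digest of the resource it wants before it can ask for a substitute, which corresponds to the additional step noted above.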
Despite past efforts to provide efficient caching systems, such as those of Maimone, Luotenen, and Mogul and van Hoff, known systems still suffer from many of the limitations imposed by the use of location-specific file references. It would therefore be desirable to provide file referencing systems that do not inherently possess the limitations imposed by location-specific file references. It would further be desirable to provide caching systems that overcome the aforementioned inefficiencies associated with location-specific file referencing systems.
Like known caching systems, conventional software installation systems also suffer from the aforementioned limitations imposed by location-specific file referencing systems. Software installation typically involves copying a large number of computer files from a source medium to a target medium. Some of the files on the source medium may already exist on the target medium from previous installations of earlier versions of the software. Thus, some files may be unnecessarily copied during the installation process. Moreover, some previously installed files on the target medium may have filenames that are identical to filenames on the source medium, but the respective contents of these files may be different. Installation of the new files may therefore compromise the function of existing software that depends on the overwritten file. Known systems address this problem by comparing time stamps or other readily available parameters. These circumstances lead to inefficiencies in known software installation systems. It would therefore be desirable to provide a system for installing new software files in a way that eliminates the potential for overwriting files that are necessary to the function of existing software.
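The two installation problems described above, redundant copies and same-name conflicts, can be made concrete with a small sketch. Comparing file contents directly is an assumption introduced here for illustration; it is not how the known systems described above operate, since they rely on time stamps. The function `plan_install` and the dictionary representation of the media are hypothetical.

```python
def plan_install(source, target):
    """Classify each source file by name and content.

    `source` and `target` map filenames to file contents (bytes) and stand in
    for the source and target media.
    """
    plan = {"copy": [], "skip": [], "conflict": []}
    for name, content in sorted(source.items()):
        if name not in target:
            plan["copy"].append(name)      # not yet installed
        elif target[name] == content:
            plan["skip"].append(name)      # identical file: copying is wasted work
        else:
            plan["conflict"].append(name)  # same name, different content
    return plan
```

A file listed under `conflict` is one where overwriting could break existing software that depends on the installed version; a name or time stamp comparison alone cannot reliably distinguish it from a file listed under `skip`.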
Another problem that characterizes known software installation systems is the necessity for a user to know, in advance of the installation procedure, which features of a software application he or she will require. If a user later desires features which were not installed, he or she must re-install the software application and select those desired features. This requirement stems from the location-specific characteristics of known installation systems. For example, under the “WINDOWS” operating system, when a software package has been installed on a user's computer, the computer's Program Manager or Start Menu is provided with links (or shortcuts) to various components of the software for the purpose of starting execution of the software when a user selects an icon. In addition, the operating system stores numerous internal links to Dynamic-Link Libraries (DLLs) and other ancillary files essential to the operation of the installed software. These file links are typically represented internally as conventional textual file references, i.e., file path and name. They are therefore location-specific. Because operation of the software is dependent on the stored links, and the stored links are location-specific, any change in the location of the software files will render the software non-functional. Since links are written to the operating system registry during software installation, a user must predict at installation which features of the software will be needed so that the installation program may write the appropriate links to the operating system registry and make local copies of all files that are necessary for these features. If at a later time the user wishes to use other features of the software, or remove certain features in order to free local storage space, it is usually necessary for the user to re-install the software so that the operating system registry is updated to reflect the location-specific links to the support files then installed.
It would therefore be desirable to provide a software installation system in which a user may have access to all features of a software application, without having to initially install all of the software files on local storage. In particular, it would be desirable to provide a software system in which a large-capacity (but perhaps slower) storage location such as a remote server is used to store all of the software application files and to provide for automatic caching of those files in local storage as necessary when a user desires selected features of the software. This would permit the user to have access to all features of the software without wasting local storage space on infrequently used files or files that are no longer used.