1. Field of Invention
The present invention relates generally to accessing compressed data, and specifically to allowing access to updates to data stored in a compressed file, without requiring that the entire compressed file be updated.
2. Background of Invention
Compressed HTML has become an important standard format for help files for software applications. Software developers use HTML for help files because of the ease with which help information can be written in HTML by non-programmers. However, because today's software programs are often very complex, the amount of help information required can be so extensive that distributing all of the help files in uncompressed format would be impractical.
In order to use compressed HTML for help files, numerous, separate HTML files, each one typically associated with a single help topic, are compressed and compiled into a single file in Compressed HTML (“CHM”) format. The result is a reasonably sized file that includes information on a plurality of help topics. When a user requests help on a desired topic, an HTML help engine (for example, the Microsoft® HTML Help Engine) locates the relevant compressed information in the CHM file and uncompresses it into HTML for display by the user's browser.
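By way of illustration only, the compile-and-lookup flow described above can be sketched as follows. This is a simplified stand-in and not the actual CHM format or the HTML help engine: it uses Python's `zlib` module in place of the CHM compression scheme, and the names `compile_topics`, `lookup_topic`, and the sample topic file names are hypothetical.

```python
import zlib


def compile_topics(topics):
    """Concatenate all topic pages, compress them as a single stream,
    and return the compressed blob plus an index of (offset, length).

    Assumption: a toy layout for illustration, not the real CHM layout.
    """
    index = {}
    parts = []
    offset = 0
    for name, html in topics.items():
        data = html.encode("utf-8")
        index[name] = (offset, len(data))
        parts.append(data)
        offset += len(data)
    return zlib.compress(b"".join(parts)), index


def lookup_topic(blob, index, name):
    """Locate one topic in the compressed blob and return it as HTML text."""
    offset, length = index[name]
    raw = zlib.decompress(blob)  # the toy engine decompresses the whole stream
    return raw[offset:offset + length].decode("utf-8")


# Hypothetical help topics, one HTML page per topic.
topics = {
    "printing.html": "<h1>Printing</h1><p>How to print a document.</p>",
    "saving.html": "<h1>Saving</h1><p>How to save a document.</p>",
}
blob, index = compile_topics(topics)
page = lookup_topic(blob, index, "saving.html")
```

In this sketch the topics share one compressed stream, so the only way to change a single page is to call `compile_topics` again on the full set of topics.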
For these reasons, the CHM standard is useful for distributing and accessing help data, as well as for large amounts of information on other topics. However, such information often needs to be updated, and in this regard the CHM standard has a substantial shortcoming. While a CHM file typically includes compressed HTML files for a large number of different topics, it is often desirable to update the information for only a small number of topics, or even for a single topic. Because a CHM file is compressed, in order to update even a single topic file within the compressed file, a new CHM file must be compiled that includes not only the updated topic, but all of the unchanged topics as well.
For example, suppose a CHM help file for a commercial software program is five megabytes in size. If the developer wished to update a single topic file only, the updated topic file would likely be very small, perhaps only ten kilobytes. Yet, the developer would have to compile a new CHM file that includes the new topic file, as well as all of the unchanged topic files. Because of the way compression algorithms operate, the new CHM file would be different all the way through. Thus, the developer would have to distribute a new five megabyte file merely to update a single ten kilobyte topic file. If the application is used by thousands or perhaps millions of users (which is common for very popular applications), then even the online distribution of the updated CHM file would be difficult and costly, requiring significant online bandwidth and time. Clearly, this limits developers to updating CHM files only in significant product releases.
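The distribution cost in this example can be made concrete with a short calculation. The figures for the file sizes come from the example above; the user count of one million and the use of decimal units (1 megabyte = 1,000,000 bytes) are assumptions chosen only for illustration.

```python
# Sizes taken from the example above, in decimal units (an assumption).
FULL_CHM_BYTES = 5 * 1000 * 1000   # five-megabyte CHM file
TOPIC_BYTES = 10 * 1000            # ten-kilobyte updated topic file

# Assumed user count, within the "thousands or perhaps millions" range.
USERS = 1_000_000


def total_transfer(bytes_per_user, users):
    """Total bytes that must be served if every user downloads the update."""
    return bytes_per_user * users


full_total = total_transfer(FULL_CHM_BYTES, USERS)   # whole CHM file per user
topic_total = total_transfer(TOPIC_BYTES, USERS)     # only the changed topic
ratio = full_total // topic_total

print(f"Full CHM file:   {full_total / 1e12:.0f} TB")
print(f"Topic file only: {topic_total / 1e9:.0f} GB")
print(f"Overhead factor: {ratio}x")
```

Under these assumptions, redistributing the whole CHM file moves 5 terabytes where 10 gigabytes of changed content would suffice, a factor of 500.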
The problem with updating CHM files results from the design and operation of the HTML help engine. The HTML help engine is not under the control of the application developer (for example, the Microsoft® HTML Help Engine is part of the Windows 2000 operating system provided by Microsoft Corp.). When a developer uses the CHM standard to distribute help topic files, the developer is relying on a third-party HTML help engine to locate, in the CHM file, the topic file associated with the help topic requested by the user. Because of its internal configuration and design, this third-party HTML help engine attempts to locate the topic file only in the CHM file. The developer cannot reprogram the help engine to check elsewhere for an updated HTML file.
It will be apparent to one of ordinary skill in the relevant art that this problem applies not only to CHM files used for providing online help to software users, but can apply in general to compressed files that include multiple topic files.
What is needed is an approach that allows access to the most recent version of information on a topic stored in a compressed file accessed by a third-party retrieval engine, without requiring that the entire compressed file be updated.