This invention relates generally to the installation and updating of computer software products, and more particularly to the downloading of update data needed for updating a software product or components thereof.
Most popular software products nowadays constantly go through revisions to fix “bugs” or add new features and functionality. To that end, each revision of a software product or component may require the addition of new files and/or the replacement of existing files with files of newer versions. Once a vendor has isolated a software product problem and created a solution for the problem, the vendor will want to put that fix into an update and make the update widely available to its customers. Software vendors have a business incentive to distribute software updates to customers as quickly and with as little trouble as possible.
The Internet provides an important channel for customers to obtain the latest updates for software products. The explosive growth of Internet usage has created a common expectation by customers that software products and updates be provided online for downloading. It is also in the interest of software vendors to promote the use of the Internet to distribute updates, because it reduces their costs and allows customers to obtain the fix for an identified problem as soon as the fix is made available for downloading. The vendor sites on the Internet can be designed to make it very simple to discover and locate update files for an application. The technical aspects of file download have mostly disappeared from the user's view, and are now typically handled by the operating system.
In a conventional approach, a software vendor constructs a software update as a “package” for download. This package is typically a self-extracting executable file with the setup program and each of the product's updated files embedded and compressed to make the package smaller. The size of the package is generally the sum of the compressed sizes of each changed file, plus the size of the extraction code itself. Upon execution, the package extracts each of the contained files to a temporary location, then starts the setup program to install each file to a proper location in the system's directory. Files that are shipped in a compressed form are decompressed as they are installed. Any existing file of the same name in the same location would simply be overwritten by the replacement file.
Even though the Internet makes wide and quick distribution of software updates possible, the limited bandwidth of network transmission has caused problems. The sheer size of common software applications has caused the download sizes of updates to become unreasonably large. Usually a multitude of fixes for a variety of problems of a product will be grouped into an update. If a vendor updates a software product on a regular basis, the download size of the update package will continue to grow, because the vendor cannot omit files under the assumption that the user already has those files from earlier updates. Because the update package combines a number of whole files, it may be quite large even when the files are compressed. Sometimes, even on the fastest modem connections, several hours are needed to obtain the update for a single product.
The time-consuming aspect of the conventional downloading process is, of course, undesirable. In some cases, customers pay long-distance or connection time charges during these file downloads. Any reductions in connection time will reduce the direct monetary cost for these customers. The vendors typically also have some distinguishable costs relating to the sizes of downloads they provide, so reducing the sizes may give them direct monetary benefits as well. Reducing the sizes of downloads will increase their available network bandwidth, allowing them to serve more customers with existing network server equipment.
The long time it takes to download a large update also makes the downloading process more vulnerable to various network connection problems. There are a number of reasons why an Internet session might be disconnected prematurely, including telephone line noise, call-waiting signals, and unintentional commands. Some Internet service providers enforce a connection time limit, limiting the amount of time the user can be on-line in a single session. If the user is downloading a large file when the network connection is cut off, they may have to start over. Most common operating systems and file transfer protocols do not allow the file transfer to be resumed, so any interim progress would be lost, and the transfer would have to be restarted. The opportunities for failure are so numerous that many users find it nearly impossible to obtain the update online. If the size of an update package is too large, they may never be able to completely download it.
Another significant drawback of the conventional update downloading approach is that it can be fairly inefficient. Many downloaded files are actually never used for updating the software product. Larger software applications frequently have a wide variety of installation options, and very few customers will actually use all of these options. Some examples include spell checkers, document templates, and assistance features for the visually impaired. Another example of a common installation option relates to drivers for printers. Most users will need only one or two printer drivers out of a collection of hundreds. Since the vendor has no way of knowing in advance which options will be needed, it would normally include the fixes for all product options in the update package. At setup time, the setup program will recognize that certain files do not need to be installed, so some of the data that was downloaded will be discarded. Since some components, such as a spell checker, are shared with other products, it is possible that the customer will already have installed one or more of the updated files on the system. Again, some of what was downloaded (the spell checker, in this case) will be discarded.
More recently, vendors have begun to utilize binary patching techniques to update older versions of files into their new forms. The changes needed to modify an existing file into a new form are detailed in a “patch.” Usually, itemizing the changes needed to alter an existing file will take significantly less space than the entire new file would. Data compression techniques will frequently reduce executable files by a ratio of about 3:1, proportional to the original file size. In comparison, the latest file patching techniques achieve ratios more closely proportional to the size of the changed contents, and patching “compression” ratios between 10:1 and 100:1 are common.
To utilize patching for software updates, the vendor must be aware of which versions of files have already been distributed. Most patching tools will accept multiple “old” file versions as input, and produce a patch that is usable on any of those versions processed. The patch, however, cannot be used to convert a version that is not included in the input for generating the patch. The patch produced for multiple older versions will be larger than a patch prepared for only one of the older versions.
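The constraint just described can be illustrated with a minimal sketch. It assumes, purely for illustration, that a file version is identified by the SHA-256 digest of its contents and that a patch carries the set of digests of the old versions it was generated from. The names `digest`, `patch_applies`, and `source_digests` are hypothetical and are not taken from any particular patching tool.

```python
import hashlib


def digest(data: bytes) -> str:
    """Identify a file version by the SHA-256 digest of its contents (an assumption)."""
    return hashlib.sha256(data).hexdigest()


def patch_applies(existing: bytes, source_digests: set) -> bool:
    """Return True only if the existing file is one of the old versions
    that was supplied as input when the patch was generated."""
    return digest(existing) in source_digests


# Hypothetical old versions known to the vendor when the patch was built.
known_old_versions = [b"release 1.0 contents", b"release 1.1 contents"]
source_digests = {digest(v) for v in known_old_versions}
```

Under these assumptions, a file matching either known release passes the check, while any other version of the file (for example, an interim fix the vendor did not supply to the patch generator) is rejected before the patch is applied.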
A download package that exploits patching is typically an executable file with the setup program and a patch for each of the product's updated files embedded. The size of the update file to be downloaded is generally the sum of the sizes of each patch file plus the size of the extraction code itself. Upon execution on a customer's computer, the setup program reconstitutes each of the updated files by combining the existing files on the customer's computer with the corresponding patch data. The included setup program then installs each reconstituted file to the proper location in the system's directory structure. Patches, of course, cannot be used to update files that have not been previously shipped to the customer or somehow are not found on the customer's system, and the full copies (which may or may not be compressed) of such files have to be downloaded. An update package containing mostly patches and few or no complete files can potentially be significantly smaller than a package with mostly full files. A patch package may thus require considerably less time to download as compared to conventional update packages.
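The two package-size rules stated in the text (sum of compressed files plus extraction code, versus sum of patches plus extraction code) can be compared with a rough calculation. The sketch below uses the 3:1 compression ratio and the 10:1 to 100:1 patching ratios quoted earlier; the file sizes and the extractor size are invented for illustration only. Applying a patch ratio to the whole file size is a simplification, since patch size actually tracks the amount of changed content.

```python
def full_package_size(file_sizes, extractor_size, compression_ratio=3):
    """Sum of the compressed sizes of each changed file plus the extraction code."""
    return sum(size // compression_ratio for size in file_sizes) + extractor_size


def patch_package_size(file_sizes, extractor_size, patch_ratio):
    """Sum of the sizes of each patch plus the extraction code."""
    return sum(size // patch_ratio for size in file_sizes) + extractor_size


# Illustrative figures only: three changed files and a 100,000-byte extractor.
file_sizes = [3_000_000, 1_500_000, 600_000]
extractor_size = 100_000

full = full_package_size(file_sizes, extractor_size)
patch_low = patch_package_size(file_sizes, extractor_size, patch_ratio=10)
patch_high = patch_package_size(file_sizes, extractor_size, patch_ratio=100)
```

With these assumed figures the full package comes to 1,800,000 bytes, against 610,000 bytes at a 10:1 patching ratio and 151,000 bytes at 100:1. At the higher ratio the fixed extraction code accounts for most of the download.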
The patching download approach, however, still has many of the other deficiencies of a full download. Moreover, it introduces a few new, and much more serious, opportunities for failure. The additional risks in patching download come from the need to try to anticipate, at the time the package is constructed, which versions of which files will be present on a customer's system. If the vendor has shipped multiple versions, interim releases, test fixes, or previous update packages, then every revision of an existing file should be considered in preparing a patch. If the customer's system contains a version of a file that was overlooked by the vendor (for example, an update that was subsequently produced for another problem), the customer will discover only after downloading the package that one or more of the patches cannot be applied. If the vendor has not included any provision to deal with this scenario, the customer may end up running an untested combination of programs. For an operating system update, the user may not even be able to restart their machine to try another update. For many customers, this risk may outweigh any benefit of implementing the update.
Thus, supplying every prior revision of each file of the software product to the patch generator appears crucial to avoid the patch-mismatching problem. Careful tracking procedures can be used to make sure no revisions are omitted from the update package. The size benefit of a patch download, however, can dissipate quickly if the vendor attempts to include patching data for all earlier versions of the files of the software product. Each additional prior version supplied to the patch generator will cause the patch size to increase. For instance, an operating system may have thirty major service packs and a thousand minor updates supplied over its lifetime. The patching package may become so large that it would be better to ship the full file in compressed form instead, thus defeating the purpose of using binary patching in the first place.
The patch download approach, like the full-file download approach, is also not satisfactory in terms of efficiency and reliability. When a patch file contains change information for multiple revisions, it will be larger than it would be for any one of those revisions. The difference in size represents additional downloaded data that will be discarded. Patches for options that might not be installed and patches for shared files that might already be installed must be supplied. The downloading of a patching package is also subject to all the connection problems experienced by the full-file download approach. In short, a patching download has many of the deficiencies of a full download, except possibly the reduced download size. The added possibility of errors due to file-patch mismatch, however, may make this approach unacceptable to many users.
Thus, there is a great need for a more efficient and robust way to download update data for installing a revised software product.
In view of the foregoing, the present invention provides a method and system of downloading update data for installing a software product on a client computer that minimizes the amount of data to be downloaded by downloading only those files needed to update the client computer. In the beginning of the downloading process, the client computer obtains from a setup server an initial setup package, which includes a setup program and a list of files required for installing the software product on the client computer. The setup program running on the client computer determines whether some current or earlier versions of those files required for installation already exist on the client computer, and compiles a download request with a list of files needed for updating the client to provide the required installation files. The download request is automatically sent to a second server (which may be the same as the setup server) that stores a collection of update data, such as files and patches. The second server, in response to the request, prepares update files corresponding to the requested files and downloads them to the client. The downloaded files may or may not be exactly the requested files. Using the downloaded files, the setup program updates the existing files to create the set of installation files for the revised software product on the client computer. The revised software product is then installed on the client computer.
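The exchange summarized above can be sketched in a few lines. This is only an illustration under stated assumptions, not the claimed implementation: file versions are represented as plain strings, the required-file list and the installed-file inventory are dictionaries, and the function names `build_download_request` and `select_updates` and the shape of `available_patches` are all hypothetical.

```python
def build_download_request(required, installed):
    """Client side: list every required file whose installed version is
    missing or differs from the required one."""
    request = []
    for name, wanted in sorted(required.items()):
        have = installed.get(name)  # None when the file is absent
        if have != wanted:
            request.append({"file": name, "have": have, "want": wanted})
    return request


def select_updates(request, available_patches):
    """Server side: answer each entry with a patch when one exists for the
    client's exact (file, have, want) combination, else with the full file."""
    response = []
    for entry in request:
        key = (entry["file"], entry["have"], entry["want"])
        kind = "patch" if key in available_patches else "full"
        response.append((entry["file"], kind))
    return response


# Hypothetical data: the list from the initial setup package, the client's
# current files, and the patches the second server happens to store.
required = {"app.exe": "2.0", "spell.dll": "1.3", "help.chm": "1.0"}
installed = {"app.exe": "1.0", "spell.dll": "1.3"}
available_patches = {("app.exe", "1.0", "2.0")}

request = build_download_request(required, installed)
response = select_updates(request, available_patches)
```

In this example the up-to-date `spell.dll` is never requested, `app.exe` is answered with a patch because the server holds one for the client's exact version, and `help.chm`, which is absent on the client, is answered with the full file. This reflects the statement that the downloaded files may or may not be exactly the requested files.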
Additional features and advantages of the invention will be made apparent from the following detailed description of illustrative embodiments, which proceeds with reference to the accompanying figures.