Conventional data compression techniques use a compression engine that accepts one file as input and produces a compact version of that file as output. A corresponding decompression engine performs the inverse function, accepting the compact form as input and reconstructing the original file for output on the destination computer.
Differential compression works differently. It takes two files as input: a target file and a “basis” file, which is usually an older version of the target file. The compression engine determines the differences between the basis file and the target file and creates a compact “delta” file as output. On the destination computer, the decompression engine takes the existing basis file and the compact delta file as input and creates the target file as output. This is known as “applying the delta file to the basis file”. If the basis file and the target file are very similar, the delta file will be very small, generally much smaller than the file that results from simply compressing the target file conventionally. The size of the delta file depends on the number and extent of the differences between the basis file and the target file.
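The relationship between a basis file, a target file and a delta file can be illustrated with a minimal Python sketch. It is not the compression engine of any particular product: the function names `make_delta`, `apply_delta` and `literal_bytes`, the instruction format, and the use of the standard `difflib` module to find common byte runs are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def make_delta(basis: bytes, target: bytes) -> list:
    """Encode target as 'copy from basis' and 'insert literal' instructions."""
    ops = []
    matcher = SequenceMatcher(None, basis, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Bytes already present in the basis: store only offset and length.
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            # Bytes that exist only in the target: store them literally.
            ops.append(("insert", target[j1:j2]))
        # "delete": basis bytes absent from the target need no instruction.
    return ops


def apply_delta(basis: bytes, ops: list) -> bytes:
    """Apply the delta to the basis to synthesize the target."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += basis[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops: list) -> int:
    """Count the target bytes that the delta has to carry itself."""
    return sum(len(op[1]) for op in ops if op[0] == "insert")


# Hypothetical file contents: the target is a lightly edited basis.
basis = b"version=1\n" + b"payload line\n" * 50
target = b"version=2\n" + b"payload line\n" * 50 + b"new feature\n"

delta = make_delta(basis, target)
assert apply_delta(basis, delta) == target
print(len(target), literal_bytes(delta))
```

In this sketch the delta carries only the bytes that differ, plus a few copy instructions, which is why similar files yield a small delta. A practical engine would also serialize and compress the instruction stream.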
The goal of a content delivery scheme is to produce a particular set of target files at a consumer's computer. Throughout, the term “consumer” is used to refer to the consumer of the content, and does not imply any monetary transaction. A content delivery scheme may be used, for example, when a software vendor releases a new product or a software upgrade, or has determined new virus signatures, spam rules, advertisement blocking rules, etc. The term “computer” not only includes mainframes, servers and personal computers (e.g., desktop, laptop and notebook computers), but also other devices capable of processing data, such as PDAs (personal digital assistants), mobile telephones (e.g. smartphones), set-top boxes, gaming consoles, handheld gaming devices, and embedded computing devices (e.g. computing devices built into a car or ATM (automated teller machine)).
A content delivery solution involves delivery to the consumer's computer of files and information necessary to produce the target files at the consumer's computer. Delivery of the files by the content provider or a third party may be, for example, via network transmission or using a physical medium such as a diskette, a compact disk or other physical medium. The files may be any kind of file, whether data, code, a document, a spreadsheet, a drawing, music, or something else.
For example, if there are three target files FileA, FileB and FileC, one solution is to create a conventional archive containing a single copy (possibly compressed) of each of these files, deliver the archive to the consumer's computer, and produce the target files by extracting (and, if appropriate, decompressing) the contents of the archive at the consumer's computer. A non-exhaustive list of examples of conventional archives includes: WinZip® archives, “MICROSOFT®” CAB (cabinet) archives, TAR archives, GNU zip (GZIP) archives, bzip2 archives, RAR archives, and Java archives (JAR).
If one can assume the presence of an earlier version of each of these files at the consumer's computer, another solution is to create a delta archive containing the delta files that encode how each target file differs from its earlier version, deliver the delta archive to the consumer's computer, and produce the target files by extracting the contents of the archive and applying the delta files to the earlier versions to synthesize the target files at the consumer's computer.
Yet another possibility is to create an intra-package delta (IPD) package, as described in U.S. Patent Application Publication No. US 2005/0022175 to Sliger et al., published Jan. 27, 2005, which is incorporated herein by reference. For example, this IPD package may contain a compressed copy of FileA, a delta file Δ(A→B) that encodes how FileB differs from FileA, and another delta file Δ(A→C) that encodes how FileC differs from FileA. The solution is to create this IPD package, deliver it to the consumer's computer, and produce the target files at the consumer's computer by extracting and decompressing the compressed copy of FileA, extracting the delta file Δ(A→B) and applying it to FileA to synthesize FileB, and extracting the delta file Δ(A→C) and applying it to FileA to synthesize FileC. Since there is an internal delta dependency, FileA must be produced before either FileB or FileC can be produced. The order in which FileB and FileC are synthesized is not important in this example.
Many other solutions are also possible. For example, another solution is to create an IPD package that contains a compressed copy of FileB, a delta file Δ(B→A) that encodes how FileA differs from FileB, and the delta file Δ(A→C). This solution includes delivering the IPD package to the consumer's computer, and producing the target files at the consumer's computer by extracting and decompressing the compressed copy of FileB, extracting the delta file Δ(B→A) and applying it to FileB to synthesize FileA, and extracting the delta file Δ(A→C) and applying it to FileA to synthesize FileC. Due to the internal delta dependency, FileB must be produced first, then FileA and then FileC.
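The ordering constraints in the two IPD examples above amount to a topological ordering of the internal delta dependencies. The following Python sketch illustrates this under stated assumptions: the dictionary representation of a package and the function name `production_order` are hypothetical, and the standard `graphlib` module (Python 3.9 or later) stands in for whatever ordering logic an actual expansion tool uses.

```python
from graphlib import TopologicalSorter


def production_order(package: dict) -> list:
    """Return an order in which the target files can be produced.

    `package` maps each target file to the basis file its delta is applied
    to, or to None when the package stores the file whole (hypothetical
    representation, for illustration only).
    """
    graph = {
        target: ({basis} if basis is not None else set())
        for target, basis in package.items()
    }
    # static_order() yields every basis before the targets that depend on it
    # and raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())


# First example: compressed FileA, plus the deltas A->B and A->C.
first_ipd = {"FileA": None, "FileB": "FileA", "FileC": "FileA"}

# Second example: compressed FileB, plus the deltas B->A and A->C.
second_ipd = {"FileB": None, "FileA": "FileB", "FileC": "FileA"}

print(production_order(first_ipd))
print(production_order(second_ipd))
```

For the first package, FileA comes first and the relative order of FileB and FileC is unconstrained. For the second package, only one order is valid: FileB, then FileA, then FileC.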
Yet another solution is to create what can be referred to as an extra-package delta (XPD) package, which is described briefly in U.S. Patent Application Publication No. US 2005/0022175. An XPD package differs from an IPD package in that at least one of its target files is produced by applying a delta file in the package to a basis file that is external to the package. For example, if one can assume the presence of an earlier version of FileC at the consumer's computer, the XPD package may contain a compressed copy of FileA, a delta file Δ(C→B) that encodes how FileB differs from FileC, and a delta file Δ(C_old→C) that encodes how FileC differs from its earlier version. The solution is to create this XPD package, deliver it to the consumer's computer, and produce the target files at the consumer's computer by extracting and decompressing the compressed copy of FileA, extracting the delta file Δ(C_old→C) and applying it to the earlier version of FileC to synthesize FileC, and extracting the delta file Δ(C→B) and applying it to FileC to synthesize FileB. Due to the internal delta dependency, FileC must be produced before FileB. FileA may be produced at any time independent of the production of the other target files.
If one can assume the presence of an earlier version of FileC at the consumer's computer, a further solution is to create an XPD package that contains the delta file Δ(C_old→C), a delta file Δ(C→B) that encodes how FileB differs from FileC, and a delta file Δ(C_old→A) that encodes how FileA differs from the earlier version of FileC. The solution is to create this XPD package, deliver it to the consumer's computer, and produce the target files at the consumer's computer by extracting the delta file Δ(C_old→C) and applying it to the earlier version of FileC to synthesize FileC, extracting the delta file Δ(C→B) and applying it to FileC to synthesize FileB, and extracting the delta file Δ(C_old→A) and applying it to the earlier version of FileC to synthesize FileA. Due to the internal delta dependency, FileC must be produced before FileB. FileA may be produced at any time independent of the production of the other target files.
Although conventional archives, delta archives, IPD packages and XPD packages are all used in content delivery schemes, they differ in many respects. Some (conventional archives and IPD packages) include all the files needed to produce the target files (i.e. are self-contained), while others (XPD packages and delta archives) do not. Some (IPD packages and XPD packages) have internal delta dependencies, while others (conventional archives and delta archives) have no internal delta dependencies. Moreover, their formats, their authoring tools and the tools for expanding them are different.
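The two distinctions drawn above (self-contained or not, internal delta dependencies or not) can be expressed as simple predicates over a package's contents. The Python sketch below is only an illustration: the `Entry` record, its field names and the two predicate functions are hypothetical and do not correspond to any actual archive or package format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """One item in a package (hypothetical model, for illustration only)."""
    target: str
    basis: Optional[str] = None      # None: the file is stored whole
    basis_in_package: bool = False   # True: the basis is produced from this package


def is_self_contained(entries) -> bool:
    """True if no entry needs a basis file from outside the package."""
    return all(e.basis is None or e.basis_in_package for e in entries)


def has_internal_delta_dependency(entries) -> bool:
    """True if some delta is applied to a file produced from the same package."""
    return any(e.basis is not None and e.basis_in_package for e in entries)


schemes = {
    "conventional archive": [Entry("FileA"), Entry("FileB"), Entry("FileC")],
    "delta archive": [
        Entry("FileA", basis="earlier FileA"),
        Entry("FileB", basis="earlier FileB"),
        Entry("FileC", basis="earlier FileC"),
    ],
    "IPD package": [
        Entry("FileA"),
        Entry("FileB", basis="FileA", basis_in_package=True),
        Entry("FileC", basis="FileA", basis_in_package=True),
    ],
    "XPD package": [
        Entry("FileA"),
        Entry("FileC", basis="earlier FileC"),
        Entry("FileB", basis="FileC", basis_in_package=True),
    ],
}

for name, entries in schemes.items():
    print(name, is_self_contained(entries), has_internal_delta_dependency(entries))
```

The four sample packages mirror the FileA, FileB and FileC examples given earlier, and the predicates reproduce the grouping stated in the paragraph above.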
If using a conventional archive or a delta archive, the decision of which files to include in the archive for a given set of target files is trivial. If using an IPD package or an XPD package, the task of determining which delta files to create and which files to include in the package for a given set of target files is not trivial. U.S. Patent Application Publication No. US 2005/0022175 describes a method for determining which delta files to create in order to obtain the smallest IPD package.
When determining which content delivery solution to use, the content provider's options are limited by the content delivery scheme authoring and expansion tools that are available, the computational resources available to the content provider and the consumer, bandwidth and time-to-deploy considerations for the delivery of the files, and the restrictions of the particular archive or package format chosen.