1. Field of the Invention
The present invention relates to data communication networks and to software applications suitable for use in such networks. More particularly, the present invention relates to an apparatus and method to facilitate incremental updating of program code.
2. The Background Art
As is known to those of ordinary skill in the art, the Java™ language is an object-oriented language developed by Sun Microsystems, Inc. that can be integrated into conventional Hypertext Markup Language (“HTML”) browsers, and which allows a document server to provide the browser with documents as well as with executable code. The executable code can be automatically loaded from the document server if the HTML browser determines that it does not have the appropriate code already resident on the user machine.
Typically, the executable code takes the form of application programs known as “applets” comprising “bytecodes” that are machine independent. These applets are then interpreted by operating-system-specific applet interpreters (virtual machines). For example, a current Internet/Web browser implementation using the Java™ language is the HotJava™ browser, also developed by Sun Microsystems, Inc.
The platform-independent nature of Java™ class files allows developers to write a single version of their applet or application, and then to deploy the applet or application on a wide variety of different hardware and operating systems. Moreover, the Java™ platform implements a very advanced security model. According to this security model, a user can run untrusted Java™ applets and applications and be certain that the integrity of his or her system and personal data is never compromised. For example, as is well known, a Java™ applet or application may be run in a “sandbox” that prevents it from causing any harm or from gaining access to private information stored on a user's system or local network.
As mentioned above, a common way of deploying Java™ applications across a network is by using Java™ applets. Applets are typically downloaded and executed by a Java™-enabled web browser, and make it possible to deploy Java™ software over the web with no installation needed by the user.
A Java™ program (either an applet or an application) is composed of a number of classes and interfaces. Unlike many programming languages, in which a program is compiled into machine-dependent, executable program code, Java™ classes are compiled into machine-independent bytecode class files. Each class contains code and data in a platform-independent format called the class file format. The computer system acting as the execution vehicle contains a program called a virtual machine, which is responsible for executing the code in Java™ classes. The virtual machine provides a level of abstraction between the machine independence of the bytecode classes and the machine-dependent instruction set of the underlying computer hardware. A “class loader” within the virtual machine is responsible for loading the bytecode class files as needed, and either an interpreter executes the bytecodes directly, or a “just-in-time” (“JIT”) compiler transforms the bytecodes into machine code, so that they can be executed by the processor. FIG. 1 is a block diagram illustrating a sample Java™ network environment comprising a client platform 102 coupled over a network 101 to a server 100 for the purpose of accessing Java™ class files for execution of a Java™ application or applet.
In FIG. 1, server 100 comprises Java™ development environment 104 for use in creating the Java™ class files for a given application. The Java™ development environment 104 provides a mechanism, such as an editor and an applet viewer, for generating class files and previewing applets. A set of Java™ core classes 103 comprises a library of Java™ classes that can be referenced by source files containing other Java™ classes. From Java™ development environment 104, one or more Java™ source files 105 are generated. Java™ source files 105 contain the programmer-readable class definitions, including data structures, method implementations and references to other classes. Java™ source files 105 are provided to Java™ compiler 106, which compiles Java™ source files 105 into compiled “class” files 107 that contain bytecodes executable by a Java™ virtual machine. Bytecode class files 107 are stored (e.g., in temporary or permanent storage) on server 100, and are available for download over network 101.
Client platform 102 contains a Java™ virtual machine (“JVM”) 111 which, through the use of available native operating system (O/S) calls 112, is able to execute bytecode class files and execute native O/S calls when necessary during execution.
Java™ class files are often identified in applet tags within an HTML (Hypertext Markup Language) document. A web server application 108 is executed on server 100 to respond to HTTP (Hypertext Transfer Protocol) requests that originate from a web client (also called a “web browser”) 113 on client 102 and that contain URLs (Uniform Resource Locators) of HTML documents, commonly referred to as “web pages.” When a browser application 113 executing on client platform 102 requests an HTML document, such as by forwarding URL 109 to web server 108, the browser automatically initiates the download of the class files 107 identified in the applet tag of the HTML document. Class files 107 can be downloaded from the server and loaded into virtual machine 111 individually as needed.
A Java™ archive (“JAR”) format (also known as a “jar” format) has been developed to group class files together into a single transportable package known as a JAR file. As is known to those of ordinary skill in the art, JAR files encapsulate Java™ classes using an archived, compressed format. A JAR file can be identified in an HTML document within an applet tag. When a browser application reads the HTML document and encounters the applet tag, the JAR file is downloaded to the client computer and decompressed. Thus, a group of class files (typically several dozen) may be downloaded from a server to a client in a single download transaction. After downloading and decompressing, the archived class files are available on the client system for individual loading as needed in accordance with standard class loading procedures. The archived class files remain subject to storage inefficiencies due to duplicated data between files, as well as to memory fragmentation due to the performance of separate memory allocations for each class file.
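As an illustration of the grouping described above, the following minimal sketch packs several class files into a single in-memory JAR and then lists the entries it contains, using the standard java.util.jar classes. The class name JarPackSketch, the helper methods pack and entryNames, and the entry names are hypothetical and are chosen only for this example; the byte arrays stand in for real class file contents.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

class JarPackSketch {
    /** Packs the given (name, content) pairs into one compressed JAR held in memory. */
    static byte[] pack(Map<String, byte[]> files) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                jar.putNextEntry(new JarEntry(file.getKey()));
                jar.write(file.getValue());
                jar.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns the entry names of a JAR in the order in which they are stored. */
    static List<String> entryNames(byte[] jarBytes) {
        List<String> names = new ArrayList<>();
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            JarEntry entry;
            while ((entry = jar.getNextJarEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("app/Main.class", new byte[] {1, 2, 3}); // placeholder bytes, not real bytecode
        files.put("app/Util.class", new byte[] {4, 5});
        byte[] jar = pack(files);
        System.out.println(entryNames(jar) + " packed into " + jar.length + " bytes");
    }
}
```

The whole array returned by pack is what would travel in a single download transaction; the individual entries only become available again once the archive is read back on the receiving side.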
The smallest distribution unit for a Java™-based application is a class file. A class file is a self-contained unit that describes all information about a single class or interface. As mentioned above, a Java™-based application may consist of hundreds of class files and a set of other resources, such as images, resource bundles, property files, and the like. Also as mentioned above, a JAR file is a standard and convenient method of packaging a Java™-based application. Conceptually, a JAR file is a compressed archive that contains a set of class files and other resource files. In addition, a JAR file contains a special directory, META-INF, which can be used to store meta-information about an application. For instance, as will be described in more detail later, the META-INF/MANIFEST.MF entry is a text file that can contain an attribute that describes the main class of an application.
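By way of example only, the main-class attribute of the manifest can be written and read back with the standard java.util.jar classes, as in the following minimal sketch. The class name ManifestSketch, the helper methods buildJar and readMainClass, and the value com.example.Main are hypothetical; Main-Class and Manifest-Version are the attribute names defined for JAR manifests.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

class ManifestSketch {
    /** Builds an in-memory JAR whose manifest names the given main class. */
    static byte[] buildJar(String mainClass) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version is set first; without it the main attributes are not written out.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.MAIN_CLASS, mainClass);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            // The constructor has already written the manifest entry; no other entries are added.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns the Main-Class attribute of the JAR, or null if there is no manifest or attribute. */
    static String readMainClass(byte[] jarBytes) {
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] jar = buildJar("com.example.Main"); // hypothetical main class name
        System.out.println("Main-Class: " + readMainClass(jar));
    }
}
```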
Packaging an application into one or more JAR files has several benefits, especially when downloading code via a data communication network such as the Internet. First, downloading a JAR file using a single HTTP request is vastly more efficient than downloading each individual entry in the JAR file by itself. Second, class look-up is much more efficient if all application resources are packaged in JAR files, since unnecessary network access can be prevented. Third, an application developer can ensure predictable performance. For example, if each class file is downloaded on demand, a broken network connection may cause an application to lose the ability to display an error message, since the error class may not have been downloaded. Finally, as is known to those of ordinary skill in the art, a JAR archive file is the smallest unit that supports code signing.
One disadvantage of using JAR files that is known to those of ordinary skill in the art is that updating an application will typically require large downloads, since the entire JAR file must be replaced. A bug fix or other improvement to an application might only require changes in a few classes (which would typically be on the order of kilobytes in size), but due to the currently known packaging of applications into JAR files, the user would be required to download a completely new JAR file (which would typically be on the order of megabytes in size).
Thus, what is needed is an apparatus and method that provide all the advantages of archive files such as JAR files, but that are capable of supporting incremental code updates, so that only the changes need to be transmitted to a user, instead of requiring that a completely new archive file be transmitted.
According to aspects of the present mechanism, an original archive file having one or more entries is created, where each entry in the original archive file is itself a file, and where each entry in the archive file may comprise any file type, including an archive file. The original archive file is transmitted to a client computer. Subsequently, a target archive file is created, wherein one or more of the entries in the target archive file are typically expected to be identical to one or more entries in the original archive file. Given the original archive file and the target archive file, a difference archive file is created. The difference archive file comprises an index file describing the changes between the original archive file and the target archive file, and also comprises a set of entries corresponding to the entries in the target archive file that are not contained in the original archive file. The difference archive file is transmitted to the client computer, instead of requiring that the entire target archive file be transmitted. At the client computer, the difference archive file is applied to the original archive file to produce a synthesized archive file, wherein the synthesized archive file is functionally identical to the target archive file, and wherein each entry in the synthesized archive file is identical to a corresponding entry in the target archive file.
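The creation and application of a difference archive described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: each archive is modeled as an in-memory map from entry name to entry bytes, the index is reduced to a hypothetical list of entry names to remove, and the names ArchiveDiffSketch, Diff, diff, apply, and sameEntries are invented for this example. The layout of the index file itself is not specified in this passage and is not modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ArchiveDiffSketch {
    /** Hypothetical difference archive: an index of removed names plus new or changed entries. */
    static final class Diff {
        final List<String> removed = new ArrayList<>();
        final Map<String, byte[]> added = new LinkedHashMap<>();
    }

    /** Collects the target entries that are absent from, or differ from, the original. */
    static Diff diff(Map<String, byte[]> original, Map<String, byte[]> target) {
        Diff d = new Diff();
        for (Map.Entry<String, byte[]> e : target.entrySet()) {
            byte[] old = original.get(e.getKey());
            if (old == null || !Arrays.equals(old, e.getValue())) {
                d.added.put(e.getKey(), e.getValue());
            }
        }
        for (String name : original.keySet()) {
            if (!target.containsKey(name)) {
                d.removed.add(name);
            }
        }
        return d;
    }

    /** Applies the difference to the original to synthesize an archive matching the target. */
    static Map<String, byte[]> apply(Map<String, byte[]> original, Diff d) {
        Map<String, byte[]> synthesized = new LinkedHashMap<>(original);
        for (String name : d.removed) {
            synthesized.remove(name);
        }
        synthesized.putAll(d.added);
        return synthesized;
    }

    /** True if both archives hold the same entry names with identical contents (order ignored). */
    static boolean sameEntries(Map<String, byte[]> a, Map<String, byte[]> b) {
        if (!a.keySet().equals(b.keySet())) {
            return false;
        }
        for (String name : a.keySet()) {
            if (!Arrays.equals(a.get(name), b.get(name))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, byte[]> original = new LinkedHashMap<>();
        original.put("app/Main.class", new byte[] {1, 2, 3});
        original.put("app/Util.class", new byte[] {4, 5});
        original.put("app/Old.class", new byte[] {6});

        Map<String, byte[]> target = new LinkedHashMap<>();
        target.put("app/Main.class", new byte[] {1, 2, 3}); // unchanged, not transmitted
        target.put("app/Util.class", new byte[] {4, 5, 9}); // modified
        target.put("app/New.class", new byte[] {7});        // newly added

        Diff d = diff(original, target);
        Map<String, byte[]> synthesized = apply(original, d);
        System.out.println("transmitted entries: " + d.added.keySet());
        System.out.println("removed entries: " + d.removed);
        System.out.println("matches target: " + sameEntries(synthesized, target));
    }
}
```

In this sketch only the modified and newly added entries travel in the difference archive, while the unchanged entry is reused from the original on the client side. Comparing entries byte for byte is one simple choice for detecting changes; a real difference archive would also have to carry the index as an actual file entry.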