The present invention relates generally to computer software, and specifically to object-oriented computer applications.
In transmitting applets and applications over the Internet and other low-bandwidth networks, download time can be a crucial factor. When such software takes too long to download, the user receiving the software will at best be dissatisfied, and may abandon the matter entirely. Common applets and applications, however, typically use many class files, along with other resources, such as image and audio files. In the Web browsing environment, each file to be transferred requires its own Hypertext Transfer Protocol (HTTP) transaction. In order to reduce the time needed to download an applet or application from a server to a client, it is important to limit both the number of files transferred and the total volume of data in the files.
For Java™ applets and applications, the Java Archive (JAR) provides a platform-independent file format that aggregates many files into one and can thus be used to reduce the number of HTTP transactions required for download. Multiple applets, including both their requisite class files and other resources, can be bundled into a single JAR file. The JAR file can then be compressed in order to reduce the data volume to be downloaded. At the client end, Web browsers with Java support are able to decompress and open the JAR file and then to run the applet or application that it contains. The JAR format also supports package sealing and electronic signing of the JAR contents, using a manifest file, which is placed at the beginning of the JAR file and lists the files present in the archive. These and other aspects of the JAR format and its use are described at http://java.sun.com/products/jdk/1.2/docs/guide/jar/.
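By way of illustration only, the bundling of several files into a single compressed JAR file can be sketched with the standard java.util.jar classes. The class name JarBundleSketch, its method names, and the sample entry names used with it are assumptions introduced for this sketch and do not appear in the JAR documentation cited above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

class JarBundleSketch {

    // Bundles the given files (entry name -> content) into one JAR, returned as bytes.
    static byte[] bundle(Map<String, byte[]> files) {
        Manifest manifest = new Manifest();
        // The Manifest API expects the version attribute to be set before writing.
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // JarOutputStream writes the manifest first and deflates entries by default.
        try (JarOutputStream jar = new JarOutputStream(buffer, manifest)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                jar.putNextEntry(new JarEntry(file.getKey()));
                jar.write(file.getValue());
                jar.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Lists the entries of a JAR in stream order. The leading manifest is
    // consumed by the JarInputStream constructor and is not listed here.
    static List<String> entryNames(byte[] jarBytes) {
        List<String> names = new ArrayList<>();
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            JarEntry entry;
            while ((entry = jar.getNextJarEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

Passing a map that holds, for example, one class file and one image file to bundle yields a single byte stream that can be transferred in one HTTP transaction, and entryNames recovers the bundled entry names at the receiving end.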
A major component in nearly any Java class file is the constant pool, which is a table containing symbolic and string information, such as variable names, method names and signatures, and field names. Almost all other data structures in the class file contain references (indices) to entries in the constant pool. In a single applet or application, composed of multiple class files, the same values typically appear in certain entries in the constant pools of many of the classes. Furthermore, many elements of the constant pools of the different classes typically have the same semantic content, such as names of methods, fields and other classes, although they may be syntactically different in the compiled byte code. In other words, entries in the constant pools belonging to different classes of the applet or application may share the same name in the class source code (common semantics), while containing different values in the byte code due to the differences in use of the named entries (differences of syntax) in the different classes. There is thus a great deal of redundancy in the contents of the constant pools. A particular example of the redundancy that normally occurs in constant pools is presented below in Tables I and II, in the Detailed Description of Preferred Embodiments.
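The distinction drawn above between common semantics and differing syntax can be illustrated by a simplified model. In the following sketch, each class has its own table of strings, and a method reference stores only indices into that table. The MethodRef type, the sample pools and the class names demo/A, demo/B and demo/Util are hypothetical, and the three-index layout is a simplification of the actual class file format, which reaches the name and descriptor through an intermediate entry.

```java
import java.util.Arrays;
import java.util.List;

class ConstantPoolRedundancySketch {

    // Simplified stand-in for a method reference entry. Like a real constant
    // pool entry, it holds only indices into the owning class's own table.
    static final class MethodRef {
        final int ownerIndex;
        final int nameIndex;
        final int descriptorIndex;

        MethodRef(int ownerIndex, int nameIndex, int descriptorIndex) {
            this.ownerIndex = ownerIndex;
            this.nameIndex = nameIndex;
            this.descriptorIndex = descriptorIndex;
        }

        // True when both entries would be stored with the same index values.
        boolean sameIndices(MethodRef other) {
            return ownerIndex == other.ownerIndex
                    && nameIndex == other.nameIndex
                    && descriptorIndex == other.descriptorIndex;
        }
    }

    // Hypothetical string tables of two classes that both call demo/Util.log.
    static final List<String> POOL_A =
            Arrays.asList("demo/A", "demo/Util", "log", "(Ljava/lang/String;)V");
    static final List<String> POOL_B =
            Arrays.asList("demo/B", "paint", "()V", "demo/Util", "log", "(Ljava/lang/String;)V");

    // The same call, as it would be recorded in each class.
    static final MethodRef REF_IN_A = new MethodRef(1, 2, 3);
    static final MethodRef REF_IN_B = new MethodRef(3, 4, 5);

    // Follows the indices to recover the symbolic meaning of a reference.
    static String resolve(List<String> strings, MethodRef ref) {
        return strings.get(ref.ownerIndex) + "." + strings.get(ref.nameIndex)
                + ":" + strings.get(ref.descriptorIndex);
    }
}
```

In this model, REF_IN_A and REF_IN_B hold different index values, yet both resolve to demo/Util.log:(Ljava/lang/String;)V. The two entries are thus syntactically different but semantically identical.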
U.S. Pat. No. 5,966,702, whose disclosure is incorporated herein by reference, describes a method for pre-processing and packaging class files that addresses the problem of syntactic redundancy among the class files. During pre-processing, each class file in a set of class files is examined to locate duplicate information in the form of redundant constants (Integer, Double, Utf8 and Long entries) contained in its constant pool. The duplicate constants are placed in a separate shared table, and all occurrences of such constants are removed from the respective constant pools of the individual class files. After removal of these shared constants, the individual class files are left with reduced constant pools. The class files and the shared table are packaged as a unit in a multi-class file, which is typically downloaded to a client (in a manner similar to the above-mentioned JAR file).
To run the set of class files at the client side, the Java Virtual Machine (JVM) must resolve constant references to determine whether to read the constant values from the shared table or from the reduced constant pools of the individual classes. A modification to the JVM is required (relative to the standard JVM that is currently used in standard Web browsers) in order to perform this sort of constant resolution. U.S. Pat. No. 5,966,702 makes no provision for reconstructing the original classes from which the multi-class file was constructed. Furthermore, although the method of this patent eliminates duplication of syntactically-identical constants, by moving them to the shared table, it makes no attempt to deal with constant pool entries that are syntactically different but semantically identical.
Preferred embodiments of the present invention provide improved methods, systems and software products for reducing the volume of data that must be transmitted in conveying a set of class files over a network. In these preferred embodiments, the constant pools of at least some of the classes, and preferably all of the classes, are consolidated into a global constant pool. In the course of this consolidation, multiple semantically-identical entries occurring in the different constant pools of the individual classes are replaced by a single entry in the global pool. References to the constant pool entries in the different classes are accordingly replaced by references to the corresponding entry in the global constant pool. The replacement of entries and references takes place regardless of whether or not the multiple entries are syntactically identical in their individual occurrences in the different classes.
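One possible way of carrying out such a consolidation is sketched below, under the assumption that each constant pool entry has already been resolved to a string expressing its symbolic meaning, so that semantically-identical entries compare equal. The class GlobalPoolSketch and its members are hypothetical names chosen for this illustration, which is not intended as a definitive implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GlobalPoolSketch {

    // Global pool: each distinct symbolic value is stored once, in first-seen order.
    final List<String> globalPool = new ArrayList<>();

    // Position of each value already present in the global pool.
    private final Map<String, Integer> indexOf = new HashMap<>();

    // Per class: for each original local index, the index of the same value
    // in the global pool.
    final Map<String, int[]> localToGlobal = new LinkedHashMap<>();

    // Merges one class's resolved constant pool entries into the global pool.
    void addClass(String className, List<String> resolvedEntries) {
        int[] mapping = new int[resolvedEntries.size()];
        for (int local = 0; local < resolvedEntries.size(); local++) {
            String value = resolvedEntries.get(local);
            Integer global = indexOf.get(value);
            if (global == null) {
                // First occurrence in any class: create the consolidated entry.
                global = globalPool.size();
                globalPool.add(value);
                indexOf.put(value, global);
            }
            mapping[local] = global;
        }
        localToGlobal.put(className, mapping);
    }

    // Rewrites a reference held by a class so that it points into the global pool.
    int toGlobal(String className, int localIndex) {
        return localToGlobal.get(className)[localIndex];
    }
}
```

When two classes hold the same value at different local indices, toGlobal returns the same global index for both, so that the rewritten references become identical.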
In some preferred embodiments of the present invention, the class files comprise Java classes of an applet or application, which are packaged together in a JAR file in compliance with Java standards. A mechanism is added to the JAR file to reconstruct the individual constant pools out of the global constant pool in the JAR file. This mechanism may be implemented either by appending a new class to the set in the JAR file or by modifying one of the existing classes in the file. After the JAR file has been downloaded to a client, this mechanism reconstructs the application or applet automatically, in a manner transparent to the JVM at the client side, as a first step in installing the application or running the applet.
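The reconstruction step can be sketched in simplified form as follows. The sketch assumes that the archive carries, for each class, a table listing the global pool index of each entry of the original constant pool in its original order. This per-class table, the class PoolReconstructionSketch and the method reconstruct are assumptions made for the purpose of illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PoolReconstructionSketch {

    // Rebuilds each class's own constant pool from the global pool.
    // layout maps a class name to the global indices of its original
    // entries, listed in their original local order (an assumed format).
    static Map<String, List<String>> reconstruct(List<String> globalPool,
                                                 Map<String, int[]> layout) {
        Map<String, List<String>> localPools = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> entry : layout.entrySet()) {
            List<String> local = new ArrayList<>();
            for (int globalIndex : entry.getValue()) {
                // Copy the consolidated entry back into its original position.
                local.add(globalPool.get(globalIndex));
            }
            localPools.put(entry.getKey(), local);
        }
        return localPools;
    }
}
```

Because each class receives back a complete pool of its own, the classes that result from this step can be handled by a JVM that has no knowledge of the global pool.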
Thus, by eliminating semantic redundancies in the constant pools, and not only syntactic duplication, preferred embodiments of the present invention generally provide greater reduction in the size of a given set of class files than do those of U.S. Pat. No. 5,966,702. Furthermore, because the different references to the entries in the individual constant pools of the various classes are replaced by multiple references to the same entry in the global constant pool, a still greater reduction in the size of the JAR file can be achieved when the JAR file is compressed. This enhanced compression is due to the fact that JAR files are conventionally compressed using "ZIP-like" (Ziv-Lempel) compression algorithms, which search for and encode multiple occurrences of identical strings found in the file. These advantages of the present invention are achieved without the need for modification of the standard JVM.
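The effect of identical references on compression can be observed with the standard java.util.zip.Deflater class, as in the following sketch. It compares a sequence of two-byte references that all carry the same index with a sequence in which the same number of references carry many different indices. The class ReferenceCompressionSketch and its methods are hypothetical, and the sequences are a simplification, since in actual class files the references are interleaved with other byte code.

```java
import java.util.zip.Deflater;

class ReferenceCompressionSketch {

    // Encodes each reference as a two-byte, big-endian index.
    static byte[] encode(int[] indices) {
        byte[] out = new byte[indices.length * 2];
        for (int i = 0; i < indices.length; i++) {
            out[2 * i] = (byte) (indices[i] >>> 8);
            out[2 * i + 1] = (byte) indices[i];
        }
        return out;
    }

    // Returns the number of bytes produced by deflating the given data.
    static int deflatedSize(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        byte[] chunk = new byte[1024];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(chunk);
        }
        deflater.end();
        return total;
    }
}
```

For example, one thousand references that all carry index 17 deflate to fewer bytes than one thousand references spread over fifty different index values, since the compressor finds longer runs of identical strings in the former.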
There is therefore provided, in accordance with a preferred embodiment of the present invention, a method for packaging program resources, including:
collecting a set of the program resources that includes a plurality of object files, which contain data structures having entries that are constants and methods that reference the entries;
combining the data structures in at least some of the object files into a common data pool, in which semantically-identical entries in different ones of the files are represented by a single consolidated entry, irrespective of whether the entries in the different files are syntactically identical; and
packaging the set of the program resources together with the common data pool in a combined output file.
Preferably, the object files include class files, and the class files include respective constant pools containing the data structures. Most preferably, combining the data structures includes consolidating substantially all of the data structures in the constant pools of all of the object files into the common pool, and removing substantially all of the data structures from all of the object files before packaging the object files in the combined output file.
Preferably, the object files include executable code, and combining the data structures includes scanning the code to identify the entries in the different files that are semantically identical. Most preferably, scanning the code includes finding first and second ones of the entries in the different files that reference a common element in one of the files in the set, while the first and second entries themselves are different, first and second constants. Additionally or alternatively, combining the data structures includes modifying the references to the semantically-identical entries so as to refer to the consolidated entry in the common data pool.
In a preferred embodiment, packaging the set of the program resources includes adding a program mechanism to the set which, when read by a computer receiving the packaged set of program resources, causes the computer to reconstruct the data structures in the object files from the common data pool. In another preferred embodiment, packaging the set of the program resources includes compressing the resources in the output file.
There is also provided, in accordance with a preferred embodiment of the present invention, a method for generating an archive file, including:
assembling a set of program resources that include a plurality of class files containing methods and respective constant pools;
combining the constant pools of the class files into a global constant pool, in which semantically-identical entries in the constant pools of different ones of the class files are represented by a single consolidated entry, irrespective of whether the entries in the different class files are syntactically identical; and
packaging the set of the program resources together with the global constant pool in the archive file.
In a preferred embodiment, the class files include Java classes, and the archive file includes a Java Archive (JAR) file. Preferably, packaging the set of the program resources includes creating the JAR file in such a manner that a standard Java Virtual Machine can, substantially without modification, open the JAR file and invoke the methods in the class files that are packaged therein.
Preferably, combining the constant pools includes consolidating substantially all of the constant pools of all of the class files into the global constant pool. Further preferably, consolidating substantially all of the constant pools includes creating the global constant pool in one of the class files, and removing the constant pools from all of the other class files before packaging the class files in the archive file. In a preferred embodiment, assembling the set of program resources includes collecting the program resources needed to run an applet, and creating the global constant pool in one of the class files includes specifying the one of the class files that is first invoked among the class files in order to run the applet, and creating the global constant pool in the specified class file.
Preferably, packaging the set of the program resources includes adding a program mechanism to the set which, when read by a computer receiving the packaged set of program resources, causes the computer to reconstruct the constant pools of the class files from the global constant pool.
There is additionally provided, in accordance with a preferred embodiment of the present invention, a method for packaging program resources, including:
collecting a set of class files containing methods and constant pools, which include data structures having entries that are constants;
consolidating the constant pools of the class files into a single, common pool, including substantially all of the data structures in all of the class files in the set; and
packaging the set of the class files, together with the common pool, in a combined output file.
Preferably, consolidating the constant pools includes removing the constant pools from the class files after consolidating the constant pools in the common pool, most preferably by placing the common pool in one of the class files, so that the class files no longer contain the constant pools.
There is further provided, in accordance with a preferred embodiment of the present invention, a method for packaging program resources, including:
assembling a set of the program resources that includes a plurality of class files containing methods and constant pools, which include data structures having entries that are constants;
consolidating at least a portion of the constant pools of the class files into a single, common pool;
adding to the set of resources a program mechanism which, when read by a computer receiving the class files with the single, common pool, causes the computer to reconstruct the constant pools in the class files from the common pool; and
packaging the set of the program resources, including the common pool and the program mechanism, in a combined output file.
In a preferred embodiment, assembling the set of program resources includes collecting the program resources needed to run an application, and adding the program mechanism includes providing the mechanism so that the computer will reconstruct the constant pools during a process of installation of the application on the computer.
In another preferred embodiment, assembling the set of program resources includes collecting the program resources needed to run an applet, and adding the program mechanism includes providing the mechanism so that the computer will reconstruct the constant pools during a process of initializing the applet. Preferably, adding the mechanism includes adding the mechanism to one of the classes that is first to be loaded by the computer when it runs the applet. Further preferably, adding the mechanism includes providing an initialization method in the class that is the first to be loaded, such that the initialization method engenders reconstruction of the constant pools. Most preferably, providing the initialization method includes adding a wrapper class containing the initialization method to the program resources needed to run the applet, and configuring the set of the program resources so that the wrapper class is the first to be loaded.
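The ordering imposed by such a wrapper class can be sketched as follows. The interface Entry stands in for the entry point of an applet, and the reconstruction step is represented by a Runnable supplied by the caller. All of the names in this sketch are hypothetical, and the sketch shows only the sequencing of the steps, not the loading of actual class files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class WrapperInitSketch {

    // Records the order in which the steps run, so that it can be inspected.
    static final List<String> steps = new ArrayList<>();

    // Stands in for the initialization entry point of an applet.
    interface Entry {
        void init();
    }

    // Stands in for the class that was originally the first to be loaded.
    static final class OriginalApplet implements Entry {
        @Override
        public void init() {
            steps.add("original init");
        }
    }

    // The wrapper is configured to be loaded first in place of the original class.
    static final class Wrapper implements Entry {
        private final Runnable reconstructPools;
        private final Supplier<Entry> loadOriginal;

        Wrapper(Runnable reconstructPools, Supplier<Entry> loadOriginal) {
            this.reconstructPools = reconstructPools;
            this.loadOriginal = loadOriginal;
        }

        @Override
        public void init() {
            // First, rebuild the constant pools of the individual classes.
            reconstructPools.run();
            // Only then obtain the original entry class and hand over control.
            loadOriginal.get().init();
        }
    }
}
```

The original class is obtained through a Supplier so that it is not touched until the reconstruction step has completed, reflecting the requirement that the wrapper be the first class to be loaded.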
In a further preferred embodiment, the class files include Java classes, and the combined output file includes a Java Archive (JAR) file. Preferably, adding the mechanism includes providing the mechanism in the JAR file in such a manner as to enable a standard Java Virtual Machine, substantially without modification, to open the JAR file and invoke the methods in the class files that are packaged therein.
There is moreover provided, in accordance with a preferred embodiment of the present invention, apparatus for packaging program resources, including an archive processor, which is arranged to collect a set of the program resources that includes a plurality of object files, which contain data structures having entries that are constants and methods that reference the entries, to combine the data structures in at least some of the object files into a common data pool, in which semantically-identical entries in different ones of the files are represented by a single consolidated entry, irrespective of whether the entries in the different files are syntactically identical, and to package the set of the program resources together with the common data pool in a combined output file.
There is furthermore provided, in accordance with a preferred embodiment of the present invention, apparatus for generating an archive file, including an archive processor, which is arranged to assemble a set of program resources that include a plurality of class files containing methods and respective constant pools, to combine the constant pools of the class files into a global constant pool, in which semantically-identical entries in the constant pools of different ones of the class files are represented by a single consolidated entry, irrespective of whether the entries in the different class files are syntactically identical, and to package the set of the program resources together with the global constant pool in the archive file.
There is also provided, in accordance with a preferred embodiment of the present invention, apparatus for packaging program resources, including an archive processor, which is arranged to assemble a set of class files containing methods and constant pools, which include data structures having entries that are constants, to consolidate the constant pools of the class files into a single, common pool, including substantially all of the data structures in all of the class files in the set, and to package the set of the class files, together with the common pool, in a combined output file.
There is additionally provided, in accordance with a preferred embodiment of the present invention, apparatus for packaging program resources, including an archive processor, which is arranged to assemble a set of the program resources that includes a plurality of class files containing methods and constant pools, which include data structures having entries that are constants, to consolidate at least a portion of the constant pools of the class files into a single, common pool, to add to the set of resources a program mechanism which, when read by a computer receiving the class files with the single, common pool, causes the computer to reconstruct the constant pools in the class files from the common pool, and to package the set of the program resources, including the common pool and the program mechanism, in a combined output file.
There is further provided, in accordance with a preferred embodiment of the present invention, a computer program product for packaging program resources, the product including a computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to collect a set of the program resources that includes a plurality of object files, which contain data structures having entries that are constants and methods that reference the entries, to combine the data structures in at least some of the object files into a common data pool, in which semantically-identical entries in different ones of the files are represented by a single consolidated entry, irrespective of whether the entries in the different files are syntactically identical, and to package the set of the program resources together with the common data pool in a combined output file.
There is moreover provided, in accordance with a preferred embodiment of the present invention, a computer program product for generating an archive file, the product including a computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to assemble a set of program resources that include a plurality of class files containing methods and respective constant pools, to combine the constant pools of the class files into a global constant pool, in which semantically-identical entries in the constant pools of different ones of the class files are represented by a single consolidated entry, irrespective of whether the entries in the different class files are syntactically identical, and to package the set of the program resources together with the global constant pool in the archive file.
There is furthermore provided, in accordance with a preferred embodiment of the present invention, a computer program product for packaging program resources, the product including a computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to assemble a set of class files containing methods and constant pools, which include data structures having entries that are constants, to consolidate the constant pools of the class files into a single, common pool, including substantially all of the data structures in all of the class files in the set, and to package the set of the class files, together with the common pool, in a combined output file.
There is also provided, in accordance with a preferred embodiment of the present invention, a computer program product for packaging program resources, the product including a computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to assemble a set of the program resources that includes a plurality of class files containing methods and constant pools, which include data structures having entries that are constants, to consolidate at least a portion of the constant pools of the class files into a single, common pool, to add to the set of resources a program mechanism which, when read by a client computer receiving the class files with the single, common pool, causes the client computer to reconstruct the constant pools in the class files from the common pool, and to package the set of the program resources, including the common pool and the program mechanism, in a combined output file.
The present invention will be more fully understood from the following detailed description of the preferred embodiments thereof, taken together with the drawings in which: