This invention relates generally to processing of program files written in a high-level language, and specifically to program files written in a high-level programming language capable of dynamic loading.
As the number and type of hand-held and other electronic devices increases, there is a corresponding increase in the applications that run on and interface with these devices, as well as an increase in the desired flexibility for the user to add special programs and functionality. These devices are sometimes referred to as “embedded devices,” as they include a processor for executing instructions, special function units, and the instructions for providing the desired functionality. Embedded devices are typically stand-alone devices, having their own power supply, but often include the capability to interface with other systems. For example, embedded devices such as cellular phones, pagers, and personal digital assistants (PDAs) typically include a central processing unit (CPU) for executing computer programs stored within the device and a battery allowing mobility. Subsequent to manufacture of an embedded device, individual users may desire to customize their device by adding special functionality or applications. It is desirable to use computer programs, or codes, written in a high-level programming language, such as Java™, a language developed by Sun Microsystems, Inc., which facilitate the later installation of user-supplied applications. Java is particularly attractive, as it is platform-independent, meaning that it is not specific to one operating system or hardware configuration.
One constraint in developing code for an embedded device is the limited amount of memory, which reduces the amount of code that a device is able to store and also impacts the computing capabilities of the device. A key goal in designing code for an embedded device is then to maximize memory efficiency and speed of installed applications. Currently, several methods of increasing the memory efficiency of embedded applications exist; however, these methods do not generally extend to the subsequent installation of additional applications by the user.
For ease of illustration, the Java programming language serves as an exemplar; however, the present invention is applicable to other programming languages as well. Several terms will be used herein with reference to a system having embedded Java program(s). Memory refers to Read Only Memory (ROM), writable memory, and readable-writable memory. Readable-writable memory may be Random Access Memory (RAM), Electrically-Erasable-Programmable Memory (EEPROM), Programmable Memory (PROM), including both One-Time PROM (OTPROM) and Erasable PROM (EPROM), FLASH Memory, etc. The term “dynamic memory” is used to refer to memory that is dynamically allocated and does not retain stored information or data when power is removed from the device, such as RAM. The term “permanent memory” is used to refer to memory that is treated as read-only during execution, and retains stored information or data when power is removed from the device, such as ROM.
Java, in particular, is an object-oriented programming language that is portable, easy to program, and architecture-neutral. Object-oriented design focuses on the data, referred to as “objects,” as well as the interfaces to the objects. The Java program is able to execute anywhere within a network including a variety of processing units and operating system architectures.
Java programs are both compiled and interpreted. Compilation is done once, where compiled Java programming code is referred to as “Java ByteCode” (JBC). The JBC is an intermediate language that is architecture-neutral or platform-independent. A Java interpreter parses and runs JBC instructions on a processor. Interpretation occurs each time the program is executed. A Java binary file, referred to as a class file, includes the JBC for a given program as well as supporting information, such as symbolic data. A class file, or program file, includes “items” or information about the class, such as fields, methods, JBC arrays, and a symbolic reference table. Specifically, a Java program is composed of one or a set of Java files, which, on compilation, produce one or a set of class files.
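As an illustration of the class file layout described above, the following minimal sketch reads the fixed-size header that begins a standard Java class file: a four-byte magic number (0xCAFEBABE), a two-byte minor version, a two-byte major version, and a two-byte constant pool count. The class name `ClassFileHeader` and its fields are hypothetical names chosen for this example, and the sketch stops before the constant pool entries themselves.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper that reads only the first ten bytes of a class file. */
final class ClassFileHeader {
    static final int MAGIC = 0xCAFEBABE; // magic number that opens every class file

    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount; // number of constant pool entries plus one

    private ClassFileHeader(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    /** Parses the header; throws IllegalArgumentException on malformed input. */
    static ClassFileHeader parse(byte[] classBytes) {
        if (classBytes.length < 10) {
            throw new IllegalArgumentException("truncated class file header");
        }
        ByteBuffer in = ByteBuffer.wrap(classBytes); // ByteBuffer is big-endian by default
        if (in.getInt() != MAGIC) {
            throw new IllegalArgumentException("bad magic number");
        }
        int minor = in.getShort() & 0xFFFF; // mask to read the value as unsigned
        int major = in.getShort() & 0xFFFF;
        int cpCount = in.getShort() & 0xFFFF;
        return new ClassFileHeader(minor, major, cpCount);
    }
}
```

The constant pool count is one greater than the number of usable entries because index zero of the pool is reserved.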
JBC is effectively the machine code instructions for a “Java Virtual Machine” (Java VM). Every Java interpreter, such as a Java development tool or a Java-capable web browser, uses an implementation of the Java VM. Often, these tools will either use the Java VM already installed on a system, or may come bundled with a Java VM. Note that the Java VM may also be implemented in hardware. In this way, the program may be compiled on any machine with a Java compiler and the resulting JBC may run on any implementation of the Java VM.
In order to make applications written in Java portable, much symbolic information is maintained. During normal Java VM execution of the JBC, the symbolic data is used by the Java VM to perform the dynamic binding whereby the actual pointer to the referenced structure is obtained. For example, each reference to a function is represented by the symbolic information: class name; function name; and signature. The class name identifies the class object containing the declaration of the method. The methods identify the various functions available for that class, and the JBC arrays are programs executed to implement a method. The function name, together with the signature, identifies the given function within its class. The signature describes the number and type of arguments passed to and returned by a function. The symbolic information expands the size of the Java binary file, which creates memory storage problems. During execution, two (2) copies of the JBC and the symbolic information are maintained: a first copy is stored in permanent memory; and a second copy is stored in dynamic memory in a format easily manipulated by the Java VM. For small embedded devices, such as pagers, cellular phones, and PDAs, dynamic memory is very limited. It is, therefore, desirable to reduce dynamic memory usage. An additional problem is latency during execution due to the use of costly table lookups for handling symbolic references.
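The dynamic binding described above may be pictured with the following minimal sketch, which assumes a symbolic reference is modeled as a (class name, function name, signature) triple and that an “address” is a plain integer. The names `SymbolicRef` and `DynamicBinder` are hypothetical and do not correspond to any actual Java VM interface; the sketch only shows that every symbolic reference costs a table lookup before the actual location is known.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical symbolic reference to a function: class name, function name, signature. */
final class SymbolicRef {
    final String className;
    final String functionName;
    final String signature;

    SymbolicRef(String className, String functionName, String signature) {
        this.className = className;
        this.functionName = functionName;
        this.signature = signature;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SymbolicRef)) return false;
        SymbolicRef r = (SymbolicRef) o;
        return className.equals(r.className)
                && functionName.equals(r.functionName)
                && signature.equals(r.signature);
    }

    @Override
    public int hashCode() {
        return Objects.hash(className, functionName, signature);
    }
}

/** Hypothetical binder that maps symbolic references to integer "addresses". */
final class DynamicBinder {
    private final Map<SymbolicRef, Integer> addresses = new HashMap<>();
    private int lookups = 0;

    /** Records where a function actually lives. */
    void define(SymbolicRef ref, int address) {
        addresses.put(ref, address);
    }

    /** Performs the table lookup that turns a symbolic reference into an address. */
    int resolve(SymbolicRef ref) {
        lookups++;
        Integer address = addresses.get(ref);
        if (address == null) {
            throw new IllegalStateException("unresolved reference: "
                    + ref.className + "." + ref.functionName + ref.signature);
        }
        return address;
    }

    /** Number of lookups performed so far, a rough proxy for binding overhead. */
    int lookupCount() {
        return lookups;
    }
}
```

In this model, both the triple and the lookup table occupy memory at run time, which mirrors the storage and latency costs noted above.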
To address some of these problems, tools allow for compacting and formatting of Java class files to more efficiently use memory. “Pre-internalization” is a process of reformatting Java class file information into a format that, when placed in memory, represents a class object. Internalization occurs during loading of a class file and is the process of extracting the class information from a class file and storing the information in structure(s) in dynamic memory. The pre-internalization process eliminates symbolic references and class loading, reducing dynamic memory storage requirements. The format of the pre-internalized file is specific to each Java VM implementation.
Pre-internalization occurs after compilation but prior to normal loading and execution of the JBC. Pre-internalized class objects are restructured to eliminate the need to store them in dynamic memory, and are maintained in permanent memory. This frees more of the dynamic memory for creation of dynamic objects during execution. Information and structures used to maintain state preservation, as well as dynamic objects, are stored in what is referred to as the “Java heap.” A problem exists with pre-internalization as storing class information in permanent memory eliminates the ability to update this information during execution.
Current solutions avoid storing the symbolic information in dynamic memory by requiring that all symbolic references be resolved prior to class installation on the target device. Resolving references involves replacing the reference with the location of the referenced item, i.e., an address. A problem exists in pre-internalizing a set of class files, or “class file unit,” where a reference is made to classes already installed on the device and for which the location of the referenced item is either unknown or unreliable. To avoid duplicating the referenced class information, the installed classes are removed from the device and repackaged with the set of new files.
As discussed herein above, each set of class files has an associated table of symbolic references, referred to as a “Constant Pool” (CP) or a symbol table. The CP of the class file unit is a composite of the individual CPs for each class in the class file unit. The CP of a class file unit may be referred to as a “shared CP.” A JBC instruction, part of a JBC array, may include a reference, in the form of an index, to an entry in the shared CP of the class file unit. Once a class is pre-internalized, the CP contains the item address in a corresponding location.
FIG. 1 illustrates a prior art program file format. As illustrated, program file unit 2 includes two (2) program files labeled “Class 1” and “Class 2,” respectively, and a shared symbol table, shared CP 4. Each program file, Class 1 and Class 2, has associated elements including fields, methods, and binary code. The binary code is the JBC array or bytecode array. Within each program file, Class 1 and Class 2, the methods are mapped to specific JBC arrays which implement the corresponding method. A JBC array then includes reference(s) to locations within shared CP 4. These references from the JBC array to the shared CP 4 are symbolic references. For clarity in the Figures, symbolic references are indicated by dashed lines with directional arrows, while actual locations are identified by solid lines with directional arrows. Each program file, Class 1 and Class 2, has an actual reference to the address of the shared CP 4. Similarly, entries in the CP 4 have actual references to the location of methods and fields in each program file. A first entry in the shared CP 4 is typically reserved, indicated in FIG. 1 by the hatched lines.
As illustrated, one JBC array may include references to multiple locations in CP 4, by way of an index. The Java VM takes the entry of the indexed location in the shared CP and identifies the access address. The entries in CP 4 identify fields, methods, or other program files within program file unit 2. Program file units may include any number of program files, which may be grouped according to a variety of schemes, typically to facilitate a given application.
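The indexed lookup described above may be sketched as follows, under the assumption that a shared CP is a list whose entries start out symbolic and later hold an integer address once resolved, with the first entry reserved. The class `SharedConstantPool` and its method names are hypothetical and serve only to illustrate the idea.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical shared constant pool addressed by index; index 0 is reserved. */
final class SharedConstantPool {
    private final List<String> symbols = new ArrayList<>();    // symbolic name of each entry
    private final List<Integer> addresses = new ArrayList<>(); // resolved address, or null

    SharedConstantPool() {
        // Reserve the first entry so that valid indices start at 1.
        symbols.add(null);
        addresses.add(null);
    }

    /** Adds a symbolic entry and returns the index a JBC instruction would carry. */
    int add(String symbol) {
        symbols.add(symbol);
        addresses.add(null);
        return symbols.size() - 1;
    }

    /** Replaces the symbolic entry with the location of the referenced item. */
    void resolve(int index, int address) {
        check(index);
        addresses.set(index, address);
    }

    /** Returns the access address for an index, as the Java VM would during execution. */
    int addressAt(int index) {
        check(index);
        Integer address = addresses.get(index);
        if (address == null) {
            throw new IllegalStateException(
                    "entry " + index + " is still symbolic: " + symbols.get(index));
        }
        return address;
    }

    private void check(int index) {
        if (index <= 0 || index >= symbols.size()) {
            throw new IndexOutOfBoundsException("bad CP index " + index);
        }
    }
}
```

Because a JBC array refers to an entry only by index, several JBC arrays in different program files of the same unit can share one entry.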
FIG. 2 illustrates a prior art method of processing program files, also referred to as class files. Program files 8, labeled “Class 1,” “Class 2,” and “Class 3,” are each similar to the program files of FIG. 1. Preprocessor 10 loads the program files 8 and generates formatted class file information 12. The formatting converts each program file, Class 1, Class 2, and Class 3, to class objects for use during execution, where the class objects are specific to the JVM implementation used. Preprocessor 10 is a tool that may be implemented in software or hardware. The formatted class file information 12 is structured for storage in a target device, where the device has both dynamic and permanent memory. The compiler and linker 16 combines the formatted class file information 12 with the Java VM source code 14. The Java VM source code 14 is specific to the Java VM implemented in the target device. The output of the compiler and linker 16 is the Java VM image 18, which has two portions: a first stores the Java VM 19; and a second stores preloaded class information 20. The preloaded class information 20 is the compiled version of the formatted class file information.
At this point, the Java VM image 18 is stored in device memory 22 of the target device. In this case, the device memory 22 includes dynamic memory 26, and permanent memory 24. In one embodiment, the dynamic memory 26 is implemented with RAM and the permanent memory 24 is implemented with ROM and/or FLASH memory. The dynamic memory 26 is used during execution for storing interim values and variables. The permanent memory 24 includes multiple portions: a first portion 28 for storing a class loader; a second portion 30 for storing a JBC interpreter; and a third portion 32 for storing the preloaded class information 20. The class loader of portion 28 is used for formatting binary class information for use by the Java VM, and is not required for preloaded program files 20, as they were formatted during preprocessing and compilation. The Java VM image 18 stored in the device memory 22 is used to run the Java program files, Class 1, Class 2, Class 3 in program file unit 8. The device memory 22 is then implemented within a device, such as a hand-held device or other application.
It is often desirable to add additional program files to the Java VM image in the device memory 22. Often the program file unit 8 describes basic functionality for an application, but the user is free to add supplemental functionality, such as user-specified or application-specific additions that enhance the device. There are several ways to add program files. In a first prior art method, the process 6 is performed again with the additional program files included with the program file unit 8. The resultant Java VM image 18 then includes the preloaded class information for all the program files, those in program file unit 8 plus the additional ones. This method is not flexible as it requires either the user to return to the manufacturer to have the new program files included in the processing, or the manufacturer to provide the program file unit 8 to the user and allow the user to perform the processing. According to a second prior art method, illustrated in FIG. 3, additional program files 42, including program files labeled “Class 4,” “Class 5,” and “Class 6,” are stored in additional permanent memory 40, such as FLASH memory. The program files 42 are then loaded by the class loader stored in portion 28 of permanent memory 24 into dynamic memory 26 on execution. The need to store the loaded program files, Class 4, Class 5, and Class 6, in dynamic memory 26 reduces the space available for the Java heap. The loss of dynamic memory space creates a situation where even if all the program files 42 are stored in dynamic memory 26, the remaining available dynamic memory space is insufficient to store variables during execution, i.e., the program cannot execute. Since dynamic memory space is typically limited, maximizing the amount of space available for the Java heap is crucial in many applications.
A need therefore exists for a method of processing program files that allows the addition of user-specific program files to basic functional program files without reducing the amount of dynamic memory available for execution.
One embodiment of the present invention relates to a method for processing program files in a programming language capable of dynamic loading. The method includes receiving a first and a second program file, and generating a first program file unit having a first shared symbol table, a first formatted program file corresponding to the first program file, and a second formatted program file corresponding to the second program file. The first shared symbol table includes references internal to the first program file unit, each element of the first shared symbol table has a corresponding index, and each of the first and second formatted program files include at least one reference to the first shared symbol table. The method further includes generating a first mapping mechanism, wherein the first mapping mechanism includes symbolic information corresponding to at least a portion of the indices of the first shared symbol table. The method further includes receiving a third program file, and generating a second program file unit having a second shared symbol table and a third formatted program file corresponding to the third program file, where the second program file unit is generated after the first program file unit is generated. The second shared symbol table is separate from the first shared symbol table, the second shared symbol table includes references internal to the second program file unit, each element of the second shared symbol table has a corresponding index, and the third formatted program file includes at least one internal reference to the second shared symbol table and at least one external reference to the first shared symbol table.
Another embodiment of the present invention relates to a preprocessor capable of processing program files in a programming language capable of dynamic loading including a first plurality of instructions for receiving a first and a second program file and a second plurality of instructions for generating a first program file unit having a first shared symbol table, a first formatted program file corresponding to the first program file, and a second formatted program file corresponding to the second program file. The first shared symbol table includes references internal to the first program file unit, each element of the first shared symbol table has a corresponding index, and each of the first and second formatted program files include references to the first shared symbol table. The preprocessor further includes a third plurality of instructions for generating a first mapping mechanism, where the first mapping mechanism includes symbolic information corresponding to at least a portion of the indices of the first shared symbol table, a fourth plurality of instructions for receiving a third program file, and a fifth plurality of instructions for generating a second program file unit having a second shared symbol table and a third formatted program file corresponding to the third program file, where the second program file unit is generated after the first program file unit is generated. The second shared symbol table is separate from the first shared symbol table, the second shared symbol table includes references internal to the second program file unit, each element of the second shared symbol table has a corresponding index, and the third formatted program file includes at least one internal reference to the second shared symbol table and at least one external reference to the first shared symbol table.
Yet another embodiment relates to a method for processing program files in a programming language capable of dynamic loading including receiving a first and a second program file and generating a first program file unit having a first shared symbol table, a first formatted program file corresponding to the first program file, and a second formatted program file corresponding to the second program file. The first shared symbol table includes references internal to the first program file unit, each element of the first shared symbol table has a corresponding index, and each of the first and second formatted program files include at least one reference to the first shared symbol table. The method further includes generating a first mapping mechanism, where the first mapping mechanism includes symbolic information corresponding to at least a portion of the indices of the first shared symbol table, receiving a third program file, and generating a second program file unit having a second shared symbol table and a third formatted program file corresponding to the third program file. The second shared symbol table includes references internal to the second program file unit, each element of the second shared symbol table has a corresponding index, and the third formatted program file includes at least one internal reference to the second shared symbol table and at least one external reference to the first shared symbol table. The method further includes storing the first program file unit in a permanent memory of a semiconductor device prior to generating the second program file unit.
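The relationship among the two program file units, their separate shared symbol tables, and the mapping mechanism may be pictured with the following minimal sketch. It assumes, purely for illustration, that the mapping mechanism is a map from symbol name to index in the first shared symbol table, and that indices start at zero. The class `ProgramFileUnit`, its nested `Ref` type, and all method names are hypothetical and are not taken from any actual preprocessor.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical program file unit with its own shared symbol table. */
final class ProgramFileUnit {
    /** A reference by index; external means it targets the earlier unit's table. */
    static final class Ref {
        final boolean external;
        final int index;

        Ref(boolean external, int index) {
            this.external = external;
            this.index = index;
        }
    }

    private final List<String> sharedTable = new ArrayList<>();
    private final Map<String, Integer> indexBySymbol = new HashMap<>();
    private final Map<String, Integer> importedMapping; // mapping from an earlier unit

    private ProgramFileUnit(Map<String, Integer> importedMapping) {
        this.importedMapping = importedMapping;
    }

    /** Creates a first unit, which has no earlier unit to refer to. */
    static ProgramFileUnit first() {
        return new ProgramFileUnit(new HashMap<>());
    }

    /** Creates a later unit that may refer to the earlier unit through its mapping. */
    static ProgramFileUnit after(ProgramFileUnit earlier) {
        return new ProgramFileUnit(earlier.exportMapping());
    }

    /** Defines a symbol owned by this unit and returns its index in this unit's table. */
    int define(String symbol) {
        Integer existing = indexBySymbol.get(symbol);
        if (existing != null) return existing;
        sharedTable.add(symbol);
        int index = sharedTable.size() - 1;
        indexBySymbol.put(symbol, index);
        return index;
    }

    /** Produces the reference a formatted program file would carry for a symbol. */
    Ref reference(String symbol) {
        Integer own = indexBySymbol.get(symbol);
        if (own != null) return new Ref(false, own);        // internal reference
        Integer earlier = importedMapping.get(symbol);
        if (earlier != null) return new Ref(true, earlier); // external reference
        throw new IllegalArgumentException("unknown symbol: " + symbol);
    }

    /** The mapping mechanism: symbolic information for the indices of this unit's table. */
    Map<String, Integer> exportMapping() {
        return Collections.unmodifiableMap(new HashMap<>(indexBySymbol));
    }
}
```

In this sketch the first unit is never modified when the second unit is generated; the second unit needs only the exported mapping, which is consistent with storing the first unit in permanent memory beforehand.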