1. Field of the Invention
This invention relates to computer programs and, more specifically, to minimizing internal structures at program runtime.
Portions of the disclosure of this patent document may contain material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office file or records, but otherwise reserves all copyright rights whatsoever. Sun, Sun Microsystems, the Sun logo, Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.
2. Background Art
Computer systems comprise resources, such as memory and processor (or processing) time, that are used to execute computer programs. A computer program's code (or instruction set) is typically copied into the computer system's memory, or other storage, before it is executed, thereby using processor time. In addition to the computer program's code, the computer system's storage may also be used to retain information about the state of the computer program during execution. This information adds overhead to the execution, as it is necessary to allocate memory to store the information and to use other resources such as processor time to manage the information.
Examples of program information that is stored during program execution are symbol, field and method tables. These tables are examples of internal structures that use space in memory during program execution. Further, processor time is used to maintain the information contained in the internal structures. It would be beneficial to be able to reduce the internal structures that are used during execution, thereby reducing memory use and processor time during program execution.
The problems associated with the use of internal structures during program execution can be better understood from a review of a virtual machine's processing environment and an overview of object-oriented programming.
Object-oriented programming is a method of creating computer programs by combining certain fundamental building blocks, and creating relationships among and between the building blocks. The building blocks in object-oriented programming systems are called "objects." A software application can be written using an object-oriented programming language whereby the program's functionality is implemented using these objects.
An object is a programming unit that groups together a data structure (one or more instance variables) and the operations (methods) that can use or affect that data. Thus, an object consists of data and one or more operations or procedures that can be performed on that data. The joining of data and operations into a unitary building block is called "encapsulation."
An object can be instructed to perform one of its methods when it receives a "message." A message is a command or instruction sent to the object to execute a certain method. A message consists of a method selection (e.g., method name) and a plurality of arguments. A message tells the receiving object what operations to perform.
One advantage of object-oriented programming is the way in which methods are invoked. When a message is sent to an object, it is not necessary for the message to instruct the object how to perform a certain method. It is only necessary to request that the object execute the method. This greatly simplifies program development.
Object-oriented programming languages are predominantly based on a "class" scheme. The class-based object-oriented programming scheme is generally described in Lieberman, "Using Prototypical Objects to Implement Shared Behavior in Object-Oriented Systems," OOPSLA 86 Proceedings, September 1986, pp. 214-223.
An object class provides a definition for an object which typically includes both variables and methods. An object class is used to create a particular object "instance." (The term "object" by itself is often used interchangeably to refer to a particular class or a particular instance.) An instance of an object class includes the variables and methods defined for the class. Multiple instances can be created from the same object class. Each instance that is created from the object class is said to be of the same type or class.
To illustrate, an employee object class can include "name" and "salary" instance variables and a "set_salary" method. Instances of the employee object class can be created, or instantiated, for each employee in an organization. Each object instance is said to be of type "employee." Each employee object instance includes "name" and "salary" instance variables and the "set_salary" method. The values associated with the "name" and "salary" variables in each employee object instance contain the name and salary of an employee in the organization. A message can be sent to an employee's employee object instance to invoke the "set_salary" method to modify the employee's salary (i.e., the value associated with the "salary" variable in the employee's employee object).
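By way of illustration only, the employee object class described above might be written in the Java programming language as follows. The class and accessor names (Employee, EmployeeDemo, setSalary, getName, getSalary) and the sample values are assumptions made for this sketch; the text names only the "name" and "salary" variables and the "set_salary" method.

```java
// Illustrative sketch of the employee object class; names are assumptions.
class Employee {
    private final String name;   // the "name" instance variable
    private double salary;       // the "salary" instance variable

    Employee(String name, double salary) {
        this.name = name;
        this.salary = salary;
    }

    // Corresponds to the "set_salary" method: modifies this instance's salary.
    void setSalary(double newSalary) {
        this.salary = newSalary;
    }

    String getName() { return name; }

    double getSalary() { return salary; }
}

class EmployeeDemo {
    public static void main(String[] args) {
        // Two instances of the same class, each with its own variable values.
        Employee first = new Employee("Alice", 50000.0);
        Employee second = new Employee("Bob", 42000.0);

        // "Sending a message" is a method invocation on one instance.
        first.setSalary(55000.0);

        System.out.println(first.getName() + " " + first.getSalary());
        System.out.println(second.getName() + " " + second.getSalary());
    }
}
```

Invoking setSalary on one instance leaves the other instance's "salary" value unchanged, since each instance holds its own copy of the instance variables.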
A hierarchy of classes can be defined such that an object class definition has one or more subclasses. A subclass inherits its parent's (and grandparent's etc.) definition. Each subclass in the hierarchy may add to or modify the behavior specified by its parent class. Some object-oriented programming languages support multiple inheritance where a subclass may inherit a class definition from more than one parent class. Other programming languages support only single inheritance, where a subclass is limited to inheriting the class definition of only one parent class. The Java programming language also provides a mechanism known as an "interface" which comprises a set of constant and abstract method declarations. An object class can implement the abstract methods defined in an interface. Both single and multiple inheritance are available to an interface. That is, an interface can inherit an interface definition from more than one parent interface.
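The distinction between single inheritance for classes and multiple inheritance for interfaces can be sketched as follows. All type names here (Named, Payable, PaidStaff, Person, Contractor) and the BASE_PAY constant are hypothetical and serve only to illustrate the mechanism.

```java
// Two independent interfaces, each declaring an abstract method.
interface Named {
    String name();
}

interface Payable {
    int BASE_PAY = 1000;   // interface constant (implicitly public static final)
    int pay();
}

// An interface may inherit from more than one parent interface.
interface PaidStaff extends Named, Payable { }

// A class has exactly one parent class (single inheritance).
class Person {
    String describe() { return "person"; }
}

// The subclass inherits from Person and implements the interface's abstract methods.
class Contractor extends Person implements PaidStaff {
    @Override
    public String name() { return "contractor"; }

    @Override
    public int pay() { return BASE_PAY * 2; }

    // A subclass may modify behavior specified by its parent class.
    @Override
    String describe() { return "contract " + super.describe(); }
}
```

A Contractor instance can be used wherever a Person, Named, Payable or PaidStaff is expected.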
Object-oriented software applications (e.g., applications written using the Java programming language) typically comprise one or more object classes and interfaces. Many programming languages can be used to write a program which is compiled into machine-dependent (or platform-dependent), executable code. However, in other languages such as the Java programming language, program code (e.g., classes) may be compiled into platform-independent bytecode class files. Each class contains code and data in a platform-independent format. A bytecode includes a code that identifies an instruction (an opcode) and zero or more operands to be used in executing the instruction. The computer system acting as the execution vehicle contains a program such as a virtual machine, which is responsible for executing the platform-independent code (e.g., bytecodes generated from a program written using the Java programming language).
Platform-independent programs have an advantage of being usable on multiple platforms. There is no need to develop program code for multiple platforms. The same platform-independent program can be executed on multiple platforms using a virtual machine or other mechanism that is configured to translate the platform-independent code into platform-dependent code. Thus, an application developer can develop one version of an application's program code that can ultimately be executed on multiple platforms, for example.
Applications may be designed as standalone applications, or as "applets" which are identified by an applet tag in an HTML (Hypertext Markup Language) document, and loaded by a browser application. The class files associated with an application or applet may be stored on the local computing system, or on a server accessible over a network. Each class file is loaded into the virtual machine, as needed, by the "class loader."
The classes of an applet are loaded on demand from the network (stored on a server), or from a local file system, when first referenced during the applet's execution. The virtual machine locates and loads each class file, parses the class file format, allocates memory for the class's various components, and links the class with other already loaded classes. This process makes the code in the class readily executable by the virtual machine. Native code, e.g., in the form of a linked library, is loaded when a class file containing the associated native method is instantiated within the virtual machine.
FIG. 1 illustrates the development and runtime environments for a processing system. In the development environment, a software developer creates source files 100 written using a programming language such as the Java programming language, which contain the programmer readable class definitions, including data structures, method implementations and references to other classes. Class source files 100 are provided to compiler 101, which compiles class source files 100 into compiled ".class" (or class) files 102 that contain bytecodes executable by a virtual machine. Class files 102 are stored (e.g., in temporary or permanent storage) on a server, and are available for download over a network. Alternatively, class files 102 may be stored locally in a directory on the client platform.
The runtime environment contains a virtual machine (VM) 105 which is able to execute bytecode class files and execute native operating system ("O/S") calls to operating system 109 when necessary during execution. Virtual machine 105 provides a level of abstraction between the machine independence of the bytecode classes and the machine-dependent instruction set of the underlying computer hardware 110, as well as the platform-dependent calls of operating system 109.
Class loader and bytecode verifier ("class loader") 103 is responsible for loading bytecode class files 102 and supporting class libraries 104 into virtual machine 105 as needed. Class loader 103 also verifies the bytecodes of each class file to maintain proper execution and enforcement of security rules. Within the context of runtime system 108, either an interpreter 106 executes the bytecodes directly, or a "just-in-time" (JIT) compiler 107 translates the bytecodes into machine code, so that they can be executed by the processor (or processors) in hardware 110.
Linked library 111 can be, for example, a "shared object" library in the Solaris™ or UNIX environment that is written as a ".so" file, or linked library 111 may take the form of a dynamic linked (or loadable) library written as a ".dll" file in a Windows environment. Native code (e.g., in the form of linked library 111) is loaded when a class containing the associated native method is instantiated within the virtual machine, or by invoking a "load library" method, for example.
Interpreter 106 reads, interprets and executes a bytecode instruction before continuing on to the next instruction. JIT compiler 107 can translate multiple bytecode instructions into machine code that are then executed. Compiling the bytecodes prior to execution results in faster execution. If, for example, the same bytecode instruction is executed multiple times in a program's execution, it must be interpreted each time it is executed using interpreter 106. If JIT compiler 107 is used to compile the program, the bytecode instruction may be translated once regardless of the number of times it is executed in the program. Further, if the compilation (i.e., output of JIT compiler 107) is retained, there is no need to translate each instruction during program execution.
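The read, interpret and execute cycle of an interpreter can be illustrated with a deliberately simplified sketch. The opcodes below (HALT, PUSH, ADD, MUL), their numeric values, and the class name TinyInterpreter are invented for this example and do not correspond to actual Java virtual machine bytecodes; the sketch only shows an opcode followed by zero or more operands being decoded one instruction at a time against an operand stack.

```java
// Hypothetical, minimal stack interpreter; opcodes are NOT real JVM bytecodes.
final class TinyInterpreter {
    static final int HALT = 0;   // no operands
    static final int PUSH = 1;   // one operand: the value to push
    static final int ADD  = 2;   // no operands
    static final int MUL  = 3;   // no operands

    // Executes the code array and returns the value on top of the operand stack.
    static int run(int[] code) {
        int[] stack = new int[16];
        int sp = 0;   // operand stack pointer
        int pc = 0;   // program counter

        while (true) {
            int opcode = code[pc++];          // read the next instruction
            switch (opcode) {                 // interpret it, then execute it
                case PUSH:
                    stack[sp++] = code[pc++]; // consume the single operand
                    break;
                case ADD: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left + right;
                    break;
                }
                case MUL: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left * right;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // Computes (2 + 3) * 4.
        int[] program = { PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT };
        System.out.println(run(program));
    }
}
```

Every instruction passes through the decode step each time it is reached, which is the repeated interpretation cost that a JIT compiler avoids by translating once.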
The runtime system 108 of virtual machine 105 supports a general stack architecture. The manner in which this general stack architecture is supported by the underlying hardware 110 is determined by the particular virtual machine implementation, and reflected in the way the bytecodes are interpreted or JIT-compiled.
FIG. 2 illustrates runtime data areas which support the stack architecture within runtime system 108. In FIG. 2, runtime data areas 200 comprise one or more thread-based data areas 207. Each thread-based data area 207 comprises a program counter register (PC REG) 208, a local variables pointer register (VARS REG) 209, a frame register (FRAME REG) 210, an operand stack pointer register (OPTOP REG) 211, a stack 212 and, optionally, a native method stack 216. Stack 212 comprises one or more frames 213 which contain an operand stack 214 and local variables 215. Native method stack 216 comprises one or more native method frames 217.
A virtual machine can support many threads of execution at once. At any point, each thread is executing the code of a single method, the "current method" for that thread. If the "current method" is not a native method, program counter register 208 contains the address of the virtual machine instruction currently being executed. If the "current method" is a native method, the value of program counter register 208 is undefined. Frame register 210 points to the location of the current execution frame (containing the method parameters and local variables) in the execution stack 212 of thread 1.
Runtime data areas 200 are located in heap 201. Heap 201 is the runtime data area from which memory for all classes, class instances and arrays is allocated. Heap 201 includes class area 202 which contains one or more class-based data areas 203 for storing information extracted from each loaded class file. For example, class-based data area 203 may comprise class structures such as constant pool 204, field and method data 205, and code for methods and constructors 206.
Constant pool 204, field and method data 205 and code for methods and constructors 206 are examples of internal structures that are stored in memory for a class when it is loaded into the runtime of the virtual machine. Class area 202 is created when virtual machine 105 initializes. When a new class is loaded, memory is allocated for the internal structures associated with the class (e.g., constant pool 204, field and method data 205 and code for methods and constructors 206).
Typically, only one internal class structure is created for an object class. However, in some circumstances, such as instantiating an array, more than one internal class structure is created. An array is a structure that contains zero or more elements or components of the same type (e.g., integer, floating point, short, long, etc.). The component type is often referred to as the element type of the array. Thus, for example, where the array is composed of components of type integer, the array is said to be an integer array. The element type of the array can also be an object (class) such as a BankAccount object (that includes properties and behavior for a bank account) or a Thread object (that includes properties and behavior for a thread). In this case, an array is associated with two object classes: an array object class and an element type class.
In object-oriented programming languages, all objects must be instances of some class. For instance, all bank account objects are typically instances of class "Account". The same is true also of arrays, i.e., they must be instances of classes too. Since arrays do not usually have any corresponding source-level class definition, an internal array class is created automatically by the system. The array class stores the information that is shared between all the array instances of the same type, including the element type information.
In general, each array instance in an object-oriented program is associated with two class structures: the internal array class storing the shared information between array instances, and the element type class defining the type of the elements that can be stored in the array. Since the element type is the same for all the arrays of the same type, the element type information is commonly stored in the array class. FIG. 3A provides an illustration of internal structures created for two different types of arrays.
For example, an array of "Accounts" results in the creation of internal class structure 303B for the array of Accounts object 306 and the creation of internal class structure 303A for the "Account" element type object class 304 in memory area 302. Similarly, an array of "Threads" has two internal structures in memory area 302 (i.e., internal class structure 303C for the "Thread" element type object class and internal class structure 303D for the array of Threads).
The creation of the necessary internal structures for arrays usually takes place when an array is declared in program code. At that point, the element class will be loaded (if it has not already been loaded), the internal array class will be created, and an array instance referring to the array class is allocated.
It is important to notice that the role of an array object class is limited to such tasks as ensuring that there is no type mismatch (e.g., assigning data from an array of one type to an array of another type). In most programming languages, there is no ability to add new methods to the array object class. That is, the array object class cannot be modified to include functionality in addition to its type checking function. For this reason, the only really relevant information in array classes is the element type information. Otherwise, array classes are rather redundant structures.
FIG. 3B provides a more detailed illustration of the runtime state of a virtual machine as a result of an array instantiation. Memory is allocated to store the internal structures for array object class 322, parent object class 324 (e.g., "java.lang.Object" in the Java programming language), element type class 326 and class representation class 328 (e.g., "java.lang.Class" in the Java programming language). Array instance 320 is created as an instance of array object class 322 which is created by VM 105. Array instance 320 includes a field, class pointer 330, that points to array object class 322. Array instance 320 may include a locking field that is used to synchronize write operations, for example. A length field 334 identifies the number of elements in array instance 320. Elements 336 represents the elements of the array. Elements 336 can contain pointers to instances of element type class 326 that contain the actual values of the array.
Array object class 322 includes a pointer 342 that identifies the element type for array instances that are instantiated from array object class 322. That is, pointer 342 identifies the element type of array instance 320. Note that in some implementations of the VM the reference to the element type class is not necessarily a physical pointer; rather, the name of the array class can be used to indirectly indicate the element type. This is because the name of the array class is always the same as the name of the element type, except that the name has been augmented with the prefix "[" (see Lindholm, T. and Yellin, F., The Java Virtual Machine Specification, Addison-Wesley 1996, p. 146). In addition, array object class 322 may contain two other pointers 338 and 340. Pointer 338 refers to the parent class 344 (e.g., the "java.lang.Object") that is the common superclass of all array classes in the Java programming language. Pointer 340 points to the class representation class 328 (e.g., "java.lang.Class"). All classes in the Java programming language, including the array classes, are instances of the class representation class. For example, element type class 326 and class representation class 328 are instances of, and contain pointers to (i.e., pointers 346 and 348), class representation class 328.
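The relationships among an array class, its element type, its parent class and the class representation class can be observed from program code through the standard reflection methods of java.lang.Class. The following sketch is illustrative; the helper class and method names (ArrayClassInfo, arrayClassName, elementTypeOf, parentOf) are assumptions. On a standard Java runtime the reported name of an array class whose element type is a class takes the form "[L" followed by the element class name and ";", while primitive element types use a one-letter code such as "[I" for an integer array.

```java
// Illustrative helpers built only on standard java.lang.Class methods.
final class ArrayClassInfo {
    // Name of the automatically created array class.
    static String arrayClassName(Object array) {
        return array.getClass().getName();
    }

    // Element type recorded for the array class.
    static Class<?> elementTypeOf(Object array) {
        return array.getClass().getComponentType();
    }

    // Parent class of the array class.
    static Class<?> parentOf(Object array) {
        return array.getClass().getSuperclass();
    }

    public static void main(String[] args) {
        Thread[] threads = new Thread[2];
        Thread[] moreThreads = new Thread[7];
        int[][] grid = new int[2][3];

        System.out.println(arrayClassName(threads));           // [Ljava.lang.Thread;
        System.out.println(elementTypeOf(threads).getName());  // java.lang.Thread
        System.out.println(parentOf(threads).getName());       // java.lang.Object
        System.out.println(arrayClassName(grid));              // [[I

        // All array instances of the same type share one array class.
        System.out.println(threads.getClass() == moreThreads.getClass()); // true
    }
}
```

The final comparison shows that the array class holds information shared by all array instances of the same type, regardless of their lengths.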
As explained earlier, the main role of the array classes in the Java programming language and many other object-oriented programming systems is to serve as placeholders for shared element type information. Unlike regular classes, array classes do not contain any method or variable definitions, and therefore their constant pools, method tables and field tables (see FIG. 3A) are typically empty. In general, except for their role in type checking, array classes are rather redundant. They add overhead to programs due to memory used and the time needed to allocate and deallocate memory, for example. Some application programs use a number of arrays with different element types, or multidimensional arrays (i.e., arrays of arrays), and in such situations a separate array class has to be created for each type of array.
Memory usage can be very critical if only a limited amount of memory is available. For example, an embedded system such as a smart card or a personal organizer device has a limited amount of memory. It would be beneficial to reduce the number of internal structures needed, thereby reducing memory and power consumption.
Embodiments of the invention comprise a method and apparatus for avoiding array object class creation in, for example, platform-independent virtual machines for object-oriented programming languages. Embodiments of the invention reduce the internal structures created for arrays at runtime, thereby reducing memory consumption.
Unlike in a traditional implementation, where a separate array class is created for each array of different type, in an embodiment of the invention the type information is stored in array instances instead. Array classes are not created at all. Rather, the root class of the class hierarchy (e.g., "java.lang.Object") is used as the class of each array instance. When an array instance is instantiated, a reference to the root class is created in the class field of the array instance and the type information is stored in the instance itself.
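One possible arrangement of such an array instance is sketched below, modeled in the Java programming language purely for readability. The structure names (VmClass, VmArray) and field names are hypothetical, and an actual virtual machine would typically implement these structures natively; the sketch shows only that every array instance refers to the single root class while carrying its own element type information.

```java
// Hypothetical model of VM-internal structures; names are assumptions.
final class VmClass {
    final String name;

    VmClass(String name) { this.name = name; }
}

final class VmArray {
    // The one root class shared by all array instances; no array classes exist.
    static final VmClass ROOT = new VmClass("java.lang.Object");

    final VmClass classPointer = ROOT;  // class field refers to the root class
    final VmClass elementType;          // type information stored in the instance
    final Object[] elements;

    VmArray(VmClass elementType, int length) {
        this.elementType = elementType;
        this.elements = new Object[length];
    }

    // Type check performed from the instances themselves, without an array class.
    boolean sameArrayTypeAs(VmArray other) {
        return this.elementType == other.elementType;
    }
}
```

In this model, creating arrays of two different element types allocates no class structure beyond the two element type classes and the root class.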
The type information contained in the array instance can be used to identify the type of the array. In an embodiment that uses an integer type value, the value either maps to a known type or an unknown type. Where an unknown type is indicated by the value that is retrieved from the array instance, a method is provided for accessing an element of the array to determine the type of the array instance.
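The integer type value approach may be sketched as follows. The specific type codes, the class name IntTypedArray, and the strategy of inspecting the first non-null element are assumptions made for this illustration; this section does not specify how an element is used to determine the type.

```java
// Hypothetical sketch; the codes and the lookup strategy are assumptions.
final class IntTypedArray {
    static final int T_UNKNOWN = 0;  // type not among the known types
    static final int T_INT = 1;
    static final int T_LONG = 2;

    final int typeCode;       // integer type value stored in the instance
    final Object[] elements;

    IntTypedArray(int typeCode, int length) {
        this.typeCode = typeCode;
        this.elements = new Object[length];
    }

    // Maps the integer value to a known type, or falls back to an element.
    String typeName() {
        switch (typeCode) {
            case T_INT:
                return "int";
            case T_LONG:
                return "long";
            default:
                // Unknown type: access an element to determine the array's type.
                for (Object element : elements) {
                    if (element != null) {
                        return element.getClass().getName();
                    }
                }
                return "unknown";  // no element available to inspect
        }
    }
}
```

A limitation of this particular fallback is that an array of unknown type with no stored elements yields no type information, which is one motivation for keeping a reference in the type field instead.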
In an embodiment that uses a reference type field, the type field can either contain an integer type value for known types, or a reference to an element type object where the type is an unknown type. A method is provided for accessing the type field to determine whether the type information is an integer type or an object reference.
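A sketch of the reference type field approach follows. Here the field is modeled as a Java Object that holds either a boxed integer code or a java.lang.Class standing in for the element type object; a native implementation might instead distinguish the two cases by tagging the field's bits. The class name RefTypedArray, the code values and the use of java.lang.Class are all assumptions of this illustration.

```java
// Hypothetical sketch; the field encoding shown here is an assumption.
final class RefTypedArray {
    static final int T_INT = 1;   // hypothetical code for a known type
    static final int T_LONG = 2;  // hypothetical code for a known type

    // Either an Integer type value (known type) or a reference to the
    // element type object (modeled here as a Class<?>).
    final Object typeField;
    final Object[] elements;

    RefTypedArray(Object typeField, int length) {
        if (!(typeField instanceof Integer) && !(typeField instanceof Class)) {
            throw new IllegalArgumentException("type field must be a code or a type reference");
        }
        this.typeField = typeField;
        this.elements = new Object[length];
    }

    // Determines whether the type information is an integer type value.
    boolean hasIntegerType() {
        return typeField instanceof Integer;
    }

    String typeName() {
        if (hasIntegerType()) {
            int code = ((Integer) typeField).intValue();
            switch (code) {
                case T_INT:  return "int";
                case T_LONG: return "long";
                default:     return "code " + code;
            }
        }
        // Otherwise the field refers to the element type object.
        return ((Class<?>) typeField).getName();
    }
}
```

For example, new RefTypedArray(RefTypedArray.T_INT, 8) describes an array of a known type, while new RefTypedArray(Thread.class, 8) carries a reference to its element type and needs no element to be present.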