1. Field of the Invention
This invention relates to the field of computer systems, and, more specifically, to virtual machine runtime environments.
Solaris, Sun, Sun Microsystems, the Sun logo, Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries.
2. Background Art
The Java™ programming language, developed by Sun Microsystems®, has an advantage over other programming languages of being a "write once, run anywhere"™ language. The Java programming language provides a substantially platform-independent mechanism for applications, or "applets," to be designed, distributed and executed in the form of bytecode class files. The Java virtual machine handles the resolution of the bytecodes into the requisite platform-dependent instruction set, so that all computing platforms which contain a Java virtual machine are capable of executing the same bytecode class files. When functions are needed which are not supported by the Java programming language, a Java application executing within the virtual machine may invoke native code functions implemented in linked libraries. Native code is not subject to Java programming and execution restrictions, thus providing more platform-specific programmability at the cost of less well-controlled execution behavior. A processing environment for Java applications and applets, and the use of native code, are described more fully below.
The Processing Environment
The Java programming language is an object-oriented programming language with each program comprising one or more object classes and interfaces. Unlike many programming languages in which a program is compiled into machine-dependent, executable program code, classes written in the Java programming language are compiled into machine independent bytecode class files. Each class contains code and data in a platform-independent format called the class file format. The computer system acting as the execution vehicle contains a program called a virtual machine, which is responsible for executing the code in each class.
Applications may be designed as standalone Java applications, or as Java "applets" which are identified by an applet tag in an HTML (hypertext markup language) document, and loaded by a browser application. The class files associated with an application or applet may be stored on the local computing system, or on a server accessible over a network. Each class is loaded into the Java virtual machine, as needed, by the "class loader."
To provide a client with access to class files from a server on a network, a web server application is executed on the server to respond to HTTP (hypertext transfer protocol) requests containing URLs (uniform resource locators) to HTML documents, also referred to as "web pages." When a browser application executing on a client platform receives an HTML document (e.g., as a result of requesting an HTML document by forwarding a URL to the web server), the browser application parses the HTML and automatically initiates the download of the specified bytecode class files when it encounters an applet tag in the HTML document.
The classes of a Java applet are loaded on demand from the network (stored on a server), or from a local file system, when first referenced during the Java applet's execution. The virtual machine locates and loads each class file, parses the class file format, allocates memory for the class's various components, and links the class with other already loaded classes. This process makes the code in the class readily executable by the virtual machine.
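The on-demand behavior described above can be observed indirectly from Java code. The following is a minimal sketch, not part of the described system: the class names LazyLoadDemo, Renderer and Unused are hypothetical, and the sketch observes class initialization (the final step of bringing a class into use) rather than the loading of the class file itself, which a virtual machine implementation is free to perform earlier.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a class is brought into use only when it is first referenced.
final class LazyLoadDemo {
    // Records the order in which events occur.
    static final List<String> LOG = new ArrayList<>();

    // Hypothetical helper class; its static initializer runs at first use.
    static final class Renderer {
        static { LOG.add("Renderer initialized"); }
        static int draw() { return 42; }
    }

    // Never referenced below, so its static initializer never runs.
    static final class Unused {
        static { LOG.add("Unused initialized"); }
    }

    // Returns a copy of the event log for one run.
    static List<String> run() {
        LOG.clear();
        LOG.add("before first reference");
        Renderer.draw();               // first reference triggers initialization
        LOG.add("after first reference");
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On the first call to run(), the "Renderer initialized" entry appears between the two marker entries, and no entry for Unused is ever recorded.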
Java applications and applets often make use of class libraries. Classes in the class libraries may contain what are referred to as "native methods." Applications and applets may occasionally contain classes that have native methods as well. A native method specifies the keyword "native," the name of the method, the return type of the method, and any parameters that are passed to the method. In contrast to a "standard method" (i.e., non-native method) written in the Java programming language, there is no body to a native method within the respective class. Rather, the routines of a native method are carried out by compiled native code (e.g., code written in the C or C++ programming language and compiled into binary form) that is dynamically linked to a given class in the virtual machine at runtime using a linking facility specific to the given platform which supports linked libraries.
In the Solaris™ or UNIX environment, for example, the linked library containing the binary form of the native code may be implemented as a "shared object" library written as a ".so" file. In a Windows environment, the linked library may take the form of a dynamic linked (or dynamic loadable) library written as a ".dll" file. Native code may be used to perform functions otherwise not supported by the Java programming language, such as interfacing with specialized hardware (e.g., display hardware) or software (e.g., database drivers) of a given platform. Native code may also be used to speed up computationally intensive functions, such as rendering.
A class that contains a native method also contains a call to load the respective linked library:
System.loadLibrary("Sample");
where "Sample" is the name of the linked library, typically stored in a file named "libSample.so" or "Sample.dll", depending on the host operating system (e.g., UNIX, Windows, etc.). The linked library is typically loaded at the time the associated class is instantiated within the virtual machine.
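For illustration, a class with a native method might look like the sketch below. The class name SampleNative and the method name checksum are hypothetical. In practice the System.loadLibrary call is commonly placed in a static initializer of the class; here it is wrapped in a helper method that catches the link error, an assumption made only so that the sketch runs on a machine where no "Sample" library is installed.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Sketch of a class with a native method; all names are illustrative.
final class SampleNative {
    // Declared with the "native" keyword: name, return type and parameters, but no body.
    static native int checksum(byte[] data);

    // Tries to load the linked library (e.g., libSample.so or Sample.dll).
    // Returns false instead of failing when the library cannot be loaded.
    static boolean tryLoad() {
        try {
            System.loadLibrary("Sample");
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;
        }
    }

    // Reports whether the checksum method carries the native modifier.
    static boolean isDeclaredNative() {
        try {
            Method m = SampleNative.class.getDeclaredMethod("checksum", byte[].class);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("declared native: " + isDeclaredNative());
        System.out.println("library loaded:  " + tryLoad());
    }
}
```

Calling checksum without a successfully loaded library would raise an UnsatisfiedLinkError, which is why the sketch only inspects the declaration.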
The linked library of native code is compiled with stub and header information of the associated class to enable the linked library to recognize the method signature of the native method in the class. The implementation of the native method is then provided as a native code function (such as a C function) in the linked library. At runtime, when a call is made to the native method, control is passed to the function in the linked library that corresponds to the called method (e.g., via pushing of a native method frame onto the native method stack). The native code within the linked library performs the function and passes control back to the Java application or applet.
FIG. 1 illustrates the compile and runtime environments for a processing system. In the compile environment, a software developer creates source files 100 (e.g., in the Java programming language), which contain the programmer readable class definitions, including data structures, method implementations and references to other classes. Source files 100 are provided to Java compiler 101, which compiles source files 100 into compiled ".class" files 102 that contain bytecodes executable by a Java virtual machine. Bytecode class files 102 are stored (e.g., in temporary or permanent storage) on a server, and are available for download over a network. Alternatively, bytecode class files 102 may be stored locally in a directory on the client platform.
The Java runtime environment contains a Java virtual machine (JVM) 105 which is able to execute bytecode class files and execute native operating system ("O/S") calls to operating system 109 when necessary during execution. Java virtual machine 105 provides a level of abstraction between the machine independence of the bytecode classes and the machine-dependent instruction set of the underlying computer hardware 110, as well as the platform-dependent calls of operating system 109.
Class loader and bytecode verifier ("class loader") 103 is responsible for loading bytecode class files 102 and supporting class libraries 104 into Java virtual machine 105 as needed. Class loader 103 also verifies the bytecodes of each class file to maintain proper execution and enforcement of security rules. Within the context of runtime system 108, either an interpreter 106 executes the bytecodes directly, or a "just-in-time" (JIT) compiler 107 transforms the bytecodes into machine code, so that they can be executed by the processor (or processors) in hardware 110. Native code, e.g., in the form of a linked library 111, is loaded when a class (e.g., from class libraries 104) containing the associated native method is instantiated within the virtual machine.
The runtime system 108 of virtual machine 105 supports a general stack architecture. The manner in which this general stack architecture is supported by the underlying hardware 110 is determined by the particular virtual machine implementation, and reflected in the way the bytecodes are interpreted or JIT-compiled. Other elements of the runtime system include thread management (e.g., scheduling) and garbage collection mechanisms.
FIG. 2 illustrates runtime data areas which support the stack architecture within runtime system 108. In FIG. 2, runtime data areas 200 comprise one or more thread-based data areas 207. Each thread-based data area 207 comprises a program counter register (PC REG) 208, a local variables pointer register (VARS REG) 209, a frame register (FRAME REG) 210, an operand stack pointer register (OPTOP REG) 211, a stack 212 (e.g., for standard methods) and, optionally, a native method stack 216. Stack 212 comprises one or more frames 213 which contain an operand stack 214 and local variables 215. Native method stack 216 comprises one or more native method frames 217.
Runtime data areas 200 further comprise shared heap 201. Heap 201 is the runtime data area from which memory for all class instances and arrays is allocated. Shared heap 201 comprises method area 202, which is shared among all threads. Method area 202 comprises one or more class-based data areas 203 for storing information extracted from each loaded class file. For example, class-based data area 203 may comprise class structures such as constant pool 204, field and method data 205, and code for methods and constructors 206.
A virtual machine can support many threads of execution at once. Each thread has its own thread-based data area 207. At any point, each thread is executing the code of a single method, the "current method" for that thread. If the "current method" is not a native method, program counter register 208 contains the address of the virtual machine instruction currently being executed. If the "current method" is a native method, the value of program counter register 208 is undefined. Frame register 210 points to the location of the current method in method area 202.
Each thread has a private stack 212, created at the same time as the thread. Stack 212 stores one or more frames 213 associated with standard methods invoked by the thread. Frames 213 are used to store data and partial results, as well as to perform dynamic linking, return values for methods and dispatch exceptions. A new frame is created and pushed onto the stack each time a standard method is invoked, and an existing frame is popped from the stack and destroyed when its method completes. A frame that is created by a thread is local to that thread and typically cannot be directly referenced by any other thread.
Only one frame, the frame for the currently executing method, is active at any point in a given thread of control. This frame is referred to as the "current frame," and its method is known as the "current method." A frame ceases to be current if its method invokes another method or if its method completes. When a method is invoked, a new frame is created and becomes current when control transfers to the new method. On method return, the current frame passes back the results of its method invocation, if any, to the previous frame. The current frame is then discarded while the previous frame becomes the current one.
Each frame 213 has its own set of local variables 215 and its own operand stack 214. The local variables pointer register 209 contains a pointer to the base of an array of words containing local variables 215 of the current frame. The operand stack pointer register 211 points to the top of operand stack 214 of the current frame. Most virtual machine instructions take values from the operand stack of the current frame, operate on them, and return results to the same operand stack. Operand stack 214 is also used to pass arguments to methods and receive method results.
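The frame mechanics described above can be modeled in a few lines. The following toy sketch is not a virtual machine implementation: the names FrameStackDemo, Frame and invokeSub are hypothetical, and a single hard-coded "subtract" method stands in for bytecode. It shows a frame being pushed on invocation, arguments passing from the caller's operand stack into the callee's local variables, and the result returning to the caller's operand stack when the callee's frame is popped.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of one thread's stack of frames; not a real virtual machine.
final class FrameStackDemo {
    // One frame per method invocation: local variables plus an operand stack.
    static final class Frame {
        final int[] locals;
        final Deque<Integer> operands = new ArrayDeque<>();
        Frame(int localCount) { locals = new int[localCount]; }
    }

    // The thread's private stack; the top element is the current frame.
    private final Deque<Frame> frames = new ArrayDeque<>();

    int depth() { return frames.size(); }

    // Invokes a hypothetical two-argument "subtract" method.
    private void invokeSub(Frame caller) {
        Frame callee = new Frame(2);
        callee.locals[1] = caller.operands.pop(); // second argument was on top
        callee.locals[0] = caller.operands.pop();
        frames.push(callee);                      // callee becomes the current frame

        // Method body: load both locals, subtract, leave the result on the operand stack.
        callee.operands.push(callee.locals[0]);
        callee.operands.push(callee.locals[1]);
        int b = callee.operands.pop();
        int a = callee.operands.pop();
        callee.operands.push(a - b);

        // Return: pop the callee's frame and hand the result to the previous frame.
        int result = callee.operands.pop();
        frames.pop();
        caller.operands.push(result);
    }

    // Computes x - y through the modeled call sequence.
    int run(int x, int y) {
        Frame caller = new Frame(0);
        frames.push(caller);
        caller.operands.push(x);
        caller.operands.push(y);
        invokeSub(caller);
        int result = caller.operands.pop();
        frames.pop();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(new FrameStackDemo().run(7, 3));
    }
}
```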
Native method stack 216 stores native method frames 217 in support of native methods. Each native method frame provides a mechanism for thread execution control, method arguments and method results to be passed between standard methods and native methods implemented as native code functions in a linked library.
Because native methods are implemented by native code within a linked library rather than as a standard method in a class, native methods are not subject to the restrictions imposed by the Java programming language and the bytecode verifier. This means that, unlike bytecodes for compiled Java applications and applets, native code in a linked library may be prone to undesired and illegal behavior that proceeds unchecked at runtime. For example, memory access errors may take place in the native code due to the occurrence of "wild" pointers (e.g., a pointer whose value exceeds a prescribed range, such as a pointer to the ninth element of an eight element array) and the use of memory access mechanisms that may address inappropriate (i.e., restricted or out-of-bounds) memory locations. The use of native methods therefore makes possible a range of programming bugs, mostly based on the use of pointers, that make debugging a particular virtual machine implementation more difficult.
Further, the native code may include blocking system calls (e.g., calls that may wait an unspecified length of time for an external event to occur). If a virtual machine implements its own thread management and scheduling, a blocking system call occurring when control has been passed to a native code function in a linked library can block the execution of the entire virtual machine.
Most virtual machine implementations avoid the blocking problems associated with native code by using "native threading." This means that multiple threads of the virtual machine and the program or programs (e.g., applications and/or applets) the virtual machine is executing are implemented as threads of the underlying platform, e.g., as UNIX threads. In this scheme, the threads of the virtual machine may execute concurrently. However, if native threading is used, the virtual machine must cede control over thread scheduling to the underlying operating system. Native threading thus causes thread behavior to be operating system and hardware-dependent. Effective debugging of concurrency-related bugs in a virtual machine implementation becomes problematic because, with native threading, the relative timing of thread execution may vary across different operating systems and hardware platforms.
FIGS. 3A and 3B are block diagrams that illustrate thread use in runtime environments. FIG. 3A contains a virtual machine that does not use native threading. FIG. 3B contains a virtual machine that does use native threading.
In FIG. 3A, operating system 109 runs on top of hardware 110, and virtual machine 105 runs on top of operating system 109. Executing within virtual machine 105 are multiple applications and/or applets, such as applet1 (300) and applet2 (301). Applet1 and applet2 may each comprise one or more bytecode class files. A linked library (LIB) 302 is associated with applet2 to support native methods. Library 302 is loaded and linked at the time the class of applet2 that contains the associated native methods is instantiated within virtual machine 105. The native code of library 302 runs directly on top of operating system 109, which supports the library linking facility, and hardware 110.
Multiple threads of execution are handled within virtual machine 105. For example, applet1 may have two threads, T1 and T2; applet2 may have two threads, T5 and T6; and the virtual machine itself may have two threads, T3 and T4, that carry out processes of the virtual machine, such as garbage collection. Threads T1-T6 are managed and scheduled by VM thread scheduler 303 within virtual machine 105. VM thread scheduler 303 selects, based on priorities and time-slicing methods for example, which thread of the group T1-T6 is to be the currently executing thread of the virtual machine, TVM, at the operating system level.
Java virtual machines typically support "cooperative scheduling" wherein executing threads yield processing resources to other threads at certain intervals, or when there is likely to be a delay associated with execution of the current thread. For example, a higher priority thread may take advantage of a yield operation to preempt the current thread. Yielding of processor resources need not be explicitly programmed in standard methods. The virtual machine may insert yields into the interpreting process or into the compiled code at suitable points in execution, such as at method calls and within loops (e.g., at backward branches), to implement cooperative scheduling.
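Cooperative scheduling of this kind can be sketched with a simple round-robin loop. In the sketch below, which is an illustration only and not the scheduler of any particular virtual machine, each hypothetical Task models a thread running a loop, and each call to step() runs the thread up to its next yield point (the backward branch at the end of one loop iteration). The names CooperativeSchedulerDemo, Task and counter are assumptions, and priorities are omitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Sketch of cooperative scheduling: tasks run until they reach a yield point.
final class CooperativeSchedulerDemo {
    // A hypothetical VM-level thread. step() runs to the next yield point
    // and returns true if the task still has work left.
    interface Task {
        boolean step(List<String> trace);
    }

    // A task that loops a fixed number of times, yielding at each backward branch.
    static Task counter(String name, int iterations) {
        return new Task() {
            private int i = 0;

            @Override
            public boolean step(List<String> trace) {
                trace.add(name + ":" + i);  // the loop body
                i++;
                return i < iterations;      // yield point at the backward branch
            }
        };
    }

    // Round-robin scheduler: exactly one task is current at any time.
    static List<String> run(List<Task> tasks) {
        List<String> trace = new ArrayList<>();
        Deque<Task> ready = new ArrayDeque<>(tasks);
        while (!ready.isEmpty()) {
            Task current = ready.poll();
            if (current.step(trace)) {
                ready.add(current);         // yielded with work remaining
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(counter("T1", 2), counter("T2", 3))));
    }
}
```

If a step() call never returned, for instance because it was waiting on an external event, no other task in the ready queue would run, which is the weakness of this scheme when control passes to code that does not yield.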
Operating system 109 may serve many threads at any one time, including the selected virtual machine thread TVM. For example, operating system 109 may contain threads TA-TZ supporting other applications or other processes of the operating system. OS thread scheduler 304 determines which thread from the group TA-TZ and TVM is to be executed by the underlying hardware 110 at any given time. If hardware 110 supports multiple processors, multiple threads may be scheduled by OS thread scheduler 304 to execute simultaneously on different processors.
In the implementation of FIG. 3A, a virtual machine thread (e.g., T1-T6) may transfer execution control to a linked library (e.g., LIB 302) to perform a function for a native method, e.g., thread T6 may invoke a native method of applet2 that is supported by native code in library 302, as shown. Thread T6 is able to pass control over to library 302 because thread T6 is currently being passed through to operating system 109 as virtual machine thread TVM. Other threads of the virtual machine must wait for thread T6 to yield in accordance with cooperative scheduling.
However, the transfer of control to library 302 can give rise to virtual machine execution problems. Classes executing in the virtual machine typically call only methods of other classes, and do not, as a rule, make calls directly to the system. Native code, however, depending on its function, can make frequent system calls that block. Because the native code is executed independently as compiled code in a linked library, the virtual machine interpreter and compiler are bypassed, and cannot enforce cooperative scheduling until control is returned to a standard method. The virtual machine must therefore rely on the native code programmer to provide explicit yield() calls in the native code.
If the native code of library 302 makes a blocking system call, such as an I/O call to download a file, thread T6 within the virtual machine, and thus thread TVM at the operating system level, will block until the system call is completed, e.g., until the downloading is finished. The entire virtual machine execution is also blocked for the duration of the system call, as execution control is maintained by the native code of library 302. As blocking system calls may take a relatively long time to complete, it is undesirable for all threads of virtual machine 105 to be blocked as well. The performance of applet1, applet2 and virtual machine 105 may be diminished by blocking system calls of library 302. For this reason, many virtual machine implementations use native threading as shown in FIG. 3B.
In FIG. 3B, VM thread scheduler 303 implements multiple threads of the virtual machine as threads at the operating system level. These threads are labeled as threads TVM1-TVMn. VM thread scheduler 303 determines which virtual machine threads (T1-T6) are passed through to operating system 109 as OS threads TVM1-TVMn at any given time. In the extreme case where each thread of virtual machine 105 is implemented as an individual thread of the underlying operating system 109, virtual machine 105 may forgo implementing VM thread scheduler 303, and may rely completely on OS thread scheduler 304 for thread scheduling.
The implementation of FIG. 3B permits multiple threads to be concurrently active in virtual machine 105. This means that a blocking system call by the native code of library 302 does not result in a complete block of virtual machine 105. Rather, one thread of the group TVM1-TVMn, the thread that passed control to library 302 (i.e., the operating system thread corresponding to virtual machine thread T6), is blocked, but the remainder of threads TVM1-TVMn are free to execute.
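The effect described above can be demonstrated on a virtual machine whose java.lang.Thread instances are backed by operating system threads, which is assumed here. In the following sketch, waiting on a latch stands in for a blocking system call made by native code; no actual native library is involved, and the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch: with native threading, one blocked thread does not stall the others.
final class NativeThreadBlockingDemo {
    // Returns true if a second thread completed while the first was still blocked.
    static boolean workerRunsWhileOtherBlocks() {
        CountDownLatch externalEvent = new CountDownLatch(1); // stands in for the awaited event
        CountDownLatch workerDone = new CountDownLatch(1);

        // Plays the role of the thread that passed control to the library and blocked.
        Thread blocked = new Thread(() -> {
            try {
                externalEvent.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Plays the role of any other thread of the virtual machine.
        Thread worker = new Thread(workerDone::countDown);

        blocked.start();
        worker.start();
        try {
            boolean finished = workerDone.await(5, TimeUnit.SECONDS);
            boolean stillBlocked = blocked.isAlive();
            externalEvent.countDown();   // the external event finally occurs
            blocked.join();
            return finished && stillBlocked;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            externalEvent.countDown();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("worker ran while other thread blocked: "
                + workerRunsWhileOtherBlocks());
    }
}
```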
However, by implementing multiple threads of the virtual machine as OS or native threads, virtual machine 105 effectively cedes control over the scheduling of the threads in the virtual machine from VM thread scheduler 303 to OS thread scheduler 304. Synchronization errors may occur between threads of the virtual machine due to the relative lack of control exerted by the VM thread scheduler 303. To complicate matters, due to the reliance of native threading upon OS thread scheduler 304, synchronization errors may not occur, or may occur in a different manner, when virtual machine 105 and applet1 and applet2 are executed on a different operating system 109 and/or different hardware 110 having different timing parameters and scheduling processes. Thus, errors may not be easily repeatable, and debugging of the system is made more complicated.
Object-Oriented Programming
A general description of object-oriented programming principles is provided below for reference purposes. Object-oriented programming is a method of creating computer programs by combining certain fundamental building blocks, and creating relationships among and between the building blocks. The building blocks in object-oriented programming systems are called "objects." An object is a programming unit that groups together a data structure (one or more instance variables) and the operations (methods) that can use or affect that data. Thus, an object consists of data and one or more operations or procedures that can be performed on that data. The joining of data and operations into a unitary building block is called "encapsulation."
An object can be instructed to perform one of its methods when it receives a "message." A message is a command or instruction sent to the object to execute a certain method. A message consists of a method selection (e.g., method name) and zero or more arguments. A message tells the receiving object what operations to perform.
One advantage of object-oriented programming is the way in which methods are invoked. When a message is sent to an object, it is not necessary for the message to instruct the object how to perform a certain method. It is only necessary to request that the object execute the method. This greatly simplifies program development.
Object-oriented programming languages are predominantly based on a "class" scheme. An example of a class-based object-oriented programming scheme is generally described in "Smalltalk-80: The Language," by Adele Goldberg and David Robson, published by Addison-Wesley Publishing Company, 1989.
A class defines a type of object that typically includes both fields (e.g., variables) and methods for the class. An object class is used to create a particular instance of an object. An instance of an object class includes the variables and methods defined for the class. Multiple instances of the same class can be created from an object class. Each instance that is created from the object class is said to be of the same type or class.
To illustrate, an employee object class can include "name" and "salary" instance variables and a "set_salary" method. Instances of the employee object class can be created, or instantiated, for each employee in an organization. Each object instance is said to be of type "employee." Each employee object instance includes "name" and "salary" instance variables and the "set_salary" method. The values associated with the "name" and "salary" variables in each employee object instance contain the name and salary of an employee in the organization. A message can be sent to an employee's employee object instance to invoke the "set_salary" method to modify the employee's salary (i.e., the value associated with the "salary" variable in the employee's employee object).
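In the Java programming language, the employee example might be sketched as follows. The method is named setSalary rather than "set_salary" to follow common Java naming conventions, and the integer salary type and the accessor methods are assumptions added for the illustration.

```java
// Sketch of the employee object class described above.
final class Employee {
    // Instance variables: each Employee instance has its own copy.
    private final String name;
    private int salary;

    Employee(String name, int salary) {
        this.name = name;
        this.salary = salary;
    }

    String getName() { return name; }

    int getSalary() { return salary; }

    // Corresponds to the "set_salary" method: modifies this instance's salary.
    void setSalary(int newSalary) { this.salary = newSalary; }
}

final class EmployeeDemo {
    public static void main(String[] args) {
        // Two instances of the same class, each with its own data.
        Employee first = new Employee("Ada", 5000);
        Employee second = new Employee("Grace", 6000);

        // "Sending a message": invoking a method on one particular instance.
        first.setSalary(5500);

        System.out.println(first.getName() + " " + first.getSalary());
        System.out.println(second.getName() + " " + second.getSalary());
    }
}
```

Invoking setSalary on one instance leaves the salary stored in the other instance unchanged.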
A hierarchy of classes can be defined such that an object class definition has one or more subclasses. A subclass inherits its parent's (and grandparent's etc.) definition. Each subclass in the hierarchy may add to or modify the behavior specified by its parent class. Some object-oriented programming languages support multiple inheritance where a subclass may inherit a class definition from more than one parent class. Other programming languages, such as the Java programming language, support only single inheritance, where a subclass is limited to inheriting the class definition of only one parent class. The Java programming language also provides a mechanism known as an "interface" which comprises a set of constant and abstract method declarations. An object class can implement the abstract methods defined in an interface. Both single and multiple inheritance are available to an interface. That is, an interface can inherit an interface definition from more than one parent interface.
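The distinction between single inheritance of classes and multiple inheritance of interfaces can be sketched as follows. All type names (HasName, HasSalary, StaffRecord, StaffMember, StaffManager) are hypothetical and chosen only for this illustration.

```java
// Two independent interfaces, each declaring an abstract method.
interface HasName {
    String name();
}

interface HasSalary {
    int MONTHS_PER_YEAR = 12;   // a constant declaration in an interface
    int monthlySalary();
}

// An interface may inherit from more than one parent interface.
interface StaffRecord extends HasName, HasSalary {
}

// A class implements the abstract methods declared in the interfaces.
class StaffMember implements StaffRecord {
    private final String name;
    private final int monthlySalary;

    StaffMember(String name, int monthlySalary) {
        this.name = name;
        this.monthlySalary = monthlySalary;
    }

    @Override
    public String name() { return name; }

    @Override
    public int monthlySalary() { return monthlySalary; }

    int annualPay() { return monthlySalary() * MONTHS_PER_YEAR; }
}

// A subclass has exactly one parent class and may modify inherited behavior.
class StaffManager extends StaffMember {
    private final int monthlyBonus;

    StaffManager(String name, int monthlySalary, int monthlyBonus) {
        super(name, monthlySalary);
        this.monthlyBonus = monthlyBonus;
    }

    @Override
    public int monthlySalary() { return super.monthlySalary() + monthlyBonus; }
}

final class InheritanceDemo {
    public static void main(String[] args) {
        StaffMember member = new StaffMember("Ada", 1000);
        StaffMember manager = new StaffManager("Grace", 1000, 200);
        System.out.println(member.annualPay());
        System.out.println(manager.annualPay());
    }
}
```

The inherited annualPay method picks up the subclass's overridden monthlySalary, so the manager's annual pay includes the bonus without annualPay being redefined.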
An object is a generic term that is used in the object-oriented programming environment to refer to a module that contains related code and variables. A software application can be written using an object-oriented programming language whereby the program's functionality is implemented using objects.