A virtual machine is software that acts as an interface between a computer program that has been compiled into instructions understood by the virtual machine and the microprocessor (or “hardware platform”) that actually performs the program's instructions. Once a virtual machine has been provided for a platform, any program compiled for that virtual machine can run on that platform.
One popular virtual machine is known as the Java virtual machine (VM). The Java virtual machine specification defines an abstract rather than a real “machine” (or processor) and specifies an instruction set, a set of registers, a stack, a “garbage-collected heap,” and a method area. The real implementation of this abstract or logically defined processor can be in other code that is recognized by the real processor or be built into the microchip processor itself.
The output of “compiling” a Java source program (a set of Java language statements) is called bytecode. A Java virtual machine can either interpret the bytecode one instruction at a time (mapping each instruction to one or more real microprocessor instructions) or compile the bytecode further for the real microprocessor using what is called a just-in-time (JIT) compiler.
The Java programming language supports multi-threading, and therefore Java virtual machines must incorporate multi-threading capabilities. Multi-threaded computing environments allow different parts of a program, known as threads, to execute simultaneously. In recent years, multi-threaded computing environments have become more popular because of the favorable performance characteristics provided by multi-threaded applications.
Compared to the execution of processes in a multiprocessing environment, the execution of threads may be started and stopped very quickly because there is less run-time state to save and restore. The ability to quickly switch between threads can provide a relatively high level of data concurrency. In the context of a multi-threaded environment, data concurrency refers to the ability for multiple threads to concurrently access the same data. When the multi-threaded environment is a multi-processor system, each thread may be executed on a separate processor, thus allowing multiple threads to access shared data simultaneously.
Java is gaining acceptance as a language for enterprise computing. In an enterprise environment, the Java programs may run as part of a large-scale server to which many users have concurrent access. A Java virtual machine with multi-threading capabilities may spawn or destroy threads as necessary to handle the current workload. For example, a multi-threading Java virtual machine may be executing a first Java program in a first thread. While the first Java program is executing, the server may receive a request to execute a second Java program. Under these circumstances, the server may respond to the request by causing the Java virtual machine to spawn a second thread for executing the second Java program.
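By way of illustration only, the spawning of a second thread in response to a second request can be sketched as follows. The class name ProgramLauncher, its methods, and the modeling of each "Java program" as a Runnable are assumptions made for this sketch and do not appear in the description above; a real server would also load and isolate the requested program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: each "program" is modeled as a Runnable, and the
// server starts one new thread for every request it receives.
final class ProgramLauncher {
    // Names of programs that have finished; several threads append to it,
    // so a thread-safe queue is used.
    private final ConcurrentLinkedQueue<String> completed = new ConcurrentLinkedQueue<>();
    // Threads spawned so far; only touched by the thread that calls launch().
    private final List<Thread> threads = new ArrayList<>();

    // Spawns a new thread that executes the given program.
    Thread launch(String name, Runnable program) {
        Thread t = new Thread(() -> {
            program.run();
            completed.add(name);
        }, name);
        threads.add(t);
        t.start();
        return t;
    }

    // Waits for every spawned thread to finish and returns the names of the
    // programs that ran to completion.
    List<String> awaitAll() {
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new ArrayList<>(completed);
    }

    public static void main(String[] args) {
        ProgramLauncher server = new ProgramLauncher();
        // The first program is already executing in its own thread ...
        server.launch("first-program", () -> System.out.println("first program running"));
        // ... when a request for a second program arrives and gets a second thread.
        server.launch("second-program", () -> System.out.println("second program running"));
        System.out.println("completed: " + server.awaitAll().size());
    }
}
```

Both programs run inside the same virtual machine process, which is what allows them to be started cheaply and also what exposes them to each other's shared state.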
Despite the favorable performance characteristics provided by multi-threaded computing environments, they are not without their disadvantages. Specifically, in multi-threaded applications, maintaining the integrity of data structures and variables can be particularly challenging since more than one thread can access the same data simultaneously. Unlike processes in multiprocessing environments, threads typically share a single address space and a set of global variables and are primarily distinguished by the value of their program counters and stack pointers. Consequently, the state of some commonly accessible data can be undergoing a change by one thread at the same time that it is being read by another thread, thus making the data unreliable.
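The integrity problem described above can be illustrated with a minimal sketch in which several threads update the same data. The class SharedCounterDemo and its fields are hypothetical names chosen for this example. The plain int field is incremented without coordination, so updates may be lost, whereas the AtomicInteger performs each increment atomically.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of unreliable shared data in a multi-threaded program.
final class SharedCounterDemo {
    private int unsafeCount = 0;                                  // plain field: increments can be lost
    private final AtomicInteger safeCount = new AtomicInteger();  // atomic read-modify-write

    // Starts `threads` threads; each increments both counters `perThread` times.
    void run(int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    unsafeCount++;                // read, add, write: another thread may interleave
                    safeCount.incrementAndGet();  // performed as a single atomic step
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    int unsafeCount() { return unsafeCount; }
    int safeCount() { return safeCount.get(); }

    public static void main(String[] args) {
        SharedCounterDemo demo = new SharedCounterDemo();
        demo.run(4, 100_000);
        // The unsynchronized total varies from run to run and may fall short of 400000.
        System.out.println("unsynchronized: " + demo.unsafeCount());
        System.out.println("atomic:         " + demo.safeCount());
    }
}
```

The shortfall in the unsynchronized counter, when it occurs, is a direct consequence of two threads reading the same old value before either writes its result back.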
Typically, servers that incorporate multi-threading Java virtual machines are configured to spawn a separate thread for each user session. For example, a web server may execute a thread that listens for a connection to be established (e.g. an HTTP request to arrive) through a particular port. When a connection is established, the listening thread passes the connection to another thread. The selected thread services the request, sends any results of the service back to the client, and blocks again, awaiting another connection. Alternatively, each socket through which a connection may be established may be assigned to a specific thread, and all connections made through a given socket are serviced by the associated thread.
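The listening-thread arrangement described above can be sketched roughly as follows. This is a simplified illustration rather than any particular server's implementation: the class HandoffServer, the one-line request and reply format, and the use of a cached thread pool for the worker threads are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: one thread listens for connections and passes each
// connection to a worker thread, which services it and then becomes idle again.
final class HandoffServer implements AutoCloseable {
    private final ServerSocket serverSocket;
    private final ExecutorService workers = Executors.newCachedThreadPool();
    private final Thread listener;

    private HandoffServer(ServerSocket serverSocket) {
        this.serverSocket = serverSocket;
        this.listener = new Thread(this::acceptLoop, "listener");
    }

    // Opens a server socket on a free port chosen by the OS and starts listening.
    static HandoffServer start() {
        try {
            HandoffServer server = new HandoffServer(new ServerSocket(0));
            server.listener.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int port() { return serverSocket.getLocalPort(); }

    private void acceptLoop() {
        while (!serverSocket.isClosed()) {
            try {
                Socket connection = serverSocket.accept();  // blocks until a client connects
                workers.submit(() -> serve(connection));    // pass the connection to another thread
            } catch (IOException e) {
                // accept() fails once the socket is closed; the loop condition then ends the loop
            }
        }
    }

    // Services one request: reads a line and sends a one-line result back to the client.
    private static void serve(Socket connection) {
        try (Socket c = connection;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(c.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(c.getOutputStream(), StandardCharsets.UTF_8), true)) {
            out.println("handled: " + in.readLine());
        } catch (IOException e) {
            // a failed connection affects only this request
        }
    }

    // Minimal client used to exercise the server in this sketch.
    static String request(int port, String line) {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8))) {
            out.println(line);
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            serverSocket.close();   // unblocks accept() in the listening thread
            listener.join();        // wait for the listener before stopping the workers
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        try (HandoffServer server = HandoffServer.start()) {
            System.out.println(request(server.port(), "hello"));
        }
    }
}
```

In this sketch every worker thread runs inside the same virtual machine as the listener, so all connections share that virtual machine's state.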
Because the threads execute within the same Java virtual machine, the user sessions share the state information required by the virtual machine. Such state information includes, for example, the bytecode for all of the system classes. While such state sharing tends to reduce the resource overhead required to concurrently service the requests, it presents reliability and security problems. Specifically, the bytecode being executed for a first user in a first thread has access to information and resources that are shared with the bytecode being executed for a second user in a second thread. If either thread modifies or corrupts the shared information, or monopolizes the resources, the integrity of the other thread may be compromised.
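As one concrete illustration of state shared across sessions, system properties are global to a Java virtual machine, so a value set by code running in one thread is visible to code running in every other thread. The sketch below uses a hypothetical property key and class name, and runs the two "sessions" one after the other so that the outcome is deterministic.

```java
// Hypothetical sketch: two sessions in the same virtual machine observe the
// same VM-wide state, here a system property.
final class SharedVmStateDemo {
    // Property key invented for this example.
    static final String KEY = "demo.shared.setting";

    // Runs a first session that modifies shared state, then a second session
    // that reads it, and returns what the second session observed.
    static String runSessions() {
        String[] seenBySecond = new String[1];
        Thread first = new Thread(() -> System.setProperty(KEY, "changed-by-first-session"));
        Thread second = new Thread(() -> seenBySecond[0] = System.getProperty(KEY));
        try {
            first.start();
            first.join();    // the first session finishes before the second starts
            second.start();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            System.clearProperty(KEY);  // restore the shared state for later code
        }
        return seenBySecond[0];
    }

    public static void main(String[] args) {
        System.out.println("second session saw: " + runSessions());
    }
}
```

The second session never set the property, yet it observes the first session's value, which is the kind of cross-session interference the server provider cannot rule out when sessions share a virtual machine.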
To avoid such problems, the designer of a Java program that is going to be used in a multi-user server environment where each session is assigned a separate thread must implement the Java program in a way that avoids altering shared state and avoids conflicts over resources with other Java programs executing in other threads within the same virtual machine. Unfortunately, the provider of the server may have little control over how Java program designers implement their Java programs.
The thread-per-session nature of the server also complicates the task of garbage collection. For example, the threads associated with many sessions may be actively performing operations that create and consume resources, while the thread associated with another session may be trying to perform garbage collection within the same pool of resources. Garbage collection by one thread degrades the performance of the threads that are still working to such a degree that many implementations avoid this situation altogether by synchronizing the performance of garbage collection among all the threads. However, synchronizing the performance of garbage collection also has a negative impact on the performance of the server, because all threads must cease working at the same time in order for the garbage collection to be performed.
Further, garbage collection in a thread-per-session environment is complicated by the fact that one of the sessions may encounter problems that cause the session to stall. The resources allocated for that session are not garbage collected because “live” pointers to those resources are not distinguishable from the live pointers associated with threads that have not stalled.
In a thread-per-session environment, threads share access to modifiable data items. To maintain the integrity of such data items, a memory manager is used to serialize access. Unfortunately, the memory manager within a thread-per-session system can very quickly become a bottleneck. In general, the larger the number of threads, the greater the contention between the threads for control of resources managed by the memory manager. Consequently, conventional VM servers that are implemented using a thread-per-session approach tend to suffer severe performance penalties as the number of sessions grows beyond a certain threshold.
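The serialization performed by such a memory manager can be approximated by a single lock guarding a shared allocation pointer. The sketch below is an assumption-laden simplification: the class SerializedAllocator and its bump-pointer scheme are invented for illustration and are not taken from any actual virtual machine. Because every allocation must acquire the same lock, only one thread can allocate at a time, no matter how many sessions are active.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of a memory manager that serializes access to shared state.
final class SerializedAllocator {
    private final ReentrantLock lock = new ReentrantLock();
    private long nextOffset = 0;   // shared, modifiable data guarded by the lock

    // Reserves `size` bytes and returns the starting offset of the reservation.
    long allocate(int size) {
        lock.lock();               // every thread queues here: the potential bottleneck
        try {
            long offset = nextOffset;
            nextOffset += size;
            return offset;
        } finally {
            lock.unlock();
        }
    }

    long bytesAllocated() {
        lock.lock();
        try {
            return nextOffset;
        } finally {
            lock.unlock();
        }
    }

    // Has `threads` threads each perform `perThread` allocations of `size` bytes
    // and returns the set of distinct offsets that were handed out.
    static Set<Long> allocateConcurrently(SerializedAllocator allocator,
                                          int threads, int perThread, int size) {
        Set<Long> offsets = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    offsets.add(allocator.allocate(size));
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        SerializedAllocator allocator = new SerializedAllocator();
        Set<Long> offsets = allocateConcurrently(allocator, 8, 1_000, 16);
        System.out.println("distinct offsets: " + offsets.size());
        System.out.println("bytes allocated:  " + allocator.bytesAllocated());
    }
}
```

The lock keeps the allocations from overlapping, but it also means that adding threads adds waiters rather than throughput, which is consistent with the contention described above.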
Based on the foregoing, it is clearly desirable to implement a server that allows multiple users to concurrently execute Java programs within sessions established with a server, but which does not rely on the Java program designers to implement their programs in such a way as to make them safe in a multi-threading environment. It is further desirable to provide a scalable server that avoids the performance penalties associated with threads competing for resources.