1. Field of the Invention
The present invention relates to computer communication, and deals more particularly with a technique, system, and computer program for enhancing performance of the computers responsible for one side of the communications in a scalable, clustered network environment. This is done by creating a pool of virtual machines, enabling the applications running on the computer to be more scalable, manageable, reliable, and secure, and to run faster.
2. Description of the Related Art
The Internet is a vast collection of computing resources, interconnected as a network, from sites around the world. It is used every day by millions of people. The World Wide Web (referred to herein as the "Web") is that portion of the Internet which uses the HyperText Transfer Protocol ("HTTP") as a protocol for exchanging messages. (Alternatively, the "HTTPS" protocol can be used, where this protocol is a security-enhanced version of HTTP.)
A user of the Internet typically accesses and uses the Internet by establishing a network connection through the services of an Internet Service Provider (ISP). An ISP provides computer users the ability to dial a telephone number using their computer modem (or other connection facility, such as satellite transmission), thereby establishing a connection to a remote computer owned or managed by the ISP. This remote computer then makes services available to the user's computer. Typical services include: providing a search facility to search throughout the interconnected computers of the Internet for items of interest to the user; a browse capability, for displaying information located with the search facility; and an electronic mail facility, with which the user can send and receive mail messages from other computer users.
The user working in a Web environment will have software running on his computer to allow him to create and send requests for information, and to see the results. These functions are typically combined in what is referred to as a "Web browser", or "browser". After the user has created his request using the browser, the request message is sent out into the Internet for processing. The target of the request message is one of the interconnected computers in the Internet network. That computer will receive the message, attempt to find the data satisfying the user's request, format that data for display with the user's browser, and return the formatted response to the browser software running on the user's computer.
This is an example of a client-server model of computing, where the machine at which the user requests information is referred to as the client, and the computer that locates the information and returns it to the client is the server. In the Web environment, the server is referred to as a "Web server". The client-server model may be extended to what is referred to as a "three-tier architecture". This architecture places the Web server in the middle tier, where the added tier typically represents data repositories of information that may be accessed by the Web server as part of the task of processing the client's request. This three-tiered architecture recognizes the fact that many client requests do not simply require the location and return of static data, but require an application program to perform processing of the client's request in order to dynamically create the data to be returned. In this architecture, the Web server may equivalently be referred to as an "application server".
The Java programming language is gaining wide acceptance for writing Web applications, as it is a robust, portable, object-oriented language defined specifically for the Web environment. ("Java" is a trademark of Sun Microsystems, Inc.) Java attains its portability through use of a specially-designed virtual machine ("VM"). This virtual machine is also referred to as a "Java Virtual Machine", or "JVM". In this context, the purpose of the virtual machine is to enable isolation of the details of the underlying hardware from the compiler used to compile the Java programming instructions. Those details are supplied by the implementation of the virtual machine, and include such things as whether little Endian or big Endian format is used for storing compiled instructions, and the length of an instruction once it is compiled. Because these machine-dependent details are not reflected in the compiled code, the code can be transported to a different environment (a different hardware machine, a different operating system, etc.), and executed in that environment without requiring the code to be changed or recompiled. The compiled code, referred to as Java "bytecode", then runs on top of a JVM, where the JVM is tailored to that specific environment. As an example of this tailoring of the JVM, if the bytecode is created using little Endian format but is to run on a microprocessor expecting big Endian, then the JVM would be responsible for converting the instructions from the bytecode before passing them to the microprocessor.
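The byte-order difference mentioned above can be illustrated with a minimal sketch. It shows only how the same four bytes yield different integer values under little Endian and big Endian interpretation; it is not a description of how any particular JVM processes bytecode. The class name EndianDemo and its method names are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of any JVM implementation.
final class EndianDemo {

    /** Interprets four bytes as an int stored least-significant byte first. */
    static int fromLittleEndian(byte[] fourBytes) {
        return ByteBuffer.wrap(fourBytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    /** Interprets the same four bytes as an int stored most-significant byte first. */
    static int fromBigEndian(byte[] fourBytes) {
        return ByteBuffer.wrap(fourBytes).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] raw = {0x01, 0x00, 0x00, 0x00};
        System.out.println(fromLittleEndian(raw)); // prints 1
        System.out.println(fromBigEndian(raw));    // prints 16777216 (0x01000000)
    }
}
```

Swapping the byte order of one interpretation gives the other, which is the kind of conversion the paragraph above attributes to a tailored virtual machine.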
A Web server that implements a Java Virtual Machine can be functionally extended using Java "servlets". A servlet is a relatively small executable code object that can be dynamically plugged in, or added, to the code running on the server. Servlets typically perform some specialized function, which can be invoked by the server (or by another servlet) to extend its own functionality. The servlet processes the request, and returns the response to the server (or servlet) that invoked it.
A Java Virtual Machine runs multiple threads within a single process. A process is an instance of a running program, which has state information associated with it such as the current values of registers, the current instruction being executed, file descriptors for files that have been opened by the program, etc. Multiprogramming is accomplished by using multiple threads in this process, where a thread is a single execution of a program supporting concurrent execution (i.e. a re-entrant program). The operating system maintains information about each concurrent thread that enables the threads to share the CPU in time slices, but still be distinguishable from each other. For example, a different current instruction pointer is maintained for each thread, as are the values of registers. Thus, the different threads can execute sequentially within one process.
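As a minimal sketch of several threads sharing one process, the following Java fragment starts a number of worker threads inside a single JVM and waits for all of them to finish. The class name ThreadsInOneProcess, the method runWorkers, and the worker counts are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class.
final class ThreadsInOneProcess {

    /**
     * Starts workerCount threads in the current process. Each thread
     * increments a shared counter incrementsPerWorker times. Returns the
     * final counter value after all threads have finished.
     */
    static int runWorkers(int workerCount, int incrementsPerWorker)
            throws InterruptedException {
        AtomicInteger counter = new AtomicInteger(); // safe for concurrent updates
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            Thread t = new Thread(() -> {
                for (int j = 0; j < incrementsPerWorker; j++) {
                    counter.incrementAndGet();
                }
            }, "worker-" + i);
            workers.add(t);
            t.start(); // the scheduler now shares CPU time among the threads
        }
        for (Thread t : workers) {
            t.join(); // wait for every thread to complete
        }
        return counter.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runWorkers(4, 1000)); // prints 4000
    }
}
```

All four threads live in the same process and share its memory, which is why a single counter object is visible to every one of them.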
Any number of servlets can be running within one server, using the server's single process, at any given time. However, as more servlets are invoked, more threads are competing for the allocation of time slices, and performance of the JVM begins to degrade. Even with a relatively small number of servlets running, the performance of a time-critical application can degrade to the point where the application is effectively unusable. Because Web applications typically have a human user waiting for the response to the client requests, responses must be returned very quickly, or the user will become dissatisfied with the service. A particular server may receive thousands, or even millions, of client requests in a day's time. These requests must all be handled with acceptable response times, or the users may switch to a competitor's application services.
Further, this single-process approach to executing multiple threads can lead to complete unavailability of a server in certain situations. If one of the threads crashes, or hangs the system, as it is executing, then the single executing sequence of code crashes or hangs. Or, if the operating system invokes garbage collection, then execution of application programs to process client requests will halt until the garbage collection is finished. This interruption in service, either temporary (for garbage collection) or complete (when code crashes or hangs), is intolerable in many of today's time-sensitive Web applications.
Additionally, servlets running on a particular JVM may interfere with one another, either intentionally or unintentionally. Because the servlets all run as threads in the same process, the resources of the process are not saved separately for each thread when that thread is swapped out from using the CPU. This enables very fast switching from execution of one servlet (i.e. the thread for a servlet) to another, but at the expense of not isolating one servlet's data from another's. For example, one servlet can overwrite any location in the memory available to the process, even if another servlet depends on the contents of that memory location being unchanged. Or, if one servlet opens a file, all servlets see that file as being opened and can read or write data in the file, even though one servlet may have been written to expect exclusive access to the file contents.
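The interference described above can be sketched in a few lines of Java. The example models only one narrow form of it: two threads in the same process writing to the same shared object. The class SharedStateDemo, the map SHARED, and the threads standing in for servlets are hypothetical names used for illustration; no actual servlet API is involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; the "servlets" here are plain threads.
final class SharedStateDemo {

    // One map for the whole process: every thread sees the same instance.
    static final Map<String, String> SHARED = new HashMap<>();

    /**
     * Runs two threads one after the other. The second overwrites the value
     * stored by the first, and nothing in the process prevents it.
     */
    static String demonstrate() throws InterruptedException {
        SHARED.clear();

        Thread servletA = new Thread(() -> SHARED.put("config", "set-by-A"));
        servletA.start();
        servletA.join(); // A has finished before B starts

        Thread servletB = new Thread(() -> SHARED.put("config", "overwritten-by-B"));
        servletB.start();
        servletB.join();

        return SHARED.get("config");
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demonstrate()); // prints overwritten-by-B
    }
}
```

Code written on behalf of servlet A that later reads "config" would find B's value, even if A was written on the expectation that its data would remain unchanged.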
Because all the servlets of a particular server run in threads on the same JVM, there is currently no way to run more than one execution environment (which includes the version of Java being executed) at the same time. While many implementations will never need more than one environment or Java version, there are other situations where it would be very beneficial to allow a mixture (such as allowing concurrent use of versions from different vendors). For example, some servlets may require a particular virtual machine, such as servlets using Microsoft Corporation's ActiveX: these servlets require the Microsoft virtual machine. If this virtual machine were the only one available, then servlets requiring some different virtual machine (such as from a different vendor) could not run in that environment.
Accordingly, a need exists for a technique by which these shortcomings in the current implementation of virtual machines on servers can be overcome. The proposed technique defines a way to use multiple virtual machines, referred to as a "pool" of virtual machines, within a single server. This technique enables the number of servlets executing on behalf of a single server to be increased without degrading the server's performance. Further, it enables a protection mechanism to be implemented that prohibits the servlets of one application from interfering with the servlets of another application, and allows different execution environments (including different versions of Java) to be used concurrently. By executing multiple virtual machines per server, a server will no longer be completely unavailable if one thread crashes, hangs, or is interrupted.
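The idea of a pool can be sketched as follows, purely as an illustration under stated assumptions. The text above does not specify how requests are distributed among pool members; the round-robin policy, the class VmPoolSketch, the nested VmHandle type, and the method names below are all hypothetical. Each VmHandle is a stand-in for a separate virtual machine and does not launch a real one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not thread-safe, and not the technique's actual design.
final class VmPoolSketch {

    /** Stand-in for one virtual machine in the pool (assumption). */
    static final class VmHandle {
        final String name;
        boolean available = true;

        VmHandle(String name) {
            this.name = name;
        }
    }

    private final List<VmHandle> pool = new ArrayList<>();
    private int next = 0; // index of the next pool member to try

    VmPoolSketch(int size) {
        for (int i = 0; i < size; i++) {
            pool.add(new VmHandle("vm-" + i));
        }
    }

    /** Marks one pool member as crashed or hung. */
    void markUnavailable(int index) {
        pool.get(index).available = false;
    }

    /**
     * Picks the next available pool member in round-robin order (an assumed
     * policy). Returns its name, or null if no member is available.
     */
    String dispatch() {
        for (int tried = 0; tried < pool.size(); tried++) {
            VmHandle candidate = pool.get(next);
            next = (next + 1) % pool.size();
            if (candidate.available) {
                return candidate.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        VmPoolSketch server = new VmPoolSketch(3);
        System.out.println(server.dispatch()); // prints vm-0
        System.out.println(server.dispatch()); // prints vm-1
        server.markUnavailable(2);             // simulate a failure of vm-2
        System.out.println(server.dispatch()); // prints vm-0 (vm-2 is skipped)
    }
}
```

The point of the sketch is the last call: with more than one pool member, the loss of one member leaves the server able to hand the request to another.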