Java Overview
The number of application programs written in object-oriented languages, such as Java, is growing rapidly. One of the key reasons for the popularity of Java is the portability of Java code. A brief overview of Java is given below. Note that specifications for the Java language and the Java Virtual Machine have been released by Sun Microsystems, Inc.
The Java language is an object-oriented programming language. Java programs are compiled to run on a Java Virtual Machine (VM). A Java VM is a computer system which runs on top of the existing hardware and operating system of another computer system. Because the specifications for the Java VM have been published, it is possible to write a Java VM to work with any hardware and/or operating system. Java programs are compiled into bytecode, which will run on any Java VM. The Java VM essentially acts as an interpreter between the Java bytecodes and the system on which the Java program is executing.
There are four major components to a Java VM, all of which are implemented in software. The four components are the registers, the operand stack, the Java heap (sometimes referred to as the garbage-collected heap), and the method area. The method area contains the method code (i.e., the compiled Java code) and symbol tables. The compiled Java code, i.e., the bytecode, consists of a set of instructions. Each instruction consists of a one-byte opcode, followed by any needed operands.
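To make the opcode-plus-operands layout concrete, the following minimal sketch interprets a tiny bytecode sequence using a software operand stack. It is an illustration only and does not represent how any particular Java VM is implemented. The class name TinyBytecodeInterpreter and the method run are hypothetical. The three opcode values used (bipush 0x10, iadd 0x60, ireturn 0xAC) are taken from the published Java VM specification, and every other opcode is deliberately left unsupported.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration: a three-opcode interpreter, not a real Java VM.
class TinyBytecodeInterpreter {
    static final int BIPUSH = 0x10;   // push the following signed byte as an int
    static final int IADD = 0x60;     // pop two ints, push their sum
    static final int IRETURN = 0xAC;  // pop an int and return it

    static int run(byte[] code) {
        Deque<Integer> operandStack = new ArrayDeque<>();
        int pc = 0; // program counter, one of the VM "registers"
        while (pc < code.length) {
            int opcode = code[pc++] & 0xFF; // every instruction starts with a one-byte opcode
            switch (opcode) {
                case BIPUSH:
                    // bipush carries one operand byte directly after the opcode
                    operandStack.push((int) code[pc++]);
                    break;
                case IADD:
                    // iadd has no operand bytes; its inputs come from the operand stack
                    operandStack.push(operandStack.pop() + operandStack.pop());
                    break;
                case IRETURN:
                    return operandStack.pop();
                default:
                    throw new IllegalArgumentException(
                            "Unsupported opcode: 0x" + Integer.toHexString(opcode));
            }
        }
        throw new IllegalStateException("Code ended without a return instruction");
    }
}
```

Under this sketch, the byte sequence 0x10 10 0x10 32 0x60 0xAC (bipush 10, bipush 32, iadd, ireturn) evaluates to 42.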
Compiled Java programs are typically referred to as Java class files. Many Java class files are downloaded from the Internet for execution on a user's computer system. One of the first steps performed by a Java VM is called verification. A class-file verifier (part of the Java VM) ensures that the file truly is a Java class file and will execute without violating any Java security restrictions.
The class file verifier first checks to determine if the class file being loaded is of the correct class file format. This is done by examining the first four bytes of the class file. All Java class files must begin with the "magic number" (i.e. 0xCAFEBABE). A version number follows the magic number, and the class file verifier checks to ensure that the class file being loaded is compatible with the VM loading it. The verifier also checks the information in the constant pool and other sections of the class file for consistency.
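The header check described above can be sketched as follows. This is a simplified illustration, not the verifier of any actual Java VM. The class name ClassFileHeaderCheck, the method hasValidHeader, and the parameter maxSupportedMajor are hypothetical, and the compatibility rule (major version no greater than a supported maximum) is an assumption made for brevity. The layout itself (a four-byte magic number followed by a two-byte minor version and a two-byte major version, all big-endian) follows the published class file format.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of the first verification step: magic number and version.
class ClassFileHeaderCheck {
    static final int MAGIC = 0xCAFEBABE;

    static boolean hasValidHeader(byte[] classBytes, int maxSupportedMajor) {
        if (classBytes.length < 8) {
            return false; // too short to hold the magic number and version fields
        }
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // ByteBuffer is big-endian by default
        if (buf.getInt() != MAGIC) {
            return false; // the first four bytes are not 0xCAFEBABE
        }
        buf.getShort(); // minor version, ignored in this sketch
        int major = buf.getShort() & 0xFFFF; // major version as an unsigned 16-bit value
        // Assumed rule: accept any class file whose major version the VM supports.
        return major <= maxSupportedMajor;
    }
}
```

A caller would pass the raw bytes of a class file together with the highest major version its VM understands; the consistency checks on the constant pool and other sections are outside the scope of this sketch.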
During the linking phase, the verifier ensures that all classes except for the Object class have a superclass, and that all field and method references in the constant pool have valid names, classes, and type descriptors. In addition, the verifier checks the code array of the code attribute for each method to ensure that all local variables contain values of the appropriate type, that methods are called with the appropriate arguments, and that fields are assigned correct values. The verifier also checks the operand stack for correctness.
Finally, during execution, the verifier checks to ensure that a referenced type is allowed for instructions referencing a type. If an instruction modifies a field or calls a method, the verifier checks to ensure that the field or method is available and that the calling method is allowed to access the field or call the method.
Objects are created in Java through the use of the "new" operator. During execution, an object is dynamically created, and memory for the object is allocated on the Java heap. The memory space allocated for an object includes a header area and a data area (for storing the object's data).
The Need For Synchronization
Most programming languages, including most object-oriented programming languages, such as Java, provide the capability to synchronize shared data. Synchronization of shared data allows the language to support multi-threaded applications. The synchronization capability allows a thread to safely reference and update data which may be shared by more than one thread. Synchronization is needed in order to prevent a condition known as a race condition. An example of a race condition is shown below. For illustrative purposes, the example is shown using Java code. However, race conditions can occur in any information handling system, regardless of the programming language or languages used to implement programs in the system.
Assume a program, P1, creates a public object, O1. O1 contains a data element, O1.data1, which is initially set to zero. Further assume that program P1 creates two threads, T1 and T2, each of which has access to object O1. Threads T1 and T2 each include the following logic:
if (O1.data1==0)
{
    tmp=O1.data1;
    tmp=tmp+10;
    O1.data1=tmp;
    . . . //do other work
}
Without synchronization, T1 and T2 could both attempt to test, and then update, O1.data1 at the same time. This condition is referred to as a race condition, and will lead to unpredictable results when program P1 is executed.
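For comparison, the test-and-update logic above can be guarded with the Java synchronized statement, so that only one thread at a time executes the test and the update. The following is a minimal sketch under stated assumptions: the class names SharedObject and SynchronizedUpdate and the method runTwoThreads are hypothetical, and the "other work" of the original example is omitted.

```java
// Hypothetical stand-in for object O1 and its data element data1.
class SharedObject {
    int data1 = 0;
}

class SynchronizedUpdate {
    // Runs the same logic on two threads, T1 and T2, and returns the final value of data1.
    static int runTwoThreads() {
        final SharedObject o1 = new SharedObject();
        Runnable logic = () -> {
            // Only one thread at a time may hold the lock associated with o1,
            // so the test and the update happen as one indivisible step.
            synchronized (o1) {
                if (o1.data1 == 0) {
                    int tmp = o1.data1;
                    tmp = tmp + 10;
                    o1.data1 = tmp;
                }
            }
        };
        Thread t1 = new Thread(logic, "T1");
        Thread t2 = new Thread(logic, "T2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for T1 and T2", e);
        }
        return o1.data1;
    }
}
```

With the synchronized block in place, whichever thread runs second sees a non-zero value and skips the update, so runTwoThreads always returns 10. Without it, both threads could pass the test before either stores its result.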
The Use of Monitors for Synchronization
Several techniques have been used, both in hardware and software design, to ensure that race conditions do not occur. Some object-oriented programming languages use a programming language construct, referred to as a monitor, to prevent race conditions. One prior art approach, currently supported in Java, uses a system-wide pool of monitors, which are accessed and released as needed by all objects in the system.
A monitor is logically associated with an object. However, in the prior art, a monitor is not bound to the object. Rather, the monitor encapsulates variables, access procedures, and initialization code within an abstract data type. Threads may only access shared data in an object associated with the monitor via the monitor's access procedures, and only one thread may access the monitor at any one time. A monitor may be thought of as a "wrapper" around an operating system semaphore.
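The "wrapper around a semaphore" view can be illustrated with the following minimal sketch. It is not the monitor implementation of any actual Java VM. The class name SimpleMonitor and its methods are hypothetical, and java.util.concurrent.Semaphore is used here as an assumed stand-in for the operating system semaphore. The owner and recursion count fields show the kind of bookkeeping a monitor keeps alongside the semaphore.

```java
import java.util.concurrent.Semaphore;

// Hypothetical illustration: a re-entrant monitor built around a binary semaphore.
class SimpleMonitor {
    private final Semaphore semaphore = new Semaphore(1); // stand-in for the OS semaphore
    private volatile Thread owner;  // thread currently inside the monitor, if any
    private int recursionCount;     // number of nested enters by the owner

    void enter() {
        Thread current = Thread.currentThread();
        if (owner == current) {
            recursionCount++; // nested entry by the owning thread
            return;
        }
        semaphore.acquireUninterruptibly(); // blocks while another thread is inside
        owner = current;
        recursionCount = 1;
    }

    void exit() {
        if (owner != Thread.currentThread()) {
            throw new IllegalMonitorStateException("Calling thread does not own the monitor");
        }
        if (--recursionCount == 0) {
            owner = null;
            semaphore.release(); // lets the next waiting thread enter
        }
    }

    boolean isHeldByCurrentThread() {
        return owner == Thread.currentThread();
    }
}
```

A thread would call enter before touching the shared data and exit afterwards, typically in a try/finally construct so that the monitor is released even if the access procedure throws.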
FIG. 1 is a flow chart depicting the use of a system-wide monitor pool to ensure data synchronization. As shown in FIG. 1, when a thread requests access to an object's shared data (step 40), the system-wide monitor pool is locked (step 42). This is done to prevent multiple threads from accessing the same monitor concurrently. A hash algorithm is used to look up one of the monitors from the monitor pool (step 44). The object then acquires, or enters, the monitor (step 46). Part of the logic for entering a monitor is to acquire an operating system semaphore, which is stored as part of the monitor. The system-wide monitor pool is then unlocked (step 48).
The shared data is then acted upon by the thread (step 50). After the thread has finished acting on the shared data, it again locks the system-wide monitor pool (step 52), and uses the same hash algorithm used in step 44 to look up the monitor assigned to the object (step 54). The object then releases, or exits, the monitor (step 56). Part of the logic for exiting a monitor is to release the operating system semaphore stored as part of the monitor. Finally, the system-wide monitor pool is unlocked (step 58).
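The enter and exit sequences of FIG. 1 can be sketched as follows, with comments keyed to the step numbers. This is a simplified illustration, not the actual prior art implementation. The class name MonitorPool, the pool size of 64, and the use of System.identityHashCode as the hash algorithm are all assumptions. One further departure should be noted plainly: instead of an operating system semaphore stored in each monitor, this sketch waits on a condition variable tied to the pool lock, so that a thread waiting for a busy monitor releases the pool lock and the owning thread can reach its exit sequence.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of a system-wide monitor pool guarded by one global lock.
class MonitorPool {
    private static final class Monitor {
        Thread owner;        // guarded by POOL_LOCK
        int recursionCount;  // guarded by POOL_LOCK
    }

    private static final int POOL_SIZE = 64; // assumed size
    private static final Monitor[] POOL = new Monitor[POOL_SIZE];
    private static final ReentrantLock POOL_LOCK = new ReentrantLock();
    private static final Condition MONITOR_RELEASED = POOL_LOCK.newCondition();

    static {
        for (int i = 0; i < POOL_SIZE; i++) {
            POOL[i] = new Monitor();
        }
    }

    // Assumed hash algorithm: identity hash of the object, reduced to a pool index.
    private static Monitor lookup(Object obj) {
        return POOL[Math.floorMod(System.identityHashCode(obj), POOL_SIZE)];
    }

    static void enter(Object obj) {
        POOL_LOCK.lock();                                  // step 42: lock the pool
        try {
            Monitor m = lookup(obj);                       // step 44: hash lookup
            Thread current = Thread.currentThread();
            while (m.owner != null && m.owner != current) {
                MONITOR_RELEASED.awaitUninterruptibly();   // releases POOL_LOCK while waiting
            }
            m.owner = current;                             // step 46: enter the monitor
            m.recursionCount++;
        } finally {
            POOL_LOCK.unlock();                            // step 48: unlock the pool
        }
    }

    static void exit(Object obj) {
        POOL_LOCK.lock();                                  // step 52: lock the pool again
        try {
            Monitor m = lookup(obj);                       // step 54: same hash lookup
            if (m.owner != Thread.currentThread()) {
                throw new IllegalMonitorStateException("Calling thread does not own the monitor");
            }
            if (--m.recursionCount == 0) {                 // step 56: exit the monitor
                m.owner = null;
                MONITOR_RELEASED.signalAll();
            }
        } finally {
            POOL_LOCK.unlock();                            // step 58: unlock the pool
        }
    }

    // Usage example: several threads increment a shared counter under the pool (step 50).
    static int countWithPool(int threadCount, int incrementsPerThread) {
        final int[] counter = {0};
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    enter(counter);
                    try {
                        counter[0]++;
                    } finally {
                        exit(counter);
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for workers", e);
            }
        }
        return counter[0];
    }
}
```

Note that every call to enter or exit, for any object, passes through the single POOL_LOCK. Even in this small sketch, threads synchronizing on unrelated objects must queue behind one another at that lock.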
While the use of monitors prevents race conditions from occurring, this approach can significantly degrade the performance of the information handling system. The use of monitors is a time-consuming process. Every time a monitor is needed by an object, the system-wide monitor pool is locked and unlocked (steps 42 and 48). When an object no longer needs the monitor, the monitor pool is again locked and unlocked (steps 52 and 58). Locking and unlocking the system-wide monitor pool takes a significant amount of time. Further, while the monitor pool is locked, all other threads are prevented from accessing the monitor pool. Any threads requiring shared data synchronization are effectively stopped until the monitor pool is unlocked. This significantly impacts the performance of the system, especially in a multiprocessor environment.
Another problem with the use of monitors is that a monitor is effectively a wrapper around an operating system semaphore. The use of an operating system semaphore requires calls to the operating system, which significantly impacts the performance of the executing process. In addition, the monitor structure contains information which is redundant with the operating system semaphore, such as the owning thread and recursion count. Maintaining this information in two data structures is unnecessary and adds overhead to the system.
Consequently, it would be desirable to have a system and method for providing shared data synchronization in an object-oriented environment, in a manner which is efficient and uses little system overhead. It would be desirable to eliminate global locks, and to minimize the use of operating system semaphores used to provide synchronization. It would also be desirable if the synchronization capability could be provided for an object in a manner which is "seamless" to the object's definition (i.e. the application programmer does not have to change the definition of an object to incorporate the synchronization method). It would further be desirable to provide data synchronization at run-time, so that existing code does not have to be re-compiled.