In the Java programming environment (Java is a trademark of Sun Microsystems Inc), programs are generally run on a virtual machine, rather than directly on hardware. Thus a Java program is typically compiled into byte-code form, and then interpreted by the Java virtual machine (VM) into hardware commands for the platform on which the Java VM is executing. The Java environment is further described in many books, for example “Exploring Java” by Niemeyer and Peck, O'Reilly & Associates, 1996, USA, and “The Java Virtual Machine Specification” by Lindholm and Yellin, Addison-Wesley, 1997, USA.
The Java language supports multiple threads which can run concurrently. As for any concurrent system, it is important to be able to control access to shared resources, to avoid potential conflict between different threads as regards their usage of a particular resource. In the Java language, mutually exclusive access to a shared resource is achieved by means of synchronisation. One of the advantages of the Java language is that this synchronisation is relatively simple for the end-programmer; there is no need at the application level to specifically code lock and unlock operations.
Java VM implementations of synchronisation are generally based on the concept of monitors which can be associated with objects. A monitor can be used for example to exclusively lock a piece of code in an object associated with that monitor, so that only the thread that holds the lock for that object can run that piece of code—other threads will queue waiting for the lock to become free. The monitor can be used to control access to an object representing either a critical section of code or a resource.
Locking in Java is always at the object level and is achieved by the application applying a “synchronized” statement to those code segments that must run atomically. The statement can be applied either to a whole method, or to a particular block of code within a method. In the former case, when a thread in a first object invokes a synchronised method in a second object, then the thread obtains a lock on that second object. The alternative is to include within the method a synchronised block of code, which takes ownership of the lock of an arbitrary object specified in the synchronised statement.
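The two forms of the “synchronized” statement can be illustrated with a minimal sketch. The class Counter, its fields and its methods are hypothetical names chosen purely for illustration; they are not taken from any of the cited works.

```java
// Illustrative sketch only: Counter and all of its members are hypothetical names.
class Counter {
    private int methodCount = 0;
    private int blockCount = 0;
    // Arbitrary object whose monitor guards blockCount.
    private final Object blockLock = new Object();

    // Synchronised method: the calling thread obtains the lock on this Counter object.
    synchronized void incrementViaMethod() {
        methodCount++;
    }

    // Synchronised block: the lock obtained is that of the object named in the statement.
    void incrementViaBlock() {
        synchronized (blockLock) {
            blockCount++;
        }
    }

    synchronized int getMethodCount() {
        return methodCount;
    }

    int getBlockCount() {
        synchronized (blockLock) {
            return blockCount;
        }
    }

    // Runs several threads against one Counter and returns it once all have finished.
    static Counter runDemo(int threads, int incrementsPerThread) {
        Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.incrementViaMethod();
                    counter.incrementViaBlock();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return counter;
    }
}
```

Because each increment runs while holding the relevant object lock, no updates are lost even though several threads modify the same counters concurrently.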
The monitor structure in Java can also be used as a communication mechanism between separate threads of execution. This is achieved by a first thread including a “wait” command within synchronised code. This suspends execution of this first thread, and effectively allows another thread to obtain the lock controlling access to this synchronised code. Corresponding to the “wait” command is a “notify” command in synchronised code controlled by the same object lock. On execution of this “notify” command by a second thread, the first thread is resumed, although it will have to wait for access to the lock until this is released by the second thread. Thus when used for this purpose a thread may wait on an object (or event) and another thread can notify the waiter. The “notify” command actually comes in two flavours: a “notify-all”, whereby all the threads waiting on the object are notified, and a simple “notify”, whereby only one (arbitrary) waiting thread is notified.
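By way of illustration, the wait and notify mechanism might be used as in the following sketch of a one-slot hand-off between two threads. The class Mailbox and its methods are hypothetical names introduced for this example only. The waiting condition is re-tested in a loop, which is the usual precaution because a waiting thread can be resumed without the condition actually holding.

```java
// Illustrative sketch only: Mailbox is a hypothetical one-slot hand-off between threads.
class Mailbox {
    private String message; // null means the slot is empty

    // Called by the first (receiving) thread; suspends until a message has been posted.
    synchronized String take() throws InterruptedException {
        while (message == null) {
            wait(); // releases the lock on this Mailbox while the thread is suspended
        }
        String result = message;
        message = null;
        return result;
    }

    // Called by the second (posting) thread, under the same object lock.
    synchronized void post(String text) {
        message = text;
        notifyAll(); // waiters resume only once this method has released the lock
    }

    // Starts a receiver thread, posts one message, and returns what the receiver saw.
    static String runDemo() {
        Mailbox box = new Mailbox();
        String[] received = new String[1];
        Thread receiver = new Thread(() -> {
            try {
                received[0] = box.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();
        box.post("hello");
        try {
            receiver.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return received[0];
    }
}
```

The sketch behaves the same whichever thread reaches the Mailbox first: if the message is already present the receiver does not wait at all, and otherwise it is resumed by the notification.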
Although the ability to support concurrent threads greatly increases the power and flexibility of Java programs, it does create a pitfall, commonly termed “deadlock”. This is the situation where one thread (thread A), which owns resource X on a mutually exclusive basis, wants to access another resource Y that is currently owned by a second thread (thread B), also on a mutually exclusive basis. Thus thread A waits for thread B to release Y. However, it is possible that thread B must acquire resource X before it can release resource Y. Since resource X is currently owned by thread A, thread B must also wait. Unfortunately we are now in a situation where thread A cannot progress until thread B releases resource Y, whilst thread B cannot progress until thread A releases resource X. The result is that neither thread is able to progress, and we have reached a deadlock in which the system is locked, unable to progress (i.e. it has effectively crashed).
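The two-thread scenario just described can be reproduced with the following sketch, in which resources X and Y are represented by plain objects used as locks. All names are hypothetical. A latch is used only to force the unlucky interleaving on every run, and the threads are marked as daemon threads so that the Java VM can still exit; neither detail is part of the deadlock itself.

```java
import java.util.concurrent.CountDownLatch;

// Illustrative sketch only: all names are hypothetical.
class DeadlockDemo {

    // Builds a thread that locks 'first', waits until both threads hold their
    // first lock, and then tries to lock 'second'.
    static Thread locker(String name, Object first, Object second, CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await(); // force the interleaving that causes deadlock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) { // never entered: the other thread owns 'second'
                    System.out.println(name + " acquired both resources");
                }
            }
        }, name);
        t.setDaemon(true); // allow the Java VM to exit despite the deadlock
        return t;
    }

    // Returns true if both threads end up blocked on each other's resource.
    static boolean runDemo() {
        Object resourceX = new Object();
        Object resourceY = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread a = locker("thread-A", resourceX, resourceY, bothHoldFirst);
        Thread b = locker("thread-B", resourceY, resourceX, bothHoldFirst);
        a.start();
        b.start();
        try {
            a.join(500); // neither join completes, because neither thread can progress
            b.join(500);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return a.getState() == Thread.State.BLOCKED
                && b.getState() == Thread.State.BLOCKED;
    }
}
```

Without the latch, the same code would deadlock only when the two threads happen to interleave in this way, which is what makes such problems hard to reproduce.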
Note that although the above example includes only two threads, a more complex cyclic dependency can also produce deadlock. For example, thread A waiting on thread B, thread B waiting on thread C, thread C waiting on thread D, and thread D waiting on thread A (where waiting on a thread here implies waiting for a thread to release a particular resource that it currently owns).
The problem of deadlock is well-known in the literature, both in Java and also in other languages, see for example: “Java Deadlock”, by Vermeulen, p52, 54–56, 88–89, in Dr Dobbs Journal, Vol 22/9, September 1997. It is theoretically possible to avoid deadlock by better program design, and some prior art has focused on how best to achieve this, e.g.: “Modelling Multi-Threading in Java” by Wabenhorst and Potter, p 153–164 in IEEE Proceedings of the Conference: Technology of Object Oriented Languages and Systems (TOOLS 25), Australia, November 1997. A somewhat different approach is described in “Modelling and Validation of Java Multithreading Applications using SPIN” by Demartini, Iosif, and Sisto, p577–603 in Software: Practice and Experience, v 29/7, July 1999, Wiley. Here, program source code is translated into a formal description file, which can then be analysed to find potential deadlocks. However, this approach adds another level of complexity, and it is not clear whether it is 100% effective. Modelling to avoid deadlock is also discussed in: “A CSP model for Java multithreading” by Welch and Martin, p 114–122 in IEEE Proceedings of the International Symposium on Software Engineering for Parallel and Distributed systems, June 2000, Ireland.
Despite the above work, deadlock remains a common pitfall for concurrent systems in practice, and eliminating deadlock retrospectively from applications can require significant time and effort. One can classify deadlock situations into two different types. The first is where deadlock arises inevitably as the result of the application logic. This type is relatively easy to detect using the formal tools described above. The second is where the deadlock is essentially accidental, and derives from the precise timing of operations in one thread in relation to another thread. This sort of situation is inherently non-deterministic, and may vary from one run of the application to another; in other words, a given application may deadlock only on certain occasions, and it may be difficult to reproduce this problem for subsequent investigation.
The prior art further discloses various tools available to support analysis work after deadlock has occurred, such as the JProbe Threadalyzer (the JProbe tool is available from Sitraka Software, see in particular the site /software/jprobe/jprobethreadalyzer.html at www.sitraka.com). Although such tools are useful in a development context, they do not generally allow for real-time avoidance of deadlock. This is discussed elsewhere in the literature, for example in: “Dynamic Instrumentation of Threaded Applications” by Xu, Miller and Naim, p 49–59, in Proceedings of the Seventh ACM SIGPLAN symposium on Principles and Practice of Parallel Programming, May 1999, Atlanta, USA. This article describes a mechanism for adding instrumentation (i.e. diagnostic facilities) to individual threads. This raises the problem that the instrumentation may require a lock already owned by its corresponding thread, resulting in a form of self-deadlock. This situation is obviated by making locks provide information about the current lock owner. The instrumentation is then skipped if this owner is the same as the corresponding thread for the instrumentation, thereby avoiding the deadlock. This technique is only appropriate to self-deadlock in situations involving instrumentation, and cannot be easily generalised to broader application deadlock problems.
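The owner-check idea can be sketched as follows. This is not the mechanism of the cited article, which is not reproduced here; it is a minimal illustration under stated assumptions, using a hypothetical non-reentrant lock (OwnerAwareLock) that records its owning thread, and a hypothetical probe (LockProbe) that skips its work when the current thread already owns the lock.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch only: a hypothetical non-reentrant lock that records its owner.
class OwnerAwareLock {
    private final AtomicReference<Thread> owner = new AtomicReference<>();

    void lock() {
        Thread me = Thread.currentThread();
        // Spins until the lock is free; a thread that already owns the lock
        // would spin here for ever, which is the self-deadlock to be avoided.
        while (!owner.compareAndSet(null, me)) {
            Thread.yield();
        }
    }

    void unlock() {
        owner.compareAndSet(Thread.currentThread(), null);
    }

    boolean isOwnedByCurrentThread() {
        return owner.get() == Thread.currentThread();
    }
}

// Hypothetical instrumentation that needs the same lock as the code it observes.
class LockProbe {
    private final OwnerAwareLock lock;
    private int samples = 0;

    LockProbe(OwnerAwareLock lock) {
        this.lock = lock;
    }

    // Returns true if a sample was recorded, false if the probe was skipped.
    boolean recordSample() {
        if (lock.isOwnedByCurrentThread()) {
            return false; // skip: taking the lock again would self-deadlock
        }
        lock.lock();
        try {
            samples++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    int getSamples() {
        return samples;
    }
}
```

The check protects only against a thread blocking on a lock that it already owns; it does nothing for cycles involving two or more threads, which is the limitation noted above.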
“Deadlock Detection and Resolution for Discrete Event Simulation: Multiple-Unit Seizes” by Venkatesh, Smith, Deuermeyer and Curry, p 201–16, in IIE Trans Vol 30/3, March 1998, Chapman & Hall, discusses deadlocks generally in the context of simulation and manufacturing systems, and briefly mentions some deadlock avoidance strategies. These seem to be based primarily on defining required resources, and trying to predict if future use will lead to deadlock. It is not clear if this technique is practicable outside the context of simulation.
“Extending Java to Support Shared Resource Protection and Deadlock Detection in Threads Programming” by Van Engen, Bradshaw, and Oostendorp, ACM Crossroads, v 4.2, Winter 1997 (electronic publication) interposes a program that effectively sits between the application and the Java VM, and extends the basic Java object to provide the facility for deadlock detection. Thus if a thread requests an object that would lead to a deadlock, this is detected by looking for a cyclic pattern of dependencies, and an exception can be thrown back to the application.
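Detection of a cyclic pattern of dependencies can be illustrated generically with a wait-for graph. The sketch below is not the implementation of the cited publication; the class WaitForGraph and its methods are hypothetical, threads are identified simply by name, and each blocked thread is assumed to wait on at most one other thread (the owner of the resource it has requested).

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a hypothetical wait-for graph keyed by thread name.
class WaitForGraph {
    // Maps each blocked thread to the thread that owns the resource it wants.
    private final Map<String, String> waitsFor = new HashMap<>();

    // Returns true if letting 'requester' wait on 'owner' would close a cycle.
    boolean wouldDeadlock(String requester, String owner) {
        String current = owner;
        int steps = 0;
        // Follow the chain of dependencies starting from the owner; the step
        // bound is purely defensive, since recorded waits never form a cycle.
        while (current != null && steps++ <= waitsFor.size()) {
            if (current.equals(requester)) {
                return true;
            }
            current = waitsFor.get(current);
        }
        return false;
    }

    // Records the wait, or throws if it would complete a cycle of dependencies.
    void addWait(String requester, String owner) {
        if (wouldDeadlock(requester, owner)) {
            throw new IllegalStateException(
                    requester + " waiting on " + owner + " would cause deadlock");
        }
        waitsFor.put(requester, owner);
    }

    // Called when 'requester' obtains its resource and is no longer blocked.
    void clearWait(String requester) {
        waitsFor.remove(requester);
    }
}
```

With thread A waiting on B, B on C, and C on D already recorded, a request that would make D wait on A is rejected with an exception, corresponding to the four-thread cycle described earlier.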
Unfortunately, this deadlock detection can only be utilised with applications that have been specifically designed for resource control. This places severe limitations on the freedom of the application designer, due to the single inheritance model of Java, so that all application objects can only be extensions of the resource control set. In addition, the technique cannot be used with existing applications that have not been written for the resource control layer. Furthermore, it is not desirable to modify applications to work with this layer, because the extra overhead involved in this approach at run-time is very considerable, thereby giving markedly reduced performance.