1. Technical Field:
The present invention relates generally to an improved data processing system, and in particular, to a method and apparatus for processing data. Still more particularly, the present invention provides a method and apparatus for managing threads executing in a data processing system.
2. Description of Related Art:
Checkpoints are often used in a method for recovering from a system failure. A checkpoint is a copy of the state of the data processing system, which is periodically saved. This state includes, for example, the contents of the memory in the data processing system as well as current register settings. These register settings may include, for example, the last executed instruction. In the event of a failure, the last checkpoint may serve as a recovery point. A restart program may copy the last checkpoint into memory, reset the hardware registers, and start the data processing system from that checkpoint.
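The checkpoint and restart cycle described above can be illustrated with a minimal sketch. The `ToyMachine` class and its method names are hypothetical and model only a small dictionary of memory and registers. A real checkpoint would capture the full memory image and hardware register contents.

```python
import copy


class ToyMachine:
    """Hypothetical stand-in for a data processing system: memory plus registers."""

    def __init__(self):
        self.memory = {"counter": 0}
        self.registers = {"pc": 0}  # number of steps executed so far

    def step(self):
        # One unit of work: update memory and advance the register.
        self.memory["counter"] += 10
        self.registers["pc"] += 1

    def take_checkpoint(self):
        # Deep copy so that later execution cannot alter the saved state.
        return copy.deepcopy({"memory": self.memory, "registers": self.registers})

    def restart_from(self, checkpoint):
        # Copy the checkpoint back into memory and reset the registers.
        self.memory = copy.deepcopy(checkpoint["memory"])
        self.registers = copy.deepcopy(checkpoint["registers"])


def run_with_failure():
    machine = ToyMachine()
    machine.step()
    machine.step()
    saved = machine.take_checkpoint()  # periodic checkpoint after two steps
    machine.step()
    machine.memory["counter"] = -1     # simulated failure corrupts memory
    machine.restart_from(saved)        # the last checkpoint is the recovery point
    return machine.memory["counter"], machine.registers["pc"]
```

In this sketch, work performed after the checkpoint (the third step) is lost on recovery, which is the usual trade-off of periodic checkpointing.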
A checkpoint is thus used to save the state of the processes of an application. A process is the execution state of a program. Often a process can be broken into multiple execution states, which can run in parallel. Each of these execution states shares the same data and global state, such as open files, shared memory, and program text; however, each has its own execution context with its own stack and registers. These execution states are called threads of a process. When multiple threads in a user space are multiplexed to run on a single kernel thread, the user threads are called lightweight processes. In Advanced Interactive Executive (AIX), these threads are also referred to as pthreads, and the library that handles the switching of pthreads in a user space is the pthreads library. From the kernel perspective, there is only a single thread; however, the pthreads library may run several pthreads on that single kernel thread. The two popular models are the M:N model, in which ‘N’ pthreads are serviced by (or multiplexed on) ‘M’ kernel threads, with M usually less than N, and the 1:1 model, in which there is one kernel thread for each pthread.
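The distinction between shared process state and per-thread execution context can be sketched as follows. Python's `threading` module is used here only as a stand-in for the pthreads library, and the names `shared_totals`, `worker`, and `run_workers` are hypothetical. CPython maps each thread onto its own kernel thread, so the sketch corresponds to the 1:1 model rather than the M:N model.

```python
import threading

shared_totals = []              # process-wide data, visible to every thread
totals_lock = threading.Lock()  # serializes updates to the shared list


def worker(start, count):
    local_sum = 0  # local variable: lives in this thread's own execution context
    for value in range(start, start + count):
        local_sum += value
    with totals_lock:
        shared_totals.append(local_sum)  # result published to the shared state


def run_workers():
    shared_totals.clear()
    threads = [threading.Thread(target=worker, args=(start, 5))
               for start in (0, 100, 200)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Completion order is not deterministic, so sort before returning.
    return sorted(shared_totals)
```

Each thread computes its partial sum without interference because `local_sum` is private to that thread, while the lock protects the one structure that all threads share.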
Processes often require special handling at checkpoint and restart time. This handling is performed by running application handlers, one at checkpoint time and one at restart time. These handlers are usually implemented as signal handlers or event handlers, which execute under the context of one of the threads of the process. The thread is interrupted from its current execution, its execution state is saved, and control is passed to the handler. When the handler completes, the state of the thread is restored and the thread resumes execution from the point at which it was interrupted. A process will need to register handlers if the process owns non-checkpoint-safe resources, such as Internet sockets, whose complete state cannot be saved in the checkpoint file because the other end of the socket is on a different system. In this case, the handler can save the details of the socket at checkpoint time, reopen the socket at restart time, and perform any other initialization necessary to restore the socket to the state it was in at checkpoint time. Checkpoint handlers may also be needed to convert the process into a checkpointable state.
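The socket example above can be sketched as a pair of handlers. This is only an illustration under stated assumptions: `FakeConnection` is a hypothetical stand-in for an Internet socket (no network traffic occurs), `SocketOwner` and its method names are invented for this sketch, and closing the connection at checkpoint time is an assumed way of making the process checkpointable.

```python
class FakeConnection:
    """Hypothetical stand-in for a socket whose peer is on a different system."""

    def __init__(self, host, port):
        self.host = host
        self.port = port
        self.is_open = True

    def close(self):
        self.is_open = False


class SocketOwner:
    """Process-like object that owns a resource that is not checkpoint safe."""

    def __init__(self, host, port):
        self.conn = FakeConnection(host, port)
        self.saved_endpoint = None

    def checkpoint_handler(self):
        # Save enough detail to rebuild the connection, then release it
        # (assumption: closing makes the process checkpointable).
        self.saved_endpoint = (self.conn.host, self.conn.port)
        self.conn.close()
        self.conn = None

    def restart_handler(self):
        # Reopen the connection from the details saved at checkpoint time.
        host, port = self.saved_endpoint
        self.conn = FakeConnection(host, port)


def checkpoint_then_restart():
    owner = SocketOwner("peer.example", 5000)
    owner.checkpoint_handler()
    released_at_checkpoint = owner.conn is None
    owner.restart_handler()
    return (released_at_checkpoint, owner.conn.is_open,
            (owner.conn.host, owner.conn.port))
```

Only the endpoint details travel through the checkpoint. The live connection itself is rebuilt by the restart handler.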
Currently, signal handlers and checkpoint handlers, which are usually implemented as signal handlers, are restricted to a limited set of application program interface (API) calls or system calls that do not require taking any internal pthread locks. A “lock” is used to prevent other threads or processes from accessing a resource, such as a memory location or a register. The restricted set of calls is currently used because a deadlock occurs if a thread is interrupted to handle a signal while the thread is in the middle of an API call that has taken a lock and the signal handler invokes the same API. This deadlock occurs because the signal handler would block and wait for the lock to be released, while the lock owner, the interrupted thread, would block waiting for the signal handler to complete. The same situation exists in the case in which the signal handler tries to acquire a mutex. A “mutex” is a programming flag used to grab and release an object. A mutex may be set to lock such that other attempts to use the object are blocked. A mutex is set to unlock when the data is no longer needed or the routine finishes.
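The self-deadlock described above can be sketched as follows. Several assumptions apply: `locked_api` is a hypothetical API that takes an internal non-reentrant lock, the signal is simulated by calling the handler inline while the lock is held, and a timeout is used so that the sketch reports the stall instead of hanging as a real deadlock would.

```python
import threading

api_lock = threading.Lock()  # non-reentrant, like an internal library lock


def locked_api(on_interrupt=None, timeout=0.2):
    """Hypothetical API call that takes an internal lock.

    Returns False if the lock could not be obtained within the timeout.
    """
    if not api_lock.acquire(timeout=timeout):
        return False
    try:
        if on_interrupt is not None:
            on_interrupt()  # simulated signal delivered in the middle of the call
        return True
    finally:
        api_lock.release()


def demo_handler_deadlock():
    handler_results = []

    def handler():
        # The handler runs on the interrupted thread and invokes the same API,
        # so it waits for a lock that its own thread already holds.
        handler_results.append(locked_api())

    interrupted_call_ok = locked_api(on_interrupt=handler)
    return interrupted_call_ok, handler_results[0]
```

The outer call succeeds only because the inner attempt gives up after the timeout. With an untimed acquire, the handler would wait forever for a lock that cannot be released until the handler returns.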
Signal handlers, checkpoint handlers, and restart handlers are examples of procedures. A procedure is a series of steps followed in a regular definite order, for example, a legal procedure or a surgical procedure. In computer systems, a procedure is a series of instructions that has a name by which the procedure can be called into action.
Many calls used in a data processing system will take internal locks when running in a multithreaded state to serialize execution. For example, the call “malloc” takes a lock to protect its internal heap structure. Deadlocking is a bigger problem with respect to checkpoint/restart processes. Specifically, requirements for a checkpoint handler are usually more complex than those for a signal handler. As described earlier, the purpose of the checkpoint handler is often to make a process quiescent to enable it to be checkpointed. For parallel applications that execute across many nodes, this requirement may involve making calls to the MPI (Message Passing Interface) library and the LAPI (Low-Level Application Programming Interface, a high-performance communication library on IBM SP systems) subsystems and closing devices that cannot be checkpointed, such as Internet sockets. It is often impossible to code these calls without taking mutexes or making non-thread-safe calls.
In addition, with respect to deadlocks, at restart time, the restart handler is called before the rest of the application threads start running. This is to handle resources that were not checkpointed by the system and hence not restored automatically, such as Internet sockets, devices with non-checkpoint aware device drivers, and pipes to processes outside the group of processes being checkpointed. The primary task of the restart handler is to restore the state of the application such that threads using these resources run successfully and do not have to be aware of checkpoint-restart happening asynchronously.
With these requirements, the threads in a process are suspended until the restart handler completes execution or exits. Hence, if any of the threads were in the middle of an API call that took a lock or owned a mutex, and the restart handler invoked the same API call or tried to acquire the same mutex, these threads would block the use of that resource indefinitely, causing a deadlock.
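The restart-time deadlock described above differs from the signal handler case in that the mutex owner is a separate, suspended thread. A minimal sketch follows, assuming that an event wait stands in for suspension and that a timeout stands in for an indefinite block. All function and variable names are hypothetical.

```python
import threading


def restart_with_suspended_owner(timeout=0.2):
    mutex = threading.Lock()
    holding = threading.Event()  # set once the application thread owns the mutex
    resume = threading.Event()   # set only after the restart handler has finished

    def application_thread():
        with mutex:
            holding.set()
            resume.wait()        # stands in for being suspended in mid-call

    def restart_handler():
        # Tries to take the same mutex; the timeout lets the sketch report the
        # stall instead of blocking forever.
        acquired = mutex.acquire(timeout=timeout)
        if acquired:
            mutex.release()
        return acquired

    thread = threading.Thread(target=application_thread)
    thread.start()
    holding.wait()               # make sure the mutex is owned before "restart"
    handler_got_mutex = restart_handler()
    resume.set()                 # threads resume only once the handler is done
    thread.join()

    free_afterwards = mutex.acquire(timeout=timeout)
    if free_afterwards:
        mutex.release()
    return handler_got_mutex, free_afterwards
```

The handler cannot obtain the mutex while the owner is suspended, and the owner cannot release it until the handler finishes. The mutex becomes available only after the threads are allowed to resume.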
Therefore, it would be advantageous to have an improved method, apparatus, and computer instructions for checkpoint and restart handlers in multi-threaded processes to avoid deadlocks.