1. Field:
The present invention relates to parallel programs, and more particularly to a deadlock detection method and system for parallel programs.
2. Description of the Related Art
With the rapid development of computer technology, single-core processors are gradually being replaced by multi-core processors. By integrating a plurality of execution cores into one physical processor, a multi-core processor significantly improves the processing and computing capability of a computer and fully exploits the advantage of parallel computing. So-called parallel computing comprises time-parallel and space-parallel processing, in which time parallelism relates to pipeline technology while space parallelism relates to concurrent execution on multiple processors. Generally, parallel computing is implemented by parallel programs in which the processing of a task is divided into multiple portions (threads) that can be executed in parallel and that communicate with each other by accessing shared data structures and using proper synchronization methods so as to work cooperatively and correctly.
However, process (thread) deadlock is a serious problem for parallel programs. Process (thread) deadlock is a situation wherein two or more processes (threads) each wait for another to finish because they compete for shared resources during execution; the processes (threads) in deadlock will wait indefinitely unless one of them gives up the shared resource it holds. Generally, process (thread) deadlock will result in a breakdown of the whole system. Many factors can trigger process (thread) deadlock, mainly including: (1) limited system resources; (2) an improper order of process (thread) execution; and (3) unsuitable resource allocation. If system resources are abundant and the resource requests of all parallel processes can be met, the possibility that a deadlock occurs is low. Otherwise, a deadlock may occur due to competition for limited resources. Further, processes advance in different orders and at different speeds, which may also cause deadlock. In order to avoid damage to the system due to process (thread) deadlock and to improve the stability of the system, there is a need for an efficient method to detect deadlock such that process (thread) deadlock can be found in time and proper measures can be taken to release the deadlock, thereby preventing the operational condition of the system from deteriorating further.
Typically, a lock graph is utilized to intuitively represent a deadlock condition. A lock graph corresponds to an execution of the parallel programs and may be built by recording the lock operations performed during that execution, with nodes and directed edges added to the lock graph accordingly. In the lock graph, a node denotes a lock on a resource, and a directed edge pointing from one node to another denotes that a process holding the first lock is requesting to acquire the lock of another resource. If the directed edges between two or more nodes in a lock graph form a closed directed loop, there is a deadlock in the parallel programs, and thus deadlock can be detected by checking whether there is a directed loop in the lock graph. FIG. 1 shows a diagram of a deadlock state of parallel programs, in which thread T1 has acquired a lock of resource R1 and requests a lock of resource R2, while thread T2 holds the lock of resource R2 and requests the lock of resource R1. Each thread needs to acquire the resource held by the other thread for further processing. However, neither T1 nor T2 will release the resource it holds until the other thread releases its resource, and thus the two threads fall into a deadlock state.
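By way of illustration only, the lock graph approach described above can be sketched as follows. This is a minimal sketch under stated assumptions and not the implementation of any particular system: the class name LockGraph and the methods request, release and find_cycle are hypothetical names introduced here, and a requested lock is simply treated as held once it has been recorded. The sketch records an edge from each lock a thread already holds to the lock it requests, then looks for a directed loop with a depth-first search.

```python
class LockGraph:
    """Hypothetical lock graph: nodes are locks, and an edge A -> B means
    that some thread requested lock B while holding lock A."""

    def __init__(self):
        self.edges = {}   # lock -> set of locks requested while holding it
        self.held = {}    # thread -> list of locks recorded as held

    def request(self, thread, lock):
        # Make sure the requested lock exists as a node.
        self.edges.setdefault(lock, set())
        owned = self.held.setdefault(thread, [])
        # Add one directed edge from every lock already held to the new lock.
        for holder in owned:
            self.edges[holder].add(lock)
        # Simplifying assumption: the request is recorded as if it succeeded.
        owned.append(lock)

    def release(self, thread, lock):
        self.held[thread].remove(lock)

    def find_cycle(self):
        """Return one directed loop as a list of locks, or None if there is none."""
        unvisited, on_path, done = 0, 1, 2
        state = {node: unvisited for node in self.edges}
        path = []

        def visit(node):
            state[node] = on_path
            path.append(node)
            for nxt in sorted(self.edges[node]):
                if state[nxt] == on_path:
                    # Back edge to a node on the current path: a closed loop.
                    return path[path.index(nxt):] + [nxt]
                if state[nxt] == unvisited:
                    found = visit(nxt)
                    if found:
                        return found
            path.pop()
            state[node] = done
            return None

        for node in sorted(self.edges):
            if state[node] == unvisited:
                found = visit(node)
                if found:
                    return found
        return None


# The situation of FIG. 1: T1 holds R1 and requests R2; T2 holds R2 and requests R1.
graph = LockGraph()
graph.request("T1", "R1")
graph.request("T2", "R2")
graph.request("T1", "R2")
graph.request("T2", "R1")
print(graph.find_cycle())  # prints ['R1', 'R2', 'R1']
```

In this sketch the search visits every node and every directed edge of the lock graph at most once per call, so the cost of each check grows with the size of the graph. The recursive search is chosen for brevity; a very deep graph would call for an iterative traversal instead.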
However, in practice, the above approach is not very efficient for deadlock detection because, as the program runs, more and more nodes and edges are added to the lock graph. FIG. 2 shows an example of a lock graph of parallel programs in which there are 1014 nodes and 3051 directed edges. Detecting a directed loop in such a lock graph is very slow, which consumes large amounts of time and computing resources and greatly reduces the efficiency of deadlock detection.
Thus, there is a need for an improved deadlock detection method to enhance efficiency of deadlock detection.