This invention relates to capabilities for managing shared resources in a computer system, and more particularly, to a thread-safe distributed consensus technique, for example, for implementation within a message passing interface (MPI) library.
In order to better understand the background of the subject invention, explanation of certain terminology is first provided. A term well-known in the art as a symmetric multi-processor (SMP) refers to an aspect of hardware in a computing system and, more particularly, relates to the physical layout and design of the processor planar itself. Such multiple processor units have, as one characteristic, the sharing of global memory as well as equal access to input/output (I/O) of the SMP system.
Another term which is commonly associated with modern complex computing systems is a "thread". The term "thread" in a general sense refers merely to a simple execution path through application software and the kernel of an operating system executing on a computer. As is well understood in the art, it is commonplace for multiple such threads to be allowed per single process image.
A thread standard has now been incorporated into the POSIX standard. Basic thread management under the POSIX standard is described, for example, in a publication by K. Robbins and S. Robbins entitled Practical UNIX Programming: A Guide to Concurrency, Communication, and Multi-Threading, Prentice Hall PTR (1996).
Another concept which is utilized herein in describing an embodiment of the invention is one of "locks". It is typical in modern computing systems to have critical sections of code or shared data structures, such as shared libraries, whose integrity is extremely important to the correct operation of the system. Locks are, in general, devices employed in software (or hardware) to "serialize" access to these critical sections of code and/or shared data structures.
One other term to note is the concept of code being multithread-safe. Code is considered to be thread/MP-safe if multiple execution threads contending for the same resource or routine are serialized such that data integrity is ensured for all threads. One way of effecting this is by means of the aforementioned locks.
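By way of illustration only, the serializing effect of a lock can be sketched as follows. This is a minimal sketch in Python rather than in the POSIX threads interface; the names SharedCounter and run_workers are hypothetical and are not taken from any library discussed herein.

```python
import threading


class SharedCounter:
    """Hypothetical shared data structure whose updates are serialized by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        # Only one thread at a time may execute this critical section.
        with self._lock:
            self.value += 1


def run_workers(num_threads=4, increments=10_000):
    """Start several threads that contend for the same counter and return the total."""
    counter = SharedCounter()

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Because every update passes through the lock, the final count equals the number of threads multiplied by the number of increments per thread, regardless of how the threads are scheduled.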
By way of further background, the message passing interface (MPI) standard defines the following semantic: processes in a parallel job exchange messages within a communication domain (also referred to herein as a "communicator"), which guarantees the integrity of messages within that domain. Messages issued in one domain do not interfere with messages issued in another. Once the parallel job begins, subsets of the processes may collaborate to form separate communication domains as needed.
Applicants recognize that a problem arises in a multithreaded environment wherein multiple threads may concurrently be trying to obtain a communication domain. Without a way to address this issue, deadlock could occur. Thus, a deterministic, non-deadlocking technique for achieving a distributed consensus in a multithreaded processing system is needed. The present invention is provided as one technique for addressing this need.
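The isolation between communication domains can be pictured with a small sketch, offered only as an assumption about how a receiver might keep domains apart and not as a description of any MPI implementation. The class Mailbox and the integer identifiers used as keys are hypothetical.

```python
from collections import defaultdict, deque


class Mailbox:
    """Hypothetical receive-side store that keeps one queue per communicator identifier."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def deliver(self, comm_id, payload):
        # A message is filed under the identifier of the domain it was sent in.
        self._queues[comm_id].append(payload)

    def receive(self, comm_id):
        # A receive on one domain never sees messages filed under another.
        queue = self._queues[comm_id]
        return queue.popleft() if queue else None
```

Under this picture, the isolation holds only if every process in a domain uses the same identifier for it and no other domain at those processes uses that identifier.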
To briefly summarize, presented herein in one aspect is a method for establishing a communicator across multiple processes of a multithreaded computer environment wherein multiple threads may be simultaneously trying to establish communicators. The method includes: communicating across the multiple processes to establish a candidate identifier for the communicator for a group of participating threads spread over the multiple processes; and communicating across the multiple processes to check at each participating thread of the multiple processes whether the candidate identifier can be claimed at its process, and if so, claiming the candidate identifier as a new identifier thereby establishing the communicator.
In another aspect, a system is provided for establishing a communicator across multiple processes in a multithreaded computer environment wherein multiple threads may be simultaneously trying to establish communicators. The system includes means for communicating across the multiple processes to establish a candidate identifier for the communicator for a group of participating threads spread over the multiple processes; and means for communicating across the multiple processes to check at each participating thread of the multiple processes whether the candidate identifier can be claimed at its process, and if so, claiming the candidate identifier as a new identifier thereby establishing the communicator.
In a further aspect, the invention comprises at least one program storage device readable by a machine, tangibly embodying at least one program of instructions executable by the machine to perform a method for establishing a communicator across multiple processes in a multithreaded computer environment wherein multiple threads may be simultaneously trying to establish communicators. The method includes: communicating across the multiple processes to establish a candidate identifier for the communicator for a group of participating threads spread over the multiple processes; and communicating across the multiple processes to check at each participating thread of the multiple processes whether the candidate identifier can be claimed at its process, and if so, claiming the candidate identifier as a new identifier thereby establishing the communicator.
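The two communicating steps recited in the aspects above can be sketched, under stated assumptions, as a single-threaded simulation. The summary does not specify how the candidate identifier is computed or how a failed check is retried; taking the maximum of each process's lowest free identifier, and raising a floor after a failed round, are assumptions made here for illustration. The names Process and establish_communicator are hypothetical, and ordinary Python loops stand in for the communication across processes.

```python
import threading


class Process:
    """Hypothetical per-process state: identifiers in use, guarded by a lock."""

    def __init__(self, used=()):
        self.lock = threading.Lock()
        self.used = set(used)

    def lowest_free(self, floor):
        # Smallest identifier >= floor that is not in use at this process.
        with self.lock:
            n = floor
            while n in self.used:
                n += 1
            return n

    def try_claim(self, candidate):
        # Check and claim under the lock so the two cannot be interleaved.
        with self.lock:
            if candidate in self.used:
                return False
            self.used.add(candidate)
            return True

    def release(self, candidate):
        with self.lock:
            self.used.discard(candidate)


def establish_communicator(processes, max_rounds=100):
    """Return an identifier claimed at every process (assumed retry policy)."""
    floor = 0
    for _ in range(max_rounds):
        # Step 1: agree on a candidate (stand-in for a reduction across processes).
        candidate = max(p.lowest_free(floor) for p in processes)
        # Step 2: each process checks whether the candidate can be claimed locally.
        results = [p.try_claim(candidate) for p in processes]
        if all(results):
            return candidate
        # Some process refused: undo the partial claims and retry above the candidate.
        for p, ok in zip(processes, results):
            if ok:
                p.release(candidate)
        floor = candidate + 1
    raise RuntimeError("no identifier agreed within max_rounds")
```

For example, calling establish_communicator([Process({0, 2}), Process({1})]) first proposes identifier 1, which the second process refuses, and then settles on identifier 3, which both processes can claim. A real multithreaded implementation would also have to order competing groups of threads, which this sketch does not attempt.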
To restate, a technique is presented herein for achieving distributed consensus in a multithreaded multiprocess computing environment. The technique is deterministic since the threads will succeed in creating a communicator in a bounded number of retries and in a predictable order. This is believed advantageous over so-called "randomized" algorithms in which threads that fail to create a communicator simply wait for a random amount of time before retrying. In addition, the technique presented herein is guaranteed to avoid deadlock between threads. This is advantageous over other algorithms which detect a deadlock situation and then take action to break the deadlock. Deadlock detection typically involves noticing that some period of time has elapsed in which no thread is proceeding. The distributed consensus capability presented herein is efficient since there is an upper bound on the number of times a thread is forced to retry to achieve distributed consensus for a new communication domain, notwithstanding the existence of multiple groups of threads trying simultaneously to create communicators.