Computing system technology has advanced at a remarkable pace recently, with each subsequent generation of computing system increasing in performance, functionality, and storage capacity, often at reduced cost. However, individual computing systems are still generally expensive and incapable of providing the raw computing power that is often required by modern applications. One particular type of computing system architecture that generally fills this requirement is that of a parallel processing computing system. A parallel processing computing system is often referred to as a “supercomputer.”
Generally, a parallel processing computing system comprises a plurality of computing cores and is configured with a distributed application. Some parallel processing computing systems, which may also be referred to as massively parallel processing computing systems, may have hundreds or thousands of individual computing cores, and provide supercomputer class performance. Each computing core is typically of modest computing power and generally includes one or more processing units. Each computing core may be incorporated into a dedicated processing node, or each computing core may be a computing system. The distributed application provides work for each computing core and is operable to control the workload of the parallel processing computing system. Generally speaking, the distributed application provides the parallel processing computing system with a workload that can be divided into a plurality of tasks. Each computing node is typically configured to process one task. However, each task is typically further divided into one or more execution contexts, where each computing core of each computing node is typically configured to process one execution context and therefore process, or perform, a specific function. Thus, the parallel processing architecture enables the parallel processing computing system to receive a workload, then configure the computing cores to cooperatively perform one or more tasks and/or configure each computing core to process one execution context, such that the workload supplied by the distributed application is processed.
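By way of illustration only, the division of a workload into tasks, and of each task into execution contexts, may be sketched as follows. The sketch assumes that a thread pool stands in for computing nodes and computing cores, and the names `split`, `run_context`, `run_task`, and `run_workload`, as well as the sum-of-squares function, are hypothetical and are not taken from any particular system.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Divide items into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def run_context(chunk):
    """One execution context: performs a specific function on its share.

    The function (a sum of squares) is an arbitrary placeholder.
    """
    return sum(x * x for x in chunk)


def run_task(task, cores_per_node):
    """One computing node: divides its task into one execution context
    per computing core and combines the partial results."""
    contexts = split(task, cores_per_node)
    with ThreadPoolExecutor(max_workers=cores_per_node) as cores:
        return sum(cores.map(run_context, contexts))


def run_workload(workload, nodes, cores_per_node):
    """Distributed application: divides the workload into one task per
    computing node and combines the per-node results."""
    tasks = split(workload, nodes)
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        return sum(pool.map(lambda t: run_task(t, cores_per_node), tasks))


if __name__ == "__main__":
    print(run_workload(list(range(100)), nodes=4, cores_per_node=2))
```

In this sketch the execution contexts share no resource, so no coordination among them is needed; each operates only on its own chunk of the workload.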
Parallel processing computing systems have found application in numerous different computing scenarios, particularly those requiring high performance and fault tolerance. For instance, airlines rely on parallel processing to process customer information, forecast demand, and decide what fares to charge. The medical community uses parallel processing computing systems to analyze magnetic resonance images and to study models of bone implant systems. In general, parallel processing computing systems typically perform most efficiently on work that contains several computations that can be performed at once, as opposed to work that must be performed serially. The overall performance of the parallel processing computing system is increased because multiple computing cores can handle a larger number of tasks in parallel than could a single computing system. Other advantages of some parallel processing systems include their scalable nature, their modular nature, and their improved level of redundancy.
When sharing resources among the computing cores, and particularly the execution contexts of computing cores, parallel processing computing systems of the prior art are typically limited by the Critical Section Problem. Generally speaking, the Critical Section Problem arises when multiple entities are competing for the same resource. When a shared resource is manipulated, the entity manipulating the resource is within a “Critical Section,” during which that entity, and only that entity, must be able to manipulate the resource without interference. Interference from other entities would introduce inconsistency to, or otherwise corrupt, the resource. To provide consistency within a resource of a parallel processing computing system, the Critical Section Problem requires that only one execution context be provided access to the resource at any given time. In conventional parallel processing computing systems, the Critical Section is not merely “manipulation” of the resource, but rather “access” of the resource by an execution context, as the execution contexts often need to manipulate the resource just to access the resource.
For example, and in one general embodiment, the resource may be a block of data, and the conventional approach to the Critical Section Problem is to allow only one execution context to access or manipulate that block of data. As such, conventional parallel processing computing systems provide a “lock” on the resource to the first execution context that attempts to access the resource. Other execution contexts are prevented from even accessing the resource, although they may be granted a lock on a first-come-first-served basis.
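By way of illustration only, the conventional lock-based approach described above may be sketched as follows. The sketch assumes that threads stand in for execution contexts and that a single integer stands in for the block of data; the names `LockedBlock` and `run_contexts` are hypothetical. It is also noted that the `threading.Lock` used here does not guarantee first-come-first-served ordering among waiting contexts, so the sketch illustrates only the mutual exclusion, not the order in which locks are granted.

```python
import threading


class LockedBlock:
    """A shared block of data guarded by a single lock (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0
        self.inside = 0      # contexts currently in the Critical Section
        self.max_inside = 0  # highest value of `inside` ever observed

    def update(self, amount):
        # Only the context holding the lock may access or manipulate the
        # block; every other context blocks here until the lock is free.
        with self._lock:
            self.inside += 1
            self.max_inside = max(self.max_inside, self.inside)
            current = self.value           # read ...
            self.value = current + amount  # ... then write, uninterrupted
            self.inside -= 1


def run_contexts(block, contexts, updates_each):
    """Start `contexts` threads that each update the block repeatedly."""

    def work():
        for _ in range(updates_each):
            block.update(1)

    threads = [threading.Thread(target=work) for _ in range(contexts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return block.value


if __name__ == "__main__":
    block = LockedBlock()
    print(run_contexts(block, contexts=8, updates_each=1000))
    print(block.max_inside)
```

Because every update passes through the same lock, the block remains consistent, but at most one execution context is ever inside the Critical Section at a time; the remaining contexts wait.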
However, by locking a resource, conventional parallel processing computing systems often break concurrency, and force execution contexts and computing cores to wait in line in a serial manner for access to the resource. As such, computing cores typically stall as they wait to lock the resource, defeating the advantages, and the very purpose, of implementing a parallel processing computing system.
Consequently, there is a continuing need to access a resource in such a manner that execution contexts denied locks on the resource may continue processing in parallel and without breaking consistent access to the resource.