1. Field of the Invention
The invention relates to the field of computer processing. More specifically, the invention relates to a method and apparatus for parallel computation.
2. Background of the Invention
Parallel computing of tasks achieves faster execution and/or enables the performance of complex tasks that single-processor systems cannot perform. One paradigm for performing parallel computing is shared-memory programming. The OpenMP standard is an agreed-upon industry standard for programming shared-memory architectures.
OpenMP provides synchronization constructs. One of these constructs, the ATOMIC construct, ensures that memory locations are updated atomically. An atomic update operation is a non-interruptible sequence of operations or instructions that updates a memory location. Atomically updating a shared memory location prevents multiple threads of a team from performing the same operation and/or destroying work done by another thread.
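As an illustration of the ATOMIC construct described above, the following minimal sketch in C has a team of threads add into one shared counter. The function name atomic_sum and the summation workload are assumptions chosen for illustration and do not come from the OpenMP standard. If the compiler is not invoked with OpenMP support, the pragmas are ignored and the loop simply runs serially.

```c
/* Sum the integers 1..n into one shared counter from a team of threads.
   atomic_sum is a hypothetical name used only for this sketch. */
long atomic_sum(int n)
{
    long total = 0;                 /* shared memory location */

    #pragma omp parallel for
    for (int i = 1; i <= n; i++) {
        /* The statement below is performed as one indivisible
           read-modify-write of total, so no update is lost. */
        #pragma omp atomic
        total += i;
    }
    return total;
}
```

Without the atomic directive, two threads could read the same old value of total and one of the two additions would be overwritten.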
Although OpenMP outlines requirements for constructs and provides guidelines for parallel programming, details for implementing the ATOMIC construct are not provided. One method for atomically updating a shared memory location is to acquire a lock on the memory location in order to limit modification of the shared memory location to the lock holder. Although this method ensures atomic updating of the memory location, the acquisition and release of the lock by the updating thread reduce performance. In addition, performance is reduced because other threads of the updating thread's team must wait to update the memory location until the lock on the memory location is released.
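The lock-based method can be sketched as follows, using the OpenMP lock routines (omp_init_lock, omp_set_lock, omp_unset_lock, omp_destroy_lock). The function name locked_sum and the serial fallback stubs are assumptions added so that the sketch also builds without OpenMP support. This is one possible arrangement, not a prescribed implementation of the ATOMIC construct.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback stubs (assumption): let the sketch build without OpenMP. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

/* Same summation as a lock-protected update; locked_sum is a hypothetical name. */
long locked_sum(int n)
{
    long total = 0;                 /* shared memory location */
    omp_lock_t lock;                /* guards every update of total */
    omp_init_lock(&lock);

    #pragma omp parallel for
    for (int i = 1; i <= n; i++) {
        omp_set_lock(&lock);        /* acquire: other threads wait here */
        total += i;                 /* only the lock holder modifies total */
        omp_unset_lock(&lock);      /* release: one waiting thread may proceed */
    }

    omp_destroy_lock(&lock);
    return total;
}
```

Every iteration pays for one acquire and one release, and all other threads of the team are held at omp_set_lock while one thread updates total, which is the performance cost described above.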
Another method to atomically update a memory location, one that achieves optimal performance, is to create platform-specific low-level instructions to perform the update. Although vendors can optimize their systems with such low-level instructions, the cost of producing them grows combinatorially. To support such an implementation, vendors would have to create a number of low-level instructions proportional to the product of the number of data types, the number of sizes of data types, and the number of operations to be supported by a single platform. Hence, the cost of creating low-level instructions to support atomic updates for numerous operations is prohibitive even on a single platform. This prohibitive cost makes a multiple-platform implementation of atomic update operations, based solely on platform-specific low-level instructions, infeasible.
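The growth described above can be made concrete with a small counting sketch. The lists of data-type kinds, sizes, and operations below are hypothetical and chosen only for illustration; a real platform would support different sets, and not every combination is meaningful, which is why the text says "proportional to" the product.

```c
#include <stddef.h>

/* Hypothetical platform: all three lists are illustrative assumptions. */
static const char *const kinds[] = { "integer", "floating-point" };
static const int sizes_in_bytes[] = { 1, 2, 4, 8 };
static const char *const ops[] = { "+", "-", "*", "/", "&", "|", "^", "<<", ">>" };

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Count one dedicated low-level routine per (kind, size, operation) triple. */
size_t routines_needed(void)
{
    size_t n = 0;
    for (size_t k = 0; k < COUNT(kinds); k++)
        for (size_t s = 0; s < COUNT(sizes_in_bytes); s++)
            for (size_t o = 0; o < COUNT(ops); o++)
                n++;                /* one routine for this combination */
    return n;
}
```

With these assumed lists the count is 2 x 4 x 9 = 72 routines for one platform, and the same effort would be repeated for each additional platform.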