Multi-core systems and computing farms are becoming increasingly common. This is due, in part, to the increased complexity of single processor systems. Many vendors of desktop computing systems have already shipped multi-core versions of their products, and in the future there will be dozens of processing cores on a single chip. Computing farms are also becoming more common because of the improved cost-benefit ratio of using many low-cost, low-performance commodity systems instead of a small number of high-cost, high-performance systems.
Computationally intensive applications, such as many electronic design automation (EDA) applications, that cannot take advantage of parallel execution are at a significant disadvantage in the marketplace. As vendors produce concurrent execution versions of their products to take advantage of parallel execution, there is increasing market pressure for competitors to follow.
The classic example of concurrency is moving funds between two or more bank accounts. If proper care is not taken, then the concurrency could result in data inconsistencies, conflicts, and other problems. For example, a “race condition” can result in incorrect balances when several transactions modify the same bank accounts simultaneously.
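The race condition described above can be made concrete with a small sketch. It does not use real threads; it replays one unlucky interleaving by hand so that the result is reproducible. The function name and the amounts are hypothetical and chosen only for illustration.

```python
def lost_update_interleaving(initial_balance, deposit_a, deposit_b):
    """Replay one bad interleaving of two unsynchronized deposits."""
    balance = initial_balance

    # Both transactions read the balance before either one writes it back.
    read_by_a = balance
    read_by_b = balance

    # Transaction A writes its result.
    balance = read_by_a + deposit_a

    # Transaction B overwrites A's result using its stale read.
    balance = read_by_b + deposit_b

    return balance


def correct_total(initial_balance, deposit_a, deposit_b):
    """The balance that a correctly serialized execution would produce."""
    return initial_balance + deposit_a + deposit_b
```

With an initial balance of 100 and deposits of 50 and 25, the replayed interleaving leaves 125 in the account, while any serialized execution leaves 175: the first deposit has been lost.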
There are many examples of EDA applications that may be used in conjunction with concurrent algorithms but may also face potential race conditions. For printed circuit boards (PCBs), packages, and integrated circuits (ICs), an autorouter is typically used to find an initial wiring solution that involves many connections. Once one solution has been found, there are many algorithms to improve the solution, e.g., via reduction, crosstalk reduction, via doubling, track centering, etc., which involve modifying many objects. These improvements may be approached in a concurrent manner.
There are two standard approaches that can be taken to prevent race conditions. The first approach is by locking. The second approach is by partitioning.
Most techniques for preventing race conditions use a locking mechanism to determine the order of execution. When an application needs to modify several data objects, it first acquires the exclusive right to that data. This operation is typically called a “lock”. Locks are commonly implemented with, or referred to as, a “mutex” (mutual exclusion lock), “semaphore”, or “monitor”. The simplest concurrency algorithm is to acquire locks on all accessed objects prior to reading/modifying the objects and then release those locks after reading/modifying the objects. In the classic example cited above of moving funds between two banking accounts, the algorithm would acquire locks on all affected accounts prior to modifying any of them. All other transactions accessing any of those accounts will be “blocked” until the earlier transaction completes and releases the locks on those accounts.
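A minimal sketch of the acquire-all-locks approach for the banking example follows, using Python's standard `threading.Lock`. The `Account` class, the `transfer` function, and the demo are hypothetical. Acquiring the locks in a fixed order by account identifier is an added assumption, not something stated above; it is one common way to keep two opposing transfers from deadlocking.

```python
import threading


class Account:
    """Hypothetical account guarded by its own lock."""

    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def transfer(source, target, amount):
    """Move funds after locking every affected account."""
    if source is target:
        raise ValueError("source and target must be different accounts")

    # Assumption: lock in ascending account_id order to avoid deadlock.
    first, second = sorted((source, target), key=lambda a: a.account_id)
    with first.lock, second.lock:
        if source.balance < amount:
            return False  # insufficient funds, nothing is modified
        source.balance -= amount
        target.balance += amount
        return True


def run_demo():
    """Run 100 transfers in each direction on two accounts."""
    a = Account(1, 1000)
    b = Account(2, 1000)
    threads = [threading.Thread(target=transfer, args=(a, b, 1)) for _ in range(100)]
    threads += [threading.Thread(target=transfer, args=(b, a, 1)) for _ in range(100)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.balance, b.balance
```

Because every transfer holds both locks while it reads and writes, the opposing transfers in `run_demo` cancel out exactly and both balances end where they started.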
The other common approach to concurrency is partitioning. With this approach, the entire data set is divided into two or more partitions. Execution of the algorithm then proceeds in parallel on each partition. Finally, the results from execution on each partition are merged back into a single solution. For example, in a typical IC autorouter, the entire design is partitioned into regions, sometimes called “cells”. A global router first finds a global solution ignoring the details within regions. Then, several copies of a detail router simultaneously find detailed solutions for each region. These regions can be safely executed in parallel since they have no shared data between them.
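The partition, execute, and merge steps can be sketched as follows, using `concurrent.futures.ThreadPoolExecutor` from the Python standard library. The `detail_route` function is only a stand-in for a real detail router, and the region and net names are hypothetical. The sketch assumes that the regions share no data, as described above.

```python
from concurrent.futures import ThreadPoolExecutor


def detail_route(region):
    """Stand-in for a detail router: handles one region in isolation."""
    region_id, nets = region
    # A real router would compute wiring; here each net is just tagged
    # with the region that processed it.
    return [(region_id, net) for net in sorted(nets)]


def route_all(regions, max_workers=4):
    """Process every region in parallel, then merge the results."""
    # Fix the region order up front so the merge does not depend on
    # which worker happens to finish first.
    ordered = sorted(regions.items())
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, not completion order.
        per_region = list(pool.map(detail_route, ordered))

    merged = []
    for result in per_region:
        merged.extend(result)
    return merged
```

For example, `route_all({"r2": ["n3"], "r1": ["n2", "n1"]})` returns the nets of region `r1` followed by those of region `r2`, regardless of how the workers are scheduled.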
Locking typically produces non-deterministic results. That is, running the same application on the same data might compute different but equally valid solutions.
In the banking example, if two transactions attempt to withdraw money from the same account, the second transaction might find insufficient funds and reject the transaction. While this operation is safe from a database integrity point-of-view, it means that two different but equally valid outcomes are possible: (1) transaction A succeeds and transaction B fails, or (2) transaction B succeeds and transaction A fails. When thousands of transactions are processed in unpredictable order, it is impractical to predetermine all possible outcomes.
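The two outcomes can be reproduced with a short sketch that applies the same pair of withdrawals in both possible orders. The function name, the transaction labels, and the amounts are hypothetical. The sketch runs sequentially on purpose: it shows what each lock acquisition order would produce, rather than relying on actual thread timing.

```python
def apply_withdrawals(balance, withdrawals):
    """Apply withdrawals in the given order, rejecting any overdraft."""
    outcomes = {}
    for name, amount in withdrawals:
        if amount <= balance:
            balance -= amount
            outcomes[name] = "succeeded"
        else:
            outcomes[name] = "rejected"
    return balance, outcomes


# The same two transactions, in the two orders that locking may produce.
a_first = apply_withdrawals(100, [("A", 80), ("B", 70)])
b_first = apply_withdrawals(100, [("B", 70), ("A", 80)])
```

Here `a_first` ends with a balance of 20 and transaction B rejected, while `b_first` ends with a balance of 30 and transaction A rejected. Both are consistent, but a regression test that expects one of them will fail whenever the other occurs.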
In chaotic systems where small changes in early computations can produce large changes in later computations, non-deterministic results make it impossible to test the system using known good results, e.g., “golden” regression data. It is difficult to develop and debug a system that cannot be tested using known good results.
When partitioning is used, the problem of non-deterministic results is eliminated. Since each operation affects only its own partition, it can be guaranteed that the results of the entire process are the same regardless of the order in which the partitions are reassembled. However, this mechanism can only be used when an isolated partitioning is possible. A partitioning is isolated if and only if each transaction affects only one partition. When the partitions overlap, no such guarantee is possible. In the crosstalk reduction example, if changes in one region affect crosstalk in adjacent regions, it is not possible to define an isolated partitioning.
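The isolation condition can be written as a simple check. In this sketch, `partition_of` is a hypothetical mapping from each object to its partition, and each transaction is represented as the set of objects it affects. Both representations are assumptions made for illustration.

```python
def is_isolated(partition_of, transactions):
    """Return True if every transaction affects at most one partition."""
    for affected_objects in transactions:
        touched = {partition_of[obj] for obj in affected_objects}
        if len(touched) > 1:
            return False  # this transaction spans several partitions
    return True


# Hypothetical layout: tracks t1 and t2 lie in region 0, t3 in region 1.
layout = {"t1": 0, "t2": 0, "t3": 1}

# Each change stays inside one region, so the partitioning is isolated.
local_changes = [{"t1", "t2"}, {"t3"}]

# Moving t2 also changes crosstalk on t3 in the adjacent region.
crosstalk_changes = [{"t2", "t3"}]
```

With this layout, `is_isolated(layout, local_changes)` is true, while `is_isolated(layout, crosstalk_changes)` is false because a single change affects objects in two regions.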