For years, various software applications have included multiple discrete operations that are executed at the same time. Often these operations need to access and manipulate common data at a single memory and/or disk location. If software is not carefully designed, then multiple operations may try to write to a single data location at the same time. This may be referred to as a race condition. Race conditions often result in corrupt data and can cause software applications to generate incorrect results.
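By way of illustration only, one common form of race condition, a lost update, can be sketched as follows. The interleaving of two operations is written out by hand so that the outcome is reproducible; in a real application the ordering would depend on timing. The names `shared`, `read` and `write`, and the amounts used, are hypothetical and are not taken from any particular application.

```python
# Hypothetical shared data location holding a single value.
shared = {"balance": 100}

def read(store):
    """Read the current value from the shared location."""
    return store["balance"]

def write(store, value):
    """Overwrite the shared location with a new value."""
    store["balance"] = value

# Operation A intends to add 10 and operation B intends to add 20.
# The steps are interleaved by hand to model unlucky timing.
a_seen = read(shared)        # A reads 100
b_seen = read(shared)        # B reads 100 before A has written
write(shared, a_seen + 10)   # A writes 110
write(shared, b_seen + 20)   # B writes 120, overwriting A's update

lost_update_result = shared["balance"]  # 120 rather than the intended 130
```

Because B computes its result from a value read before A wrote, A's update is silently discarded and the stored result is incorrect.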
Mechanisms called locks have been implemented to prevent race conditions. According to common locking schemes, an operation requests a “lock” before accessing a data location. If the data location is available, then the lock is granted and the operation is cleared to access the data location. If another operation is accessing the data location (i.e., another operation holds the lock), then the lock request may be denied. The requesting operation may then either terminate or wait until the lock becomes available.
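The request, grant and denial behavior described above can be sketched, under stated assumptions, using the `threading.Lock` class of the Python standard library. The names `location_lock`, `data_location`, `try_access` and `wait_and_access` are hypothetical; the sketch shows one possible locking scheme and is not a definitive implementation.

```python
import threading

location_lock = threading.Lock()  # hypothetical lock guarding one data location
data_location = []                # stand-in for the shared data location

def try_access(value):
    """Request the lock without waiting; give up if the request is denied."""
    if not location_lock.acquire(blocking=False):
        return False              # denied: another operation holds the lock
    try:
        data_location.append(value)
        return True
    finally:
        location_lock.release()

def wait_and_access(value):
    """Request the lock and wait until it becomes available."""
    with location_lock:
        data_location.append(value)

# The lock is free, so the request is granted.
granted_when_free = try_access("first")

# Simulate another operation holding the lock: the request is denied.
location_lock.acquire()
denied_when_held = try_access("second")
location_lock.release()

# The waiting variant proceeds once the lock is available again.
wait_and_access("third")
```

Here `try_access` corresponds to an operation that terminates when its request is denied, and `wait_and_access` corresponds to an operation that waits for the lock.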
Although properly implemented locking schemes may prevent many race conditions, they can create problems of their own. For example, an operation A and an operation B may both need to perform tasks that require access to two data locations, X and Y, at the same time. If A holds the lock for X and B holds the lock for Y, then neither operation may be able to perform its task. In that case, A and B may each wait indefinitely for both locks to become available, causing the software application to stop or hang. This problem, called deadlock, is commonly avoided by using a lock ranking or lock hierarchy. According to a lock ranking, each concurrently executed operation is required to request locks in a particular order. For example, both A and B could be required to request the lock for X before requesting the lock for Y. Accordingly, the situation where both operations hold one, but not both, of the locks can be avoided.
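A lock ranking of the kind described above can be sketched as follows, assuming each lock carries a numeric rank and that locks are always requested in ascending rank. The names `RankedLock`, `acquire_in_rank_order` and `release_all`, and the iteration count, are hypothetical and chosen for illustration only.

```python
import threading

class RankedLock:
    """A lock paired with a hypothetical numeric rank."""
    def __init__(self, name, rank):
        self.name = name
        self.rank = rank
        self.lock = threading.Lock()

def acquire_in_rank_order(*locks):
    """Acquire the given locks in ascending rank, whatever order was passed."""
    ordered = sorted(locks, key=lambda ranked: ranked.rank)
    for ranked in ordered:
        ranked.lock.acquire()
    return ordered

def release_all(ordered):
    """Release locks in the reverse of the order in which they were acquired."""
    for ranked in reversed(ordered):
        ranked.lock.release()

lock_x = RankedLock("X", 1)  # X must be requested before Y
lock_y = RankedLock("Y", 2)
totals = {"X": 0, "Y": 0}    # stand-ins for data locations X and Y

def operation(first, second):
    """Repeatedly perform a task that needs both X and Y at the same time."""
    for _ in range(1000):
        held = acquire_in_rank_order(first, second)
        try:
            totals["X"] += 1
            totals["Y"] += 1
        finally:
            release_all(held)

# B names the locks in the opposite order, but the ranking normalizes it.
a = threading.Thread(target=operation, args=(lock_x, lock_y))
b = threading.Thread(target=operation, args=(lock_y, lock_x))
a.start()
b.start()
a.join()
b.join()
```

Because both operations request the lock for X before the lock for Y, neither can hold Y while waiting for X, so the circular wait that produces deadlock does not arise in this sketch.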
As with all programming methods, specific implementations of locks and lock ranking systems often include bugs. These bugs can be particularly difficult to debug because their symptoms, race and deadlock conditions, are not deterministic and cannot be easily reproduced. For example, a program having a race- or deadlock-related defect may run flawlessly four times in a row, and then crash on the fifth execution. Adding to the difficulty of finding and correcting race and deadlock problems is the fact that they are highly dependent on execution timing. For example, latent race- or deadlock-related problems in an application developed and tested on a first system type may not manifest themselves until the application is run on a faster system.