A data race is a type of problem that may occur in multi-threaded programs, or among multiple programs accessing the same data, and that may lead to anomalous behavior of the program(s). Data races may occur where a shared variable can be accessed by various threads/programs simultaneously. Threads/programs “race” to access a shared variable and, depending upon which access occurs first, program results may vary unpredictably. Conventional solutions to this problem attempt to detect data races before they occur. This is partly because data races are unpredictable and thus extremely difficult to reproduce during the debugging process. Indeed, any anomalous behavior caused by a data race depends on the precise timing of separate threads/programs accessing the same memory location and may thus disappear if that timing is altered during the debugging process.
Conventional solutions for data race detection monitor lock acquisition and memory accesses, computing an access pattern for each memory location and memory access. These solutions then evaluate the access pattern to memory locations to detect suspicious access patterns that may indicate a potential data race. An access pattern is “suspicious” if a memory location is shared among multiple threads without a common lock that may be used by individual threads/programs to govern access to the memory locations. Locks may be used to prevent data races from occurring where suspicious activity is detected.
A lock is a software construct that enables at most one thread/program to access a shared variable at a certain point in time. A locking discipline (i.e., a way of using a lock) may require that the lock for a shared variable be acquired before the shared variable is accessed. Once a thread/program has completed its access to the shared variable, the lock is released. Because locks are “acquired and released” in this manner, only one thread can access a particular shared variable at any given time. Locks and locking disciplines typically follow an access pattern.
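By way of illustration only, the locking discipline described above may be sketched as follows. The sketch assumes Python's standard `threading.Lock`; the names `SharedCounter` and `run_workers` are hypothetical and are introduced solely for this example.

```python
import threading


class SharedCounter:
    """A shared variable guarded by one lock (hypothetical example class)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()  # the lock associated with _value

    def increment(self):
        # Locking discipline: acquire the lock before touching the shared
        # variable, and release it once the access is complete.
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value


def run_workers(num_threads, increments_per_thread):
    """Start several threads that all update the same counter."""
    counter = SharedCounter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value()
```

Because every access to the shared value occurs while the same lock is held, no increment is lost regardless of how the threads are scheduled.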
FIG. 1 illustrates a conventional access pattern state diagram. Here, a series of states 102–112 and “superstates” 114–116 are described to illustrate conventional techniques for detecting potential data races. “Exclusive” describes those states where only one thread/program may access a variable at any given time. “Shared” refers to variables that may be accessed simultaneously by multiple threads/programs, unless one of the threads/programs is performing a write operation, which indicates a suspicious pattern (i.e., a potential data race). Each of states 102–112 represents a particular state of an item during an access. Each item is initially in a “virgin” state 102, then moves to an exclusive first state 104 when a thread in a multi-threaded program (or a program) first accesses the item. When a second thread/program accesses the item (previously accessed by the first thread/program), the item moves to an exclusive second state 106. The separation of exclusive superstate 114 into an exclusive first state 104 and an exclusive second state 106 prevents generation of a false alarm. Without this separation, if a program is designed to allow a first thread/program to initialize an object and then hand it over to a second thread/program without ever performing any simultaneous shared access, a false alarm indicating a potential data race might be generated.
When a different thread accesses an item in exclusive second state 106, the item moves to shared superstate 116. If the access is a read operation (“read”), then the item enters shared read state 108. In the event that the access is a write operation (“write”), the item enters shared modify state 110. This is an example of a “first shared” access. Subsequent accesses are also referred to as “shared” accesses. Also, if the shared access is a write and the item is in shared read state 108, the item moves to shared modify state 110. Entering a shared state (e.g., shared read state 108 or shared modify state 110) also initiates computation of a set of locks (“lockset”) that are common to shared accesses to an item. The first lockset is set to the set of locks held by the accessing thread when the first shared access occurs. On every subsequent shared access, the item's lockset is reduced to the intersection of its lockset and the set of locks held by the accessing thread.
An access pattern's lockset can only decrease over time as subsequent accesses occur. A shared modify access pattern with an empty lockset indicates a suspicious pattern. When a suspicious access pattern is first detected, conventional implementations generate a warning of a potential data race (e.g., by entering warning state 112). Typically, when such a warning is generated, the stack of the thread associated with the suspicious pattern is dumped, enabling a user to diagnose, from the dumped copy of the thread's stack, whether a potential data race exists while still permitting the program to run.
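By way of illustration only, the state transitions and lockset refinement described with reference to FIG. 1 may be sketched as follows. This is a minimal sketch and not a definitive implementation: the class `AccessPattern`, its method names, and the treatment of warning state 112 as a one-time flag are assumptions made for this example. The sketch also assumes that any thread other than the current exclusive owner triggers the next transition.

```python
from enum import Enum, auto


class State(Enum):
    VIRGIN = auto()            # state 102
    EXCLUSIVE_FIRST = auto()   # state 104
    EXCLUSIVE_SECOND = auto()  # state 106
    SHARED_READ = auto()       # state 108
    SHARED_MODIFY = auto()     # state 110


class AccessPattern:
    """Access pattern of a single item (hypothetical helper class)."""

    def __init__(self):
        self.state = State.VIRGIN
        self.owner = None     # thread id while in an exclusive state
        self.lockset = None   # common locks while in a shared state
        self.warned = False   # assumed stand-in for warning state 112

    def access(self, thread_id, is_write, held_locks):
        """Record one access; return True when a warning is first raised."""
        held = frozenset(held_locks)
        if self.state is State.VIRGIN:
            self.state, self.owner = State.EXCLUSIVE_FIRST, thread_id
        elif self.state is State.EXCLUSIVE_FIRST:
            if thread_id != self.owner:
                # Hand-over to a second thread: still exclusive, no lockset.
                self.state, self.owner = State.EXCLUSIVE_SECOND, thread_id
        elif self.state is State.EXCLUSIVE_SECOND:
            if thread_id != self.owner:
                # First shared access: the lockset starts as the held locks.
                self.owner = None
                self.lockset = held
                self.state = (State.SHARED_MODIFY if is_write
                              else State.SHARED_READ)
        else:
            # Subsequent shared access: intersect with the held locks.
            self.lockset = self.lockset & held
            if is_write:
                self.state = State.SHARED_MODIFY

        suspicious = (self.state is State.SHARED_MODIFY
                      and not self.lockset)
        if suspicious and not self.warned:
            self.warned = True
            return True
        return False
```

For example, an item initialized by thread 1 and handed to thread 2 raises no warning. If thread 3 then reads the item holding locks A and B, thread 2 writes it holding only A, and thread 3 writes it holding only B, the lockset shrinks from {A, B} to {A} to the empty set, and the final write returns `True`.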
FIG. 2 illustrates conventional encoding of access patterns. As an example, conventional techniques encode information relevant to access patterns using 32-bit words that include state information. In each state, except virgin state 102, information in addition to the state name must be stored. In an exclusive state (e.g., exclusive states 104–106), an identifier for a thread exercising exclusive access is stored. In a shared state (e.g., shared states 108–110) a set of common locks is stored. In order to store an access pattern in one word, typically a few bits (e.g., bits 202–210) are used to encode the state name. Fields 212–220 are used to store remaining bits for a thread identifier or an index in a table of locksets.
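By way of illustration only, packing an access pattern into a single 32-bit word may be sketched as follows. The exact positions and widths of bits 202–210 and fields 212–220 are not specified above, so the layout used here (three low-order state bits and a 29-bit payload) is an assumption, as are the names `encode`, `decode`, and `LocksetTable`.

```python
STATE_BITS = 3                       # assumed width of the state-name field
PAYLOAD_BITS = 32 - STATE_BITS       # remaining bits: thread id or lockset index
STATE_MASK = (1 << STATE_BITS) - 1
PAYLOAD_MAX = (1 << PAYLOAD_BITS) - 1

# Assumed numeric codes for the states of FIG. 1.
(VIRGIN, EXCLUSIVE_FIRST, EXCLUSIVE_SECOND,
 SHARED_READ, SHARED_MODIFY, WARNING) = range(6)


def encode(state, payload=0):
    """Pack a state code and its payload into one 32-bit word."""
    if not 0 <= state <= STATE_MASK:
        raise ValueError("state does not fit in the state field")
    if not 0 <= payload <= PAYLOAD_MAX:
        raise ValueError("payload does not fit in the remaining bits")
    return (payload << STATE_BITS) | state


def decode(word):
    """Return (state, payload) from a packed 32-bit word."""
    return word & STATE_MASK, word >> STATE_BITS


class LocksetTable:
    """Table of distinct locksets, addressed by index (hypothetical)."""

    def __init__(self):
        self._sets = []
        self._index = {}

    def intern(self, locks):
        """Return the table index for this lockset, adding it if new."""
        key = frozenset(locks)
        if key not in self._index:
            self._index[key] = len(self._sets)
            self._sets.append(key)
        return self._index[key]

    def lookup(self, index):
        return self._sets[index]
```

In an exclusive state the payload would hold a thread identifier, for example `encode(EXCLUSIVE_FIRST, thread_id)`. In a shared state it would hold an index obtained from `LocksetTable.intern`, so that identical locksets share one table entry.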
One problem with conventional solutions is that a program's memory requirements are significantly increased. For each memory location used by a program to store data, another memory location is required to store an access pattern. Other conventional solutions combine the accesses to all memory locations in an object and require only one additional memory location to store the access pattern for the entire object. However, these solutions work only if every memory location in the object (e.g., all elements in an array) follows the same locking discipline, which is often not the case. Another problem with conventional solutions is that, even if an object access pattern is stored, applying it to individual memory locations is a time-consuming and labor-intensive process that is typically performed only during the debugging process.
Thus, what are needed are systems and methods for refining the detection of potential data races without the limitations of conventional techniques.