The present application relates generally to an improved data processing apparatus and method, and more specifically to mechanisms that provide support for debugging multithreaded code.
Writing computer programs to run in a multitude of threads is a recognized method in the current state of the art for improving application performance. Unlike single-threaded applications, which execute instructions sequentially according to program order, multithreaded applications improve performance by running multiple threads simultaneously on various processing components of a system. Performance improves because more than one processor or hardware thread is typically running the multithreaded code, thereby helping the application complete its tasks in less time.
The development of multithreaded applications remains a difficult task, however, because the programmer often has to insert synchronization code to make the threads behave in a desired manner, so that the application computes the same result as it would when running as a sequential program. Such synchronization code can be difficult to write and maintain. Another difficulty in developing multithreaded application code is organizing the sharing of data among the threads. Without careful organization of how threads share data among themselves, the threads within an application may overwrite each other's changes to data items in memory, or may produce unpredictable results because reads and writes of the same data item are not ordered properly. This condition is usually called a “data race” or simply a “race condition.”
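By way of illustration only, the lost-update form of a data race can be sketched as follows. The sketch is not taken from the present application: the thread names, the step names, and the run_schedule helper are hypothetical, and each increment of a shared counter is assumed to consist of a separate read step and write step. The steps are replayed in a fixed order, rather than on real threads, so that the outcome is reproducible.

```python
# Hypothetical illustration: two threads each try to add 1 to a shared counter.
# Each increment is modeled as a "read" step followed by a "write" step, and the
# steps are replayed in a fixed order instead of running on real threads.

def run_schedule(schedule):
    """Replay (thread, step) pairs against a shared counter; return its final value."""
    shared = {"counter": 0}
    local = {}  # per-thread private copy of the value that thread last read
    for thread, step in schedule:
        if step == "read":
            local[thread] = shared["counter"]
        elif step == "write":
            shared["counter"] = local[thread] + 1
        else:
            raise ValueError(f"unknown step: {step}")
    return shared["counter"]

# Properly ordered: thread B reads only after thread A has written.
ordered = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]

# Improperly ordered: both threads read the old value before either writes.
racy = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]

ordered_result = run_schedule(ordered)  # both increments are kept
racy_result = run_schedule(racy)        # thread A's update is overwritten by B
print(ordered_result, racy_result)
```

In the second ordering, both threads compute their new value from the same stale read, so one of the two increments is lost even though each thread ran its own steps in program order.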
Many synchronization primitives have been invented to aid programmers in developing multithreaded applications. For example, semaphores, locks, and monitors are generally recognized techniques to impose order on shared data access and to ensure that threads interact with one another in a predictable manner. When a correctly written parallel program uses these constructs, it will generally produce correct results and behave in a deterministic manner. However, even with these constructs and primitives, the task of developing multithreaded code is not a simple one. A programmer may leave an access to a shared data item unprotected by failing to introduce the proper synchronization code. Such unprotected accesses are called demonic accesses, and are very difficult to track at runtime.
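As a minimal sketch of one such primitive, the following example protects a shared counter with a lock from the Python standard library. The SharedCounter class, the run_workers function, and the thread and iteration counts are hypothetical names and values chosen for illustration, not part of the present application.

```python
import threading

class SharedCounter:
    """A counter whose read-modify-write update is guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may execute the body of this block, so
        # the read and the write of self.value cannot interleave with another
        # thread's update.
        with self._lock:
            self.value += 1

def run_workers(num_threads=4, increments_per_thread=10_000):
    """Start worker threads that all update one counter; return the final value."""
    counter = SharedCounter()

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before reading the result
    return counter.value

total = run_workers()
print(total)
```

Because every access to the shared value goes through the lock, the final count equals the number of threads multiplied by the number of increments per thread. Omitting the lock around a single access would reintroduce the kind of unprotected access described above.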
Since no application code can be realistically assumed to be correct upon implementation, a debugging and testing phase usually follows code development. During this phase, the application runs a test suite (usually called regression testing) and the results are examined to see if the application can be released. If the results show errors in the application code, the code is debugged using techniques such as tracing the errors back to their origins until the source of each error has been identified and corrected. Such tracing, already difficult in sequential debugging, is even more difficult to use in multithreaded code because the application code is often not deterministic. For example, if there is a demonic access of shared data, a run of an application may have different possible schedules for the demonic access, and some of these schedules may not produce an error at all. Thus, repeating the execution of the application to find bugs is not a viable approach in debugging multithreaded code.
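The dependence of the outcome on the schedule can be illustrated with a small enumeration. The following sketch is a simplified model under stated assumptions, not a mechanism of the present application: two hypothetical threads, A and B, each perform an unprotected increment consisting of one read step and one write step, and every interleaving that preserves each thread's program order is replayed in turn.

```python
from itertools import combinations

STEPS = ("read", "write")  # the two steps of one unprotected increment

def all_schedules():
    """Yield every interleaving of threads A and B that keeps each thread's own order."""
    slots = range(2 * len(STEPS))
    # Choose which positions in the schedule belong to thread A; B gets the rest.
    for a_slots in combinations(slots, len(STEPS)):
        a_steps, b_steps = iter(STEPS), iter(STEPS)
        schedule = []
        for i in slots:
            if i in a_slots:
                schedule.append(("A", next(a_steps)))
            else:
                schedule.append(("B", next(b_steps)))
        yield schedule

def final_value(schedule):
    """Replay one schedule against a shared counter that starts at zero."""
    shared = 0
    local = {}  # value each thread last read
    for thread, step in schedule:
        if step == "read":
            local[thread] = shared
        else:
            shared = local[thread] + 1
    return shared

outcomes = [final_value(s) for s in all_schedules()]
correct = outcomes.count(2)      # schedules in which both increments survive
lost_update = outcomes.count(1)  # schedules in which one increment is lost
print(len(outcomes), correct, lost_update)
```

In this model some schedules yield the expected result and others do not, so a test run that happens to follow one of the error-free schedules reveals nothing about the unprotected access.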
To exacerbate the problem, there is a dearth of tools that can help in debugging multithreaded applications. Unlike sequential code, where the programmer can use tools to observe the behavior of the code as it runs through the different phases of a program, a parallel program may not execute in the same manner every time. Thus, there will be situations where a bug manifests itself only some of the time, or worse yet, only rarely, making it difficult to uncover. Furthermore, many of the conventional techniques for sequential debugging may perturb the timing of a parallel program so as to mask bugs while the debugging session is in progress, only for the bugs to reappear later when the debugging tools have been disengaged.