Writing computer code generally involves a debugging stage before the code can be successfully executed on hardware. Software debuggers may be utilized to assist in identifying portions of the code that cause catastrophic failures and/or generate incorrect results. When debugging sequential code, execution is suspended once the code reaches a breakpoint (also referred to as a breakpoint instruction). At this point, a software engineer may examine various information regarding the execution of the code, including the contents of memory, register files, or other variables or states. The software engineer generally expects the examined information to correctly reflect the execution state of the code just before the breakpoint. The software engineer may then utilize this information to determine what changes are to be made to the code to address any existing issues.
When debugging computer code that runs in parallel (e.g., multithreaded application programs) on multiple processing elements (e.g., processor cores), however, specialized hardware may have to be utilized. Examples of the specialized hardware include an in-circuit emulator (ICE) and a Joint Test Action Group (JTAG) port. Utilization of such hardware increases the manufacturing costs of processors because additional circuitry is included on each processing element. The additional circuitry may also reduce the footprint available to include other functionality on the processor. Furthermore, a software engineer may need to be knowledgeable about both software debuggers and the specialized hardware to effectively debug the parallel code. Finally, the software engineer needs access to debugging hardware beyond the processor, adding to the cost of software development.
Additionally, current breakpoint support for multithreaded application debugging falls short in one of two ways. Either it does not stop the other threads (i.e., threads other than the breakpoint thread that is executing the breakpoint instruction), especially if those other threads are running on different processing elements than the breakpoint thread, or it uses underlying inter-thread communication (ITC) to propagate the breakpoint event from the breakpoint thread to the threads running on other processing elements. Since the ITC mechanism is implemented primarily in software, the breakpoint event propagation may incur a relatively long delay compared to the thread execution. Thus, by the time a non-breakpoint thread is notified of the breakpoint event, it may already have been context-switched multiple times, and hence its thread states may be quite different from what they were when the breakpoint was reached.
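The effect of software-based breakpoint propagation can be illustrated with a minimal sketch. It is not taken from any particular debugger: the function name run_demo, the step counter standing in for "thread state", and the sleep used to model ITC latency are all assumptions made for illustration only.

```python
import threading
import time


def run_demo(propagation_delay_s=0.05):
    """Return (worker state when the breakpoint was hit,
               worker state when the worker saw the notification).

    Hypothetical model: the main thread plays the breakpoint thread,
    and a threading.Event plays the software ITC message.
    """
    stop_event = threading.Event()   # stands in for the ITC breakpoint message
    started = threading.Event()      # lets the main thread wait for the worker
    state = {"steps": 0}             # illustrative "thread state" of the worker
    observed = {}

    def worker():
        started.set()
        # The non-breakpoint thread keeps executing until it is notified.
        while not stop_event.is_set():
            state["steps"] += 1
        observed["at_notify"] = state["steps"]

    t = threading.Thread(target=worker)
    t.start()
    started.wait()
    time.sleep(0.01)                 # let the worker make some progress

    # The breakpoint thread reaches its breakpoint here.
    at_breakpoint = state["steps"]

    # Assumed software propagation latency before the worker is notified.
    time.sleep(propagation_delay_s)
    stop_event.set()
    t.join()
    return at_breakpoint, observed["at_notify"]


if __name__ == "__main__":
    at_bp, at_notify = run_demo()
    print("worker state at breakpoint:  ", at_bp)
    print("worker state at notification:", at_notify)
    print("steps executed in between:   ", at_notify - at_bp)
```

Under these assumptions, the state the worker reports on notification is typically well ahead of the state it had when the breakpoint was reached, and the gap tends to grow with the modeled propagation delay. The exact numbers vary from run to run and from machine to machine.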