1. Field of the Invention
The present invention relates to techniques for facilitating transactional program execution. More specifically, the present invention relates to a method and an apparatus for reporting reasons for failure during transactional program execution.
2. Related Art
Computer system designers are presently developing mechanisms to support multi-threading within the latest generation of Chip-Multiprocessors (CMPs) as well as more traditional Shared Memory Multiprocessors (SMPs). With proper hardware support, multi-threading can dramatically increase the performance of numerous applications. However, as microprocessor performance continues to increase, the time spent synchronizing between threads (processes) is becoming a large fraction of overall execution time. In fact, as multi-threaded applications begin to use even more threads, this synchronization overhead becomes the dominant factor in limiting application performance.
From a programmer's perspective, synchronization is generally accomplished through the use of locks. A lock is typically acquired before a thread enters a critical section of code, and is released after the thread exits the critical section. If another thread wants to enter a critical section protected by the same lock, it must acquire the same lock. If it is unable to acquire the lock, because a preceding thread has grabbed the lock, the thread must wait until the preceding thread releases the lock. (Note that a lock can be implemented in a number of ways, such as through atomic operations or semaphores.)
Unfortunately, the process of acquiring a lock and the process of releasing a lock are very time-consuming in modem microprocessors. They involve atomic operations, which typically flush the load buffer and store buffer, and can consequently require hundreds, if not thousands, of processor cycles to complete.
Moreover, as multi-threaded applications use more threads, more locks are required. For example, if multiple threads need to access a shared data structure, it is impractical for performance reasons to use a single lock for the entire data structure. Instead, it is preferable to use multiple fine-grained locks to lock small portions of the data structure. This allows multiple threads to operate on different portions of the data structure in parallel. However, it also requires a single thread to acquire and release multiple locks in order to access different portions of the data structure. It also introduces significant software engineering concerns, such as avoiding deadlock.
In some cases, locks are used when they are not required. For example, many applications make use of “thread-safe” library routines that use locks to ensure that they are “thread-safe” for multi-threaded applications. Unfortunately, the overhead involved in acquiring and releasing these locks is still incurred, even when the thread-safe library routines are called by a single-threaded application.
Applications typically use locks to ensure mutual exclusion within critical sections of code. However, in many cases threads will not interfere with each other, even if they are allowed to execute a critical section simultaneously. In these cases, mutual exclusion is used to prevent the unlikely case in which threads actually interfere with each other. Consequently, in these cases, the overhead involved in acquiring and releasing locks is largely wasted.
Computer designers have alleviated this problem by providing mechanisms to “transactionally execute” critical sections of code. If transactional execution of a critical section completes successfully without interference from another process, the system commits changes made during the transactional execution, and resumes normal execution of the program past the critical section. Otherwise, if transactional execution of the critical section fails because of interference from another process, the system discards changes made during the transactional execution. (For example, see U.S. Pat. No. 6,862,664, entitled “Method and Apparatus for Avoiding Locks by Speculatively Executing Critical Sections of Code” by inventors Shailender Chaudhry and Marc Tremblay and Quinn A. Jacobson.)
In a computer system that supports transactional execution, it is desirable to know why an attempted transaction failed. This allows the system to determine whether it is beneficial to re-attempt the transaction, or instead to acquire a lock on the critical section. However, existing transactional memory implementations provide no mechanism for recording this type of failure information.
Hence, what is needed is a method and an apparatus for reporting reasons for failure during transactional program execution.