In a shared address space architecture, all parts of memory are accessible to all threads, processes, and/or processors. As a result, shared address space programming paradigms concentrate on methods for expressing concurrency and synchronization of data accesses. A significant aspect of creating correctly threaded programs involves synchronizing data accesses between concurrent threads and/or processes.
Most compilers provide explicit support for critical sections using mutual exclusion locks (mutexes). A critical section is a section of program instructions that must be executed atomically (i.e., as a whole), such as a section that accesses a shared address space. A virtual mutex is a data structure within a program that is used to control access to shared data. Virtual mutexes are managed (e.g., locked, unlocked, etc.) using software instructions embedded within the program. During program execution, a thread must lock a virtual mutex before the thread can access shared data protected by that virtual mutex. If the virtual mutex is already locked by a different thread, the thread is blocked from accessing the shared data and must wait for the virtual mutex to be unlocked before it can lock the virtual mutex and enter the critical section. The virtual mutex must be unlocked when a thread leaves the critical section so that other threads can access the shared data. As a result, critical sections have a serialization effect on threads that access the same shared data.
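The lock, access, unlock sequence described above can be illustrated with a minimal sketch. Python's standard `threading.Lock` stands in for a virtual mutex here; the names `counter`, `counter_lock`, `worker`, and `run` are hypothetical and chosen only for illustration.

```python
import threading

counter_lock = threading.Lock()  # stands in for a virtual mutex (a software data structure)
counter = 0                      # shared data protected by counter_lock


def worker(iterations):
    """Increment the shared counter, entering the critical section once per iteration."""
    global counter
    for _ in range(iterations):
        # Lock the mutex; the thread blocks here while another thread holds it.
        with counter_lock:
            counter += 1  # critical section: access to the shared data
        # Leaving the 'with' block unlocks the mutex so other threads may enter.


def run(num_threads=4, iterations=10000):
    """Start num_threads workers, wait for them all, and return the final count."""
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(iterations,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Because every increment happens while the lock is held, the threads execute the critical section one at a time, which is the serialization effect noted above.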
Parallel multi-threaded architectures provide resources to implement physical mutexes. Physical mutexes function in the same manner as virtual mutexes, but they synchronize data accesses using hardware rather than a software data structure. For example, the Intel® Internet Exchange Architecture (IXA) family of network processors provides 15 signals for synchronizing data accesses within one microengine (a form of microprocessor in the Intel® IXA family of network processors) and/or between two different microengines. These 15 signals may be used as physical mutexes.
A compiler that generates programs for these parallel multi-threaded processors must allocate the physical mutexes to the virtual mutexes used in the program. Because the number of physical mutexes is limited and a program may use a larger number of virtual mutexes, the compiler may attempt to merge critical sections until the number of critical sections equals the number of physical mutexes. However, it is important to minimize the size of the merged critical sections, because larger critical sections serialize more of each thread's execution and can cause significant performance degradation.
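To make the merging problem concrete, the following is a minimal sketch of one possible greedy heuristic. It is not a method taken from this document. It assumes that each virtual mutex protects a single critical section with a known size (for instance, an instruction count) and that the size of a merged critical section is the sum of the sizes of its parts. The function name `merge_critical_sections` and its parameters are hypothetical.

```python
import heapq


def merge_critical_sections(section_sizes, num_physical):
    """Greedily merge critical sections until they fit the physical mutexes.

    section_sizes: dict mapping a virtual mutex name to the assumed size of
                   its critical section (e.g., an instruction count).
    num_physical:  number of physical mutexes available.
    Returns a sorted list of (merged_size, [virtual mutex names]) groups,
    at most one group per physical mutex.
    """
    if num_physical < 1:
        raise ValueError("at least one physical mutex is required")

    # Each heap entry is one (possibly merged) critical section.
    heap = [(size, [name]) for name, size in sorted(section_sizes.items())]
    heapq.heapify(heap)

    # Illustrative heuristic: always merge the two smallest sections, so that
    # the merged sections stay as small as this simple model allows.
    while len(heap) > num_physical:
        size_a, names_a = heapq.heappop(heap)
        size_b, names_b = heapq.heappop(heap)
        heapq.heappush(heap, (size_a + size_b, names_a + names_b))

    return sorted(heap)
```

For example, with four virtual mutexes and two physical mutexes, `merge_critical_sections({"a": 5, "b": 1, "c": 2, "d": 9}, 2)` groups `b`, `c`, and `a` into one merged section of size 8 and leaves `d` on its own. When the physical mutexes are at least as numerous as the virtual mutexes, nothing is merged. A real compiler would also need to account for factors this sketch ignores, such as how often each section executes and which sections contend with one another.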