Advances in semiconductor processing and logic design have permitted an increase in the amount of logic that may be present on integrated circuit devices. As a result, computer system configurations have evolved from a single or multiple integrated circuits in a system to multiple cores and multiple logical processors present on individual integrated circuits. A processor or integrated circuit typically comprises a single processor die, where the processor die may include any number of cores or logical processors.
The ever-increasing number of cores and logical processors on integrated circuits enables more software threads to be concurrently executed. However, the increase in the number of software threads that may be executed simultaneously has created problems with synchronizing data shared among the software threads. One common solution to accessing shared data in multiple core or multiple logical processor systems comprises the use of locks to guarantee mutual exclusion across multiple accesses to shared data. However, the ever-increasing ability to execute multiple software threads potentially results in false contention and a serialization of execution.
For example, consider a hash table holding shared data. With a lock system, a programmer may lock the entire hash table, allowing one thread to access the entire hash table. However, the throughput and performance of other threads are potentially adversely affected, as they are unable to access any entries in the hash table until the lock is released. Alternatively, each entry in the hash table may be locked. Either way, after extrapolating this simple example into a large scalable program, it is apparent that the complexity of lock contention, serialization, fine-grain synchronization, and deadlock avoidance becomes an extremely cumbersome burden for programmers.
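The two locking granularities described above can be sketched as follows. This is a minimal illustration, not an implementation taken from the source: the class names `CoarseLockedTable` and `PerBucketLockedTable` are hypothetical, and a lock per bucket is assumed as a stand-in for a lock per entry.

```python
import threading


class CoarseLockedTable:
    """One lock guards the whole table: simple, but every access serializes."""

    def __init__(self, n_buckets=16):
        self._lock = threading.Lock()
        self._buckets = [dict() for _ in range(n_buckets)]

    def _index(self, key):
        return hash(key) % len(self._buckets)

    def put(self, key, value):
        with self._lock:  # blocks all other threads, whatever key they want
            self._buckets[self._index(key)][key] = value

    def get(self, key, default=None):
        with self._lock:
            return self._buckets[self._index(key)].get(key, default)


class PerBucketLockedTable:
    """One lock per bucket (assumed stand-in for per-entry locking)."""

    def __init__(self, n_buckets=16):
        self._locks = [threading.Lock() for _ in range(n_buckets)]
        self._buckets = [dict() for _ in range(n_buckets)]

    def _index(self, key):
        return hash(key) % len(self._buckets)

    def put(self, key, value):
        i = self._index(key)
        with self._locks[i]:  # only threads hitting bucket i contend here
            self._buckets[i][key] = value

    def get(self, key, default=None):
        i = self._index(key)
        with self._locks[i]:
            return self._buckets[i].get(key, default)
```

The finer-grained variant reduces contention, but any operation that must touch several buckets at once would need to acquire several locks in a consistent order, which is where the deadlock-avoidance burden mentioned above comes from.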
Another, more recent data synchronization technique is the use of transactional memory (TM). Often transactional execution includes executing a grouping of a plurality of micro-operations, operations, or instructions. In the hash table example above, two threads may both execute within the hash table while their memory accesses are monitored/tracked. If both threads access/alter the same entry, conflict resolution may be performed to ensure data validity. One type of transactional execution includes a Software Transactional Memory (STM), where tracking of memory accesses, conflict resolution, abort tasks, and other transactional tasks are performed in software.
In strongly atomic transactional memory systems, to ensure runtime conflicts between transactional memory operations and non-transactional memory operations do not occur, compilers treat each non-transactional memory operation as a single-operation transaction. In other words, transactional barriers are inserted at non-transactional memory accesses to isolate transactions from these non-transactional memory accesses. Here, the potential incorrect execution due to conflicts between transactional and non-transactional accesses is avoided; yet, execution of transactional barriers at every non-transactional memory operation potentially wastes execution cycles.
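The trade-off described above can be illustrated with a toy, single-threaded model. Everything here is an assumption made for illustration: the class `ToyTM`, its method names, and in particular the isolation policy, under which a barriered read of a location written by an in-flight transaction returns the last committed value. A real system might instead stall the reader or abort the transaction.

```python
class ToyTM:
    """Toy in-place-update TM with one in-flight transaction (hypothetical)."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.undo_log = {}          # addr -> value before the transaction wrote it
        self.barriers_executed = 0  # counts barrier overhead on non-tx reads

    def tx_write(self, addr, value):
        # Record the old value once, then update memory in place.
        self.undo_log.setdefault(addr, self.memory[addr])
        self.memory[addr] = value

    def tx_commit(self):
        self.undo_log.clear()

    def tx_abort(self):
        # Roll back every in-place write.
        for addr, old in self.undo_log.items():
            self.memory[addr] = old
        self.undo_log.clear()

    def plain_read(self, addr):
        # No barrier: cheap, but may observe uncommitted data.
        return self.memory[addr]

    def barriered_read(self, addr):
        # Barrier on a non-transactional read: always costs a check,
        # even when no transaction has touched addr.
        self.barriers_executed += 1
        if addr in self.undo_log:
            return self.undo_log[addr]  # assumed policy: last committed value
        return self.memory[addr]


tm = ToyTM({"data": 0})
tm.tx_write("data", 1)               # transaction still in flight
dirty = tm.plain_read("data")        # sees the uncommitted write
isolated = tm.barriered_read("data") # isolated from the uncommitted write
tm.tx_abort()
after_abort = tm.barriered_read("data")  # barrier runs although nothing conflicts
```

In this sketch the barrier counter grows on every non-transactional read, including the second one where no transaction is active, which mirrors the wasted-cycles concern.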
In contrast, in weakly atomic transactional memory systems, only transactional accesses are isolated from each other. In such systems, non-transactional memory accesses are not tracked and, thus, do not incur any additional transactional overhead. However, weakly atomic systems do not provide general isolation and ordering guarantees for programs that mix transactional and non-transactional accesses to the same data, which may potentially lead, in some cases, to incorrect execution. As an example, Pseudo Code A is included below to illustrate potential problems created without isolation between accesses.
Pseudo Code A: Privatization example.

Initially: item != NULL, item->data = 0

    Thread 1                     Thread 2
    atomic {                     atomic {
      p = item;                    if (item != NULL)
      item = NULL;                   item->data = 1;
    }                            }
    r1 = p->data; // R1
    r2 = p->data; // R2

Can r1 != r2?
In an in-place-update STM, Thread 2 potentially reads item and sets data to 1 before Thread 1 sets item to NULL. Yet, the conflict may not be detected, and the transaction in Thread 2 not aborted, until after read R1. As a result, read R1 can see a dirty value of data (r1==1), while read R2 sees the correct value (r2==0) after the transaction in Thread 2 aborts and its write is rolled back. Similarly, in a write-buffering STM, the transaction in Thread 2 might validate before the write to item in Thread 1 but copy the data from its write buffer into main memory after Thread 1 executes read R1. As a result, only read R2 would see the new value (r1==0 and r2==1).
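The two interleavings just described can be replayed deterministically. The sketch below uses no real threads or STM library; it simply executes the steps of Pseudo Code A in the problematic order. The names `Item`, `undo_log`, `write_buffer`, and the two function names are hypothetical, and `None` stands in for NULL.

```python
class Item:
    def __init__(self):
        self.data = 0


def in_place_update_schedule():
    shared = {"item": Item()}

    # Thread 2, inside its transaction: item != NULL, so write data in place,
    # remembering the old value in an undo log.
    t2_item = shared["item"]
    undo_log = [(t2_item, t2_item.data)]
    t2_item.data = 1

    # Thread 1, inside its transaction: privatize item, then commit.
    p = shared["item"]
    shared["item"] = None

    r1 = p.data  # R1: non-transactional read sees Thread 2's dirty write

    # Thread 2 now detects the conflict on item and aborts, rolling back.
    for obj, old in undo_log:
        obj.data = old

    r2 = p.data  # R2: sees the restored value
    return r1, r2


def write_buffering_schedule():
    shared = {"item": Item()}

    # Thread 2, inside its transaction: buffer the write instead of applying it.
    t2_item = shared["item"]
    write_buffer = [(t2_item, 1)]
    # Thread 2 validates its read of item before Thread 1 changes it.
    validated = shared["item"] is t2_item

    # Thread 1, inside its transaction: privatize item, then commit.
    p = shared["item"]
    shared["item"] = None

    r1 = p.data  # R1: buffered write has not reached memory yet

    # Thread 2 copies its write buffer to memory after validating earlier.
    if validated:
        for obj, value in write_buffer:
            obj.data = value

    r2 = p.data  # R2: sees the late write-back
    return r1, r2


print(in_place_update_schedule())   # (1, 0)
print(write_buffering_schedule())   # (0, 1)
```

In both schedules r1 != r2, even though Thread 1 reads p->data only after its own transaction has made the item private.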
Other examples of unsafe behaviors include overwriting a private write with a transactional write and inconsistent transactional execution due to interference from a private write. Usually, weakly atomic systems are augmented with additional mechanisms that enforce proper ordering between transactions and non-transactional memory accesses to ensure these unsafe behaviors do not occur. However, these mechanisms are often not efficiently applied, which potentially creates extra execution overhead.