Advances in semiconductor processing and logic design have permitted an increase in the amount of logic that may be present on integrated circuit devices. As a result, computer system configurations have evolved from a single or multiple integrated circuits in a system to multiple cores and multiple logical processors present on individual integrated circuits. An integrated circuit typically comprises a single processor die, where the processor die may include any number of cores or logical processors.
As an example, a single integrated circuit may have one or multiple cores. The term "core" usually refers to the ability of logic on an integrated circuit to maintain an independent architecture state, where each independent architecture state is associated with at least some dedicated execution resources. As another example, a single integrated circuit or a single core may have multiple logical processors for executing multiple software threads; such a device is also referred to as a multi-threading integrated circuit or a multi-threading core. Multiple logical processors usually share common data caches, instruction caches, execution units, branch predictors, control logic, bus interfaces, and other processor resources, while maintaining a unique architecture state for each logical processor.
The ever-increasing number of cores and logical processors on integrated circuits enables more software threads to be executed. However, the increase in the number of software threads that may be executed simultaneously has created problems with synchronizing data shared among the software threads. One common solution to accessing shared data in multiple core or multiple logical processor systems comprises the use of locks to guarantee mutual exclusion across multiple accesses to shared data. However, as more software threads execute simultaneously, the use of locks potentially results in false contention and a serialization of execution.
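The false contention described above can be illustrated with a minimal software sketch. The `CoarseTable` class, its single `lock`, and the `run_demo` helper are hypothetical names introduced here for illustration only and do not describe any particular hardware. Two threads update different slots, so they never actually conflict, yet the single lock forces their updates to execute one at a time.

```python
import threading


class CoarseTable:
    """Hypothetical shared table protected by one coarse-grained lock."""

    def __init__(self, size):
        self.slots = [0] * size
        self.lock = threading.Lock()  # a single lock guards every slot

    def add(self, index, amount):
        # Every update acquires the same lock, even when threads touch
        # different slots and therefore never truly conflict.
        with self.lock:
            self.slots[index] += amount


def worker(table, index, iterations):
    for _ in range(iterations):
        table.add(index, 1)


def run_demo(iterations=1000):
    """Run two threads, each updating its own slot, and return the slots."""
    table = CoarseTable(2)
    threads = [
        threading.Thread(target=worker, args=(table, i, iterations))
        for i in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table.slots
```

The result is correct because the lock guarantees mutual exclusion, but the two threads are serialized on `lock` although their accesses are to disjoint data.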
Another solution is to use transactional execution to access shared memory when executing instructions and operating on data. Often, transactional execution includes speculatively executing a grouping of a plurality of micro-operations, operations, or instructions. During speculative execution of a transaction by a processor, core, or thread, the memory locations read from and written to are tracked to determine whether another processor, core, or thread accesses those locations. If another thread alters those locations, the transaction is restarted and re-executed from the beginning. Currently, the values of memory locations to be changed in a transaction are saved elsewhere, so that if the transaction needs to be re-executed, the original state of all memory/registers may be restored.
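The tracking and restore behavior described above can be sketched in software as follows. This is a simplified, single-threaded model under stated assumptions, not a description of any actual hardware mechanism: the `Transaction` class, its `read_set`, `write_set`, and `undo_log` fields, and the `demo` helper are hypothetical names, and the remote write by another thread is only simulated by a conflict check.

```python
class Transaction:
    """Hypothetical software model of one speculative transaction."""

    def __init__(self, memory):
        self.memory = memory    # shared memory modeled as a dict
        self.read_set = set()   # locations read during the transaction
        self.write_set = set()  # locations written during the transaction
        self.undo_log = {}      # location -> value before first write

    def read(self, addr):
        self.read_set.add(addr)
        return self.memory[addr]

    def write(self, addr, value):
        # Save the original value once, so it can be restored on abort.
        if addr not in self.undo_log:
            self.undo_log[addr] = self.memory[addr]
        self.write_set.add(addr)
        self.memory[addr] = value

    def conflicts_with_write(self, addr):
        # A write by another thread invalidates the transaction if it
        # hits any location the transaction has read or written.
        return addr in self.read_set or addr in self.write_set

    def abort(self):
        # Restore every modified location to its saved original value.
        for addr, old_value in self.undo_log.items():
            self.memory[addr] = old_value
        self.read_set.clear()
        self.write_set.clear()
        self.undo_log.clear()


def demo():
    memory = {"a": 1, "b": 2}
    tx = Transaction(memory)
    tx.write("b", tx.read("a") + 10)  # speculative update of "b"
    speculative = dict(memory)
    # Simulated: another thread alters "a", which the transaction read.
    conflict = tx.conflicts_with_write("a")
    if conflict:
        tx.abort()  # roll back; the transaction would then be re-executed
    return speculative, conflict, dict(memory)
```

In this sketch, `demo()` observes the speculative value of `b`, detects the conflict on `a`, and leaves memory in its original state after the abort.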
However, as transactional execution has progressed, software programmers have begun to use nested transactions, i.e., a grouping of instructions/operations to be executed within, and as part of, another outer/enclosing group of instructions/operations. As a consequence, current hardware support for nested transactions has resulted in inefficient execution of those transactions.
For example, assume an outer transaction and an inner transaction nested within the outer transaction are to be executed. Current hardware support typically saves values of memory locations to be changed before entering the outer transaction. Yet, when executing in the inner transaction, if an abort or invalidating event occurs, the state of the memory locations is usually rolled back to that original state, requiring re-execution of both the outer and inner transactions. The inefficiency in this simple example is magnified where more transactions are nested within each other. Specifically, if an abort occurs within a nested transaction deep in a hierarchy of transactions, numerous nested transactions that were not associated with the abort would have to be re-executed unnecessarily.
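The cost of rolling back to the state saved before the outer transaction can be made concrete with a small sketch. The function `run_flattened`, the locations `x` and `y`, and the single injected abort are hypothetical and serve only to count how often each transaction body executes under the behavior described above.

```python
def run_flattened(memory, fail_inner_once=True):
    """Hypothetical model with one checkpoint taken before the outer transaction.

    Returns how many times the outer and inner bodies were executed.
    """
    counts = {"outer": 0, "inner": 0}
    pending_abort = fail_inner_once
    while True:
        checkpoint = dict(memory)  # state saved before entering the outer transaction

        counts["outer"] += 1
        memory["x"] += 1           # work done by the outer transaction

        counts["inner"] += 1
        memory["y"] += 1           # work done by the inner transaction

        if pending_abort:
            # Abort raised inside the inner transaction: the only saved
            # state is the pre-outer checkpoint, so everything is undone
            # and both transactions are executed again.
            pending_abort = False
            memory.clear()
            memory.update(checkpoint)
            continue
        return counts
```

With a single abort in the inner transaction, the outer body also runs twice, even though the abort was not associated with the outer transaction's own work.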