Advances in semiconductor processing and logic design have permitted an increase in the amount of logic that may be present on integrated circuit devices. As a result, computer system configurations have evolved from a single or multiple integrated circuits in a system to multiple cores, multiple hardware threads, and multiple logical processors present on individual integrated circuits. A processor or integrated circuit typically comprises a single physical processor die, where the processor die may include any number of cores, hardware threads, or logical processors. The ever-increasing number of processing elements (cores, hardware threads, and logical processors) on integrated circuits enables more tasks to be accomplished in parallel.
As the parallel nature of processors has increased, so has the use of speculation, that is, early execution of code whose results may not be needed. In conjunction with speculative execution, more parallel execution has resulted in difficulties regarding shared data synchronization. One recent data synchronization technique includes the use of transactional memory (TM). Often, transactional execution includes speculatively executing a group of operations or instructions, where tentative results are not made globally visible until a commit point of the transaction. As an example, multiple threads may execute within a data structure while their memory accesses are monitored or tracked. If the threads access or alter the same entry, conflict resolution may be performed to ensure data validity. As part of the conflict resolution, a transaction may be aborted, that is, returned to the state prior to starting execution of the transaction.
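The transactional behavior described above (tentative results held back until a commit point, tracked accesses, and an abort on conflict) can be illustrated with a minimal software sketch. This is a toy model only, not the hardware mechanism: the names `SharedMemory`, `Transaction`, and `Conflict` are hypothetical, and conflict detection by per-address version numbers is an assumption chosen for brevity.

```python
class Conflict(Exception):
    """Raised when a transaction's tracked reads are stale at commit time."""


class SharedMemory:
    """Toy shared memory: each address maps to a (value, version) pair."""

    def __init__(self):
        self.cells = {}

    def load(self, addr):
        # Unwritten addresses read as value 0 at version 0.
        return self.cells.get(addr, (0, 0))


class Transaction:
    """Hypothetical software model of one speculative transaction."""

    def __init__(self, memory):
        self.memory = memory
        self.read_versions = {}  # addr -> version seen on first read (tracking)
        self.write_buffer = {}   # addr -> tentative value, not globally visible

    def read(self, addr):
        if addr in self.write_buffer:  # a transaction sees its own writes
            return self.write_buffer[addr]
        value, version = self.memory.load(addr)
        self.read_versions.setdefault(addr, version)
        return value

    def write(self, addr, value):
        self.write_buffer[addr] = value  # buffered until the commit point

    def abort(self):
        # Discard all tentative state; shared memory was never modified.
        self.read_versions.clear()
        self.write_buffer.clear()

    def commit(self):
        # Conflict detection: every address that was read must be unchanged.
        stale = [addr for addr, seen in self.read_versions.items()
                 if self.memory.load(addr)[1] != seen]
        if stale:
            self.abort()
            raise Conflict(stale)
        # Commit point: tentative results become globally visible.
        for addr, value in self.write_buffer.items():
            _, version = self.memory.load(addr)
            self.memory.cells[addr] = (value, version + 1)
        self.read_versions.clear()
        self.write_buffer.clear()
```

If two such transactions each read and then increment the same address, the first to commit succeeds and the second raises `Conflict` and is returned to its pre-transaction state, which mirrors the conflict resolution described above.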
Very similar to transactional execution is a type of execution referred to as Hardware Lock Elision (HLE). Instead of speculatively executing a critical section demarcated by begin and end transaction instructions, HLE speculatively executes a critical section that is demarcated by lock and lock release instructions. Essentially, HLE elides (omits) the lock and lock release instructions and treats the critical section like a transaction. Although separate hardware may be utilized for critical section detection and prediction, similar transactional memory structures are often utilized to support HLE.
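The idea of eliding a lock can be sketched in software as follows. This is an illustrative, single-threaded model under stated assumptions, not a description of any particular hardware: the class `ElidingRegion`, the `interfere` hook (used only to simulate a conflicting access from another thread), and the version counter used for conflict detection are all hypothetical.

```python
import threading


class ElidingRegion:
    """Toy model of lock elision around a shared dictionary."""

    def __init__(self):
        self.lock = threading.Lock()  # the lock named by the program
        self.data = {}
        self.version = 0              # bumped on every published update
        self.elided = 0               # critical sections that skipped the lock
        self.fallbacks = 0            # critical sections re-run under the lock

    def run(self, critical_section, interfere=None):
        # Speculative attempt: the lock is NOT acquired; work on a private copy.
        start_version = self.version
        scratch = dict(self.data)
        critical_section(scratch)
        if interfere is not None:
            interfere(self)           # simulated conflicting access (assumption)
        if not self.lock.locked() and self.version == start_version:
            # No conflict observed: commit the speculative result.
            self.data = scratch
            self.version += 1
            self.elided += 1
            return "elided"
        # Conflict: discard the copy and re-execute with the lock really held.
        with self.lock:
            critical_section(self.data)
            self.version += 1
            self.fallbacks += 1
        return "locked"


def conflicting_writer(region):
    """Hypothetical stand-in for another thread updating the shared data."""
    region.data["hits"] = 100
    region.version += 1
```

In this sketch an uncontended critical section completes without ever taking the lock, while a contended one falls back to conventional locking, so the result is the same as if the lock had always been acquired.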
In fact, a Hardware Transactional Memory (HTM) system usually includes hardware structures to support access tracking, conflict resolution, and other transactional tasks. As described above, speculative execution, whether utilized for transactional execution, HLE, or traditional speculation, typically results in either a commit (using the speculative results or making them globally visible) or an abort (discarding the results or returning the hardware to its state before speculation began). Yet current hardware to support commits and aborts of speculative critical sections potentially incurs significant penalties (a large number of cycles for transitioning from speculative to non-speculative states). For example, to recover pre-speculation states in a legacy cache memory, a legacy read port is utilized to read out speculative values, the values are operated on outside the cache array, and the modified values are then written back through a legacy write port. In an illustrative example (during an abort of a critical section), it may take up to 512 cycles to recover the proper, pre-speculation memory state.
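The cost of such a read, modify, and write-back recovery can be illustrated with a rough model. The sketch below rests on assumptions that are not taken from any specific design: each cache line is assumed to carry hypothetical `spec_read` and `spec_written` state bits, every line is assumed to be visited one at a time, and the per-access cycle costs are placeholder parameters.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """Hypothetical per-line state held in the cache array."""
    valid: bool = True
    spec_read: bool = False     # line was read speculatively
    spec_written: bool = False  # line holds tentative (speculative) data


def abort_walk(lines, read_cycles=1, write_cycles=1):
    """Restore pre-speculation state line by line; return the cycles spent.

    The cycle costs are illustrative parameters, not measured values.
    """
    cycles = 0
    for i, line in enumerate(lines):
        # Step 1: read the line's state out through the read port.
        state = Line(line.valid, line.spec_read, line.spec_written)
        cycles += read_cycles
        # Step 2: operate on the state outside the array.
        if state.spec_written:
            state.valid = False  # tentative data must not survive an abort
        state.spec_read = False
        state.spec_written = False
        # Step 3: write the modified state back through the write port.
        lines[i] = state
        cycles += write_cycles
    return cycles
```

Under these assumptions the recovery time grows linearly with the number of lines visited, at `read_cycles + write_cycles` per line, which is why a serialized walk of this kind can become a significant penalty.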