The present invention relates to a hardware mechanism for performing thread-level speculative parallel execution.
In real-time transaction applications, response time is one of the most important indicators to customers. Response times depend largely on the single-thread performance of the processor. In recent years, however, the growth rate of single-thread performance has been slowing down.
Thread-level speculative parallel execution is one well-known response to this slowdown. It speeds up the execution of single-thread programs by allowing a compiler or programmer to speculatively parallelize a single-thread program. However, this typically requires a complicated hardware mechanism.
Hardware transaction memory is one technique used to speed up execution. In hardware transaction memory, a transaction is a sequence of instructions between special instructions such as transaction begin and transaction end. When a data access conflict, such as a read-after-write, write-after-read, or write-after-write conflict, occurs between two transactions being executed in parallel, the hardware cancels the execution of one of the transactions.
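The conflict check described above can be illustrated with a minimal sketch. It assumes a simplified software model in which each transaction is represented by the sets of addresses it has read and written; the names `Transaction`, `find_conflicts`, and `must_abort` are hypothetical and are not taken from any actual hardware interface.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Hypothetical model of the memory footprint of one transaction."""
    name: str
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


def find_conflicts(first: Transaction, second: Transaction) -> dict:
    """Classify overlapping addresses, assuming `first` accessed them before `second`."""
    return {
        "read-after-write": first.write_set & second.read_set,
        "write-after-read": first.read_set & second.write_set,
        "write-after-write": first.write_set & second.write_set,
    }


def must_abort(first: Transaction, second: Transaction) -> bool:
    """True when any conflict class is non-empty, so one transaction is cancelled."""
    return any(find_conflicts(first, second).values())


writer = Transaction("writer", read_set={0x10}, write_set={0x20})
reader = Transaction("reader", read_set={0x20})
print(must_abort(writer, reader))  # the reader touches an address the writer wrote
```

In this model, two transactions that only read the same address do not conflict, because neither read set overlaps with a write set.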
However, thread-level speculative parallel execution cannot be performed by the hardware transaction memory alone. Thread-level speculative parallel execution requires that transactions be completed in order, and the runtime that controls the completion order itself causes transaction conflicts.
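One way to see why ordering control can itself cause conflicts is sketched below. The sketch assumes, purely for illustration, that the runtime enforces completion order through a shared variable that every transaction reads to check its turn and writes to pass the turn on; the name `ORDER_VAR` and the functions `footprint`, `conflict`, and `demo` are hypothetical.

```python
ORDER_VAR = "next_to_commit"  # hypothetical shared commit-order variable


def footprint(reads, writes, ordered):
    """Return (read_set, write_set) for the transaction of one speculative thread."""
    read_set, write_set = set(reads), set(writes)
    if ordered:
        read_set.add(ORDER_VAR)   # the thread checks whether it is its turn
        write_set.add(ORDER_VAR)  # the thread hands the turn to its successor
    return read_set, write_set


def conflict(a, b):
    """True when the two footprints overlap on at least one written address."""
    (reads_a, writes_a), (reads_b, writes_b) = a, b
    return bool(writes_a & reads_b or reads_a & writes_b or writes_a & writes_b)


def demo():
    """Compare two loop iterations that touch disjoint data, without and with ordering."""
    unordered = conflict(footprint({"a[0]"}, {"b[0]"}, False),
                         footprint({"a[1]"}, {"b[1]"}, False))
    ordered = conflict(footprint({"a[0]"}, {"b[0]"}, True),
                       footprint({"a[1]"}, {"b[1]"}, True))
    return unordered, ordered


print(demo())
```

Under these assumptions, the two iterations are conflict-free on their own data, yet they conflict as soon as both access the shared ordering variable inside their transactions.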
The following prior art technologies are known to be related to this.
Laid-open Patent Publication No. 2009-521767 describes software transaction memory (STM) access processing which is executed when the preceding hardware transaction memory (HTM) access processing fails.
Laid-open Patent Publication No. 2010-532053 describes the use of transaction memory hardware to facilitate the updating of a dispatch table in a multi-thread environment utilizing an atomic commit function. Here, an emulator uses a dispatch table stored in the main memory to convert a guest program counter to a host program counter.
PCT Publication No. WO2010/001736 describes a multi-processor system including a plurality of processors for executing multiple threads in the processing of data, and a data processing control unit for determining the conditions that must be satisfied for the processors to execute the threads in order, and for starting the execution of each thread so as to satisfy these conditions.
U.S. Pat. No. 8,151,252 describes the speculative parallelization of a program using transactional memory by scoping program variables during compilation, and by inserting code into the program during compilation. In this technique, the scoping is determined based on whether a scalar variable being scoped is involved in inter-loop non-reduction data dependencies, whether the scalar variable is used outside the loop defining it, and at what point in a loop the scalar variable is defined.
An architecture based on thread-level speculation is presented in Jeffrey Thomas Oplinger, "Enhancing Software Reliability with Speculative Threads", Graduate Studies of Stanford University, August 2004. A programmer can use this architecture to add monitoring code for checking the execution of a program. The architecture reduces the resulting slowdown by executing the monitoring code speculatively in parallel with the main computations. In order to recover from an error, the programmer can define transactions with fine granularity. Side effects of these transactions are committed or aborted via program control. These transactions are implemented efficiently via thread-level hardware support.
A hybrid conflict management mechanism is presented in Ruben Titos, Manuel E. Acacio, Jose M. Garcia, "Speculation-Based Conflict Resolution in Hardware Transactional Memory", IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2009), 23-29 May 2009. In hardware transaction memory, this hybrid conflict management mechanism uses a mechanism with an eager policy as the base, but combines the advantages of an eager policy with those of a lazy policy to allow many conflict-prone transactions to coexist.