To meet the growing demand for application performance, the number of computation resources (processors or processor cores) in parallel architectures is constantly increasing. This raises both the need for, and the difficulty of, programming such architectures effectively so as to make the best use of the available computation power.
When a parallel application runs, two or more tasks may have to exchange data, referred to as shared data. To guarantee the coherence of the system for as long as the application is running, access to the memory system has to be protected. To that end, the programmer declares regions in the code of the application, called critical sections, which guarantee exclusive access to the memory system for any task that obtains the right to enter them.
The shared memory model, which is predominant today, relies on lock-based synchronization primitives (or any variant comparable to a lock) to protect access to the shared data within the critical sections. However, these primitives scale poorly. This limitation increases the complexity of programming parallel applications and requires a significant investment of time to achieve an acceptable level of performance. Furthermore, placing a lock at the start of a critical section guarantees only exclusive access to that section, not to the shared data themselves. Consequently, the use of a lock does not guarantee effective protection of the data shared between the tasks, but only exclusive access to the sequence of instructions which uses them. The responsibility for delimiting the critical sections is left to the programmer, which is a significant source of errors.
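By way of illustration only, the distinction between protecting a sequence of instructions and protecting the data can be sketched as follows. The names `balance`, `balance_lock`, `withdraw` and `buggy_reset` are hypothetical and do not come from any cited document.

```python
import threading

balance = {"value": 100}          # shared data (hypothetical example)
balance_lock = threading.Lock()   # lock intended to protect `balance`

def withdraw(amount):
    """Critical section: the lock serializes this sequence of instructions."""
    with balance_lock:
        if balance["value"] >= amount:
            balance["value"] -= amount
            return True
        return False

def buggy_reset():
    """Accesses the same data without taking the lock; nothing prevents it."""
    balance["value"] = 0

# Even while the lock is held, the unprotected path still modifies the data.
with balance_lock:
    before = balance["value"]
    buggy_reset()
    after = balance["value"]
```

The lock constrains only the code paths that choose to acquire it; the data remain reachable from any path the programmer forgot to delimit.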
In order to best exploit the computation power present in the massively parallel modern architectures, a more promising approach involves the use of transactional memories. A transactional memory transforms each access to the memory system into a “transaction” which has the following properties: atomicity, coherence, isolation and durability (hence the acronym “A.C.I.D.”).
“Atomicity” means that the simultaneous execution of several transactions must give the same result as the successive execution thereof.
“Coherence” means that a transaction must bring the system from one valid state to another valid state.
“Isolation” means that the updates of shared data used by the transaction are propagated to the rest of the system only once the transaction is finished and therefore validated.
“Durability” means that the updates, once propagated, can no longer be canceled.
The concept of transactional memory was introduced in 1993 through the paper by M. Herlihy and J. E. B. Moss, “Transactional Memory: Architectural Support for Lock-Free Data Structures”, 20th Annual International Symposium on Computer Architecture, pages 289-300. This paper discloses in particular a hardware device for implementing a transactional memory, based on an associative cache memory. A notable drawback of this solution is that it is a blocking solution.
The paper by Nir Shavit and Dan Touitou, “Software Transactional Memory”, Proceedings of the 14th ACM Symposium on Principles of Distributed Computing, pages 204-213, proposed a purely software, non-blocking implementation of a transactional memory.
These transactional memories known from the prior art are of the “speculative” type. That is, a transaction is initiated on the assumption that it will not lead to a conflict of access to the shared memory; if such a conflict is detected during execution, the transaction is canceled without leaving any trace (so as to observe the property of isolation). In a speculative transactional memory, the means necessary to guarantee the coherence of the system are very costly in terms of memory footprint (the memory space necessary to back up the valid state of the system before starting the transaction), of the management of rollbacks in the event of incorrect speculation, etc. These means are therefore ill-suited to fields such as embedded systems. More generally, they needlessly consume resources which could be allocated to computation tasks.
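The cost of speculation can be made concrete with a minimal sketch, given purely as an assumption-laden illustration. The function `run_speculatively` and its `conflict_detected` callback are hypothetical; real speculative systems typically use finer-grained logs or write buffers rather than a full copy of the state.

```python
import copy

def run_speculatively(store, transaction, conflict_detected):
    """Run `transaction` optimistically on `store` (a dict), undoing it on conflict."""
    # Memory footprint: the valid state is backed up before the transaction starts.
    backup = copy.deepcopy(store)
    transaction(store)            # executes on the assumption that no conflict occurs
    if conflict_detected():       # hypothetical hook standing in for conflict detection
        store.clear()
        store.update(backup)      # rollback: the transaction leaves no trace
        return False
    return True

# Usage: one transaction commits, a second one is rolled back.
store = {"x": 1, "y": 2}

def add_ten(state):
    state["x"] += 10

committed = run_speculatively(store, add_ten, lambda: False)
aborted = run_speculatively(store, add_ten, lambda: True)
```

Both the backup and the work discarded on rollback are resources spent on something other than useful computation.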
The software implementation of non-speculative transactional memories, in which any conflicts are detected before a transaction starts executing, is also known. Non-speculative transactional memories make it possible to reduce the memory footprint and to increase the energy efficiency of the system compared with the speculative approach. However, because of the absence of speculation, all the data accessed by a transaction must be reserved in one go (that is to say, atomically) before the first operation of the transaction is executed. This guarantees the absence of deadlocks in the reservation, which in turn guarantees the “A.C.I.D.” properties. Furthermore, in the general case, a transaction can access an arbitrary number of data, which presupposes the possibility of atomically reserving an arbitrary and variable number of data. Yet the atomic reservation of several data in an intrinsically parallel system, such as a multiprocessor computation architecture, is a non-trivial problem. The reservation comprises two major parts: the declaration of the data set to be reserved, and the detection of conflicts between this set and any other data set already reserved by one or more other transactions. A naïve and simplistic implementation would consist in using a global lock which stops the execution of the entire system for the duration of the reservation; this creates a total order between the reservations of a same datum and prevents the occurrence of a deadlock. However, this solution would lead to an unacceptable degradation of performance; in particular, it takes no account of the case where two transactions reserve two entirely separate data sets.
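The naïve global-lock reservation described above can be sketched as follows. This is an illustrative assumption, not an implementation taken from any cited document; the class `GlobalLockReservations` and its methods are hypothetical names.

```python
import threading

class GlobalLockReservations:
    """All-or-nothing reservation of address sets behind a single global lock."""

    def __init__(self):
        self._global_lock = threading.Lock()
        self._owner = {}  # address -> id of the transaction holding it

    def reserve(self, tx_id, addresses):
        wanted = set(addresses)          # declaration of the data set
        with self._global_lock:          # every reservation is serialized here
            # Conflict detection against all sets already reserved.
            if any(addr in self._owner for addr in wanted):
                return False             # nothing is reserved on failure
            for addr in wanted:
                self._owner[addr] = tx_id
            return True

    def release(self, tx_id):
        with self._global_lock:
            held = [a for a, owner in self._owner.items() if owner == tx_id]
            for addr in held:
                del self._owner[addr]

# Usage: T2 overlaps T1 and is refused as a whole; T3 is disjoint and accepted,
# yet it still had to wait for the same global lock.
table = GlobalLockReservations()
r1 = table.reserve("T1", {0x10, 0x20})
r2 = table.reserve("T2", {0x20, 0x30})
r3 = table.reserve("T3", {0x40})
```

The reservation is atomic and deadlock-free, but transactions working on entirely separate data sets contend for the same lock, which is the performance problem noted above.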
Document U.S. Pat. No. 5,742,785 describes a multiple data reservation mechanism via dedicated hardware registers, associated with each computation unit. The mechanism makes it possible to check that data are reserved and then to write them atomically in memory. Nevertheless, the reservation of the data is not, in itself, atomic. Indeed, for a given starting data set, the reservation of certain data can fail, in which case a validity flag linked to each non-reserved datum is updated. Without atomic reservation of a data set, the necessary and sufficient conditions for implementing a non-speculative transactional memory are not satisfied.
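As a loose software analogy only, and not a description of the register-level mechanism of the cited patent, a non-atomic reservation with per-datum validity flags might behave as in the sketch below. The function `reserve_each` and the `owner` table are hypothetical.

```python
def reserve_each(owner, tx_id, addresses):
    """Try to reserve each address independently and return one flag per address."""
    valid = {}
    for addr in addresses:
        if addr in owner and owner[addr] != tx_id:
            valid[addr] = False       # this datum could not be reserved
        else:
            owner[addr] = tx_id       # this datum is reserved on its own
            valid[addr] = True
    return valid

# Usage: address 0x20 is already held by T1, so T2 ends up with a partial set.
owner = {0x20: "T1"}
flags = reserve_each(owner, "T2", [0x10, 0x20, 0x30])
```

After the call, T2 holds some of the data it asked for but not all of them, which is precisely the situation an atomic reservation is meant to exclude.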
Document WO 2013/147898 discloses a multi-core processor comprising a hardware tracing device for recording interactions between threads having access to a shared memory. This hardware tracing device uses, for each processor, two non-counting Bloom filters for separately storing the read accesses and the write accesses to the shared memory of a set of accesses. These filters are intended to identify memory access conflicts upon reception of coherence messages from the other cores. A conflict is characterized by the addresses affected by the coherence message belonging to the two Bloom filters.
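To make the role of the two filters concrete, the following sketch shows a conventional (non-counting) Bloom filter and a per-core tracker holding one filter for reads and one for writes. All names, the filter size, the hash construction and the conflict rule (a remote write conflicts with a local read or write, a remote read conflicts with a local write) are assumptions chosen for illustration; they are not taken from the cited document.

```python
import hashlib

class BloomFilter:
    """Non-counting Bloom filter: elements can be added but never removed."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit vector stored in a Python integer

    def _positions(self, address):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{address}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, address):
        for pos in self._positions(address):
            self.bits |= 1 << pos

    def might_contain(self, address):
        # False positives are possible, false negatives are not.
        return all((self.bits >> pos) & 1 for pos in self._positions(address))

class AccessTracker:
    """Per-core read and write filters, queried when a coherence message arrives."""

    def __init__(self):
        self.reads = BloomFilter()
        self.writes = BloomFilter()

    def on_remote_access(self, address, is_write):
        if is_write:
            return (self.reads.might_contain(address)
                    or self.writes.might_contain(address))
        return self.writes.might_contain(address)

# Usage: the local core has only read address 0x1000.
tracker = AccessTracker()
tracker.reads.add(0x1000)
write_conflict = tracker.on_remote_access(0x1000, is_write=True)
read_conflict = tracker.on_remote_access(0x1000, is_write=False)
```

Because the filter is non-counting, an individual address cannot be withdrawn from it; the whole filter has to be cleared.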
Document US 2009/0183159 discloses a computer-implemented method for managing concurrent transactions using software Bloom filters. Since conflicts are detected by comparing the transactions pairwise, this approach risks leading to considerable slowdowns if the number of transactions is high.
Document US 2009/0133032 discloses a data processing method and apparatus using a plurality of processors and implementing a transactional memory. By taking into account access conflicts detected in the past, it becomes possible, through a kind of learning, to minimize the risk of collision between future transactions.
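The scaling concern can be illustrated with a short sketch in which each transaction is summarized by a Bloom filter bit vector, represented here as an integer, and two transactions are deemed to conflict when their vectors share a set bit. The function `find_conflicts` and the intersection test are assumptions made for illustration, not the method of the cited document.

```python
from itertools import combinations

def find_conflicts(signatures):
    """signatures: dict mapping a transaction id to its Bloom filter bit vector."""
    conflicts = []
    comparisons = 0
    for (tx_a, sig_a), (tx_b, sig_b) in combinations(signatures.items(), 2):
        comparisons += 1
        if sig_a & sig_b:             # a shared bit signals a possible conflict
            conflicts.append((tx_a, tx_b))
    return conflicts, comparisons

# Usage: four transactions require six pairwise comparisons.
signatures = {"T1": 0b0011, "T2": 0b0100, "T3": 0b0110, "T4": 0b1000}
conflicts, comparisons = find_conflicts(signatures)
```

For n transactions the number of comparisons is n(n-1)/2, so the detection cost grows quadratically with the number of concurrent transactions.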
The paper by Chi Cao Minh et al., “An Effective Hybrid Transactional Memory System with Strong Isolation Guarantees”, SIGARCH Computer Architecture News 35, 2 (June 2007), pp. 69-80, describes a mechanism which makes it possible to accelerate the search for conflicts in the context of speculative transactional memories. The acceleration is provided by hardware Bloom filters of the conventional (non-counting) type, in support of a software transactional system. This hardware accelerator does not make it possible to reserve multiple data atomically.