The present technique relates to an apparatus and method for performing atomic update operations. When processing circuitry issues an atomic update operation specifying a memory address, this will typically require the data at that memory address to be obtained, some computation to be performed using that obtained data, and then a data value to be written back to the specified memory address dependent on the outcome of that computation. This sequence of steps needs to be performed atomically so that the data is not accessed by another operation whilst the update operation is being performed.
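As a purely illustrative software analogy, and not the apparatus to which the present technique relates, the three steps of an atomic update operation can be sketched as follows. The names `AtomicCell`, `atomic_update` and `hammer` are hypothetical, and a `threading.Lock` is assumed here merely as a stand-in for whatever mechanism enforces atomicity in a real system.

```python
import threading


class AtomicCell:
    """Hypothetical stand-in for the data held at one memory address."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()  # assumption: a lock models atomicity

    def atomic_update(self, compute):
        """Read, compute, and write back as one indivisible step."""
        with self._lock:
            old = self._value    # 1. obtain the data at the address
            new = compute(old)   # 2. perform a computation using that data
            self._value = new    # 3. write the resulting value back
            return old           # returning the prior value is an illustrative choice

    def load(self):
        """Return the current value."""
        with self._lock:
            return self._value


def hammer(cell, threads=4, updates_per_thread=1000):
    """Run concurrent increments against the cell and return its final value."""
    def worker():
        for _ in range(updates_per_thread):
            cell.atomic_update(lambda v: v + 1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return cell.load()
```

Because each update holds the lock across all three steps, no other update can access the value part-way through, so `hammer(AtomicCell(0))` loses no increments. Without the lock, two threads could obtain the same old value and one of the updates could be lost.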
Many modern data processing systems include one or more levels of cache between the processing circuits and memory, in which cached copies of the data at certain memory addresses can be retained to improve the speed of access to that data by the associated processing circuitry. One or more levels of cache may be provided for the exclusive use of an associated processing circuit, such caches often being referred to as local caches, whilst other levels of cache may be shared between multiple processing circuits, such caches often being referred to as shared caches.
Considering the earlier-mentioned atomic update operations, when it is determined that the specified address relates to data that has been cached in a local cache, it may be possible for the atomic update operation to be performed using the local cache contents. In such a situation, the atomic update operation is referred to as a near atomic operation. However, before the near atomic operation can be performed, certain pending cache access operations may need to be completed, and this can have a performance impact on the handling of the atomic update operation. It would be desirable to provide a mechanism for alleviating this performance impact.
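The ordering issue described above can be illustrated with a toy software model. This is a sketch under stated assumptions rather than a description of any actual cache design: `LocalCacheModel`, `queue_write` and `near_atomic` are hypothetical names, pending cache access operations are modelled as queued writes, and the model simply completes all of them before the update, whereas in practice only certain pending operations may need to be completed.

```python
from collections import deque


class LocalCacheModel:
    """Toy model of a local cache; every name here is hypothetical."""

    def __init__(self):
        self.lines = {}         # address -> cached data value
        self.pending = deque()  # queued (address, value) writes not yet applied

    def queue_write(self, address, value):
        """Record a pending cache access operation (modelled as a write)."""
        self.pending.append((address, value))

    def near_atomic(self, address, compute):
        """Try to perform the atomic update using the local cache contents.

        Returns (old_value, drained), where `drained` counts the pending
        operations completed first, or None when the address is not cached
        locally (handling of that case is outside this sketch).
        """
        if address not in self.lines:
            return None
        drained = 0
        # Assumption: every pending operation is completed before the update.
        while self.pending:
            addr, value = self.pending.popleft()
            self.lines[addr] = value
            drained += 1
        old = self.lines[address]           # obtain the cached data
        self.lines[address] = compute(old)  # compute and write back locally
        return old, drained
```

In this model the `drained` count gives a rough indication of the work that delays the near atomic operation, which is the performance impact referred to above.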