A shared memory system includes a plurality of addresses that are accessed by multiple agents. For example, a shared memory system could include an L1/L2 cache and a common memory store. In software, there are many instances in which two or more threads share a block of data. At times, a thread must perform a set of operations on one or more addresses without interference from another thread (e.g., read a value, increment it, and store the new value). Typically, software uses locks to prevent other threads from accessing a set of addresses while one thread is accessing them. Each such lock creates a serialization point in the software. In many cases, however, the locking is applied at a coarser granularity, or more strictly, than is actually needed. It would be desirable to eliminate the need for software locks in such cases and thereby improve the efficiency of the shared memory system.
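By way of illustration only, the conventional lock-based approach described above may be sketched as follows. The sketch assumes a hypothetical `SharedBlock` class in which a single lock guards two independent values; the class, its fields, and the `worker` and `run` helpers are illustrative names and are not part of any described system. Each thread performs the read, increment, and store sequence on a different value, yet both threads serialize on the same lock, which is the coarse-grained case noted above.

```python
import threading


class SharedBlock:
    """Hypothetical block of shared data guarded by one coarse lock."""

    def __init__(self):
        # Two independent values stand in for two separate addresses.
        self.values = {"a": 0, "b": 0}
        # A single lock covers the whole block (assumed coarse granularity).
        self.lock = threading.Lock()

    def increment(self, key):
        # The read-modify-write sequence must not interleave with
        # another thread's update, so it runs while holding the lock.
        with self.lock:
            current = self.values[key]       # read a value
            self.values[key] = current + 1   # increment it and store it


def worker(block, key, count):
    for _ in range(count):
        block.increment(key)


def run(count=10000):
    block = SharedBlock()
    # The threads touch different keys but still contend for block.lock.
    threads = [
        threading.Thread(target=worker, args=(block, "a", count)),
        threading.Thread(target=worker, args=(block, "b", count)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return block.values
```

Calling `run()` returns the final values of both entries. Because the two threads never update the same entry, the shared lock is stricter than correctness requires in this sketch; it is retained only to show the serialization point that such locking introduces.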