Conventional computer systems may include multiple processors that operate in parallel to execute multiple different programs or execution threads. In a typical computer system, multiple programs may run on different processors that share memory, input/output (I/O), and other subsystems. Programs need to share certain resources, such as data, in a manner that allows only one program or task (i.e., a segment of a program) to access a given resource at a time. For example, a first task might produce a buffer of data, which may then be consumed by a second task that is waiting on that particular data as a prerequisite. In such an example, proper sharing requires that the first task obtain exclusive access to the buffer during the time in which the first task is filling or emptying the buffer. A task may use a memory location, sometimes referred to as a semaphore, to signal to other tasks that a corresponding shared resource is not available. In a conventional semaphore-based system, all tasks check a semaphore before accessing the shared resource that corresponds to that particular semaphore. Semaphores thereby ensure mutual exclusion by helping the system track which task is currently using a given resource.
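The buffer example above can be sketched in software as follows. This is only an illustrative sketch, not a description of any particular system: it assumes Python threads stand in for tasks, and the names `run_demo`, `buffer_guard`, `data_ready`, `producer`, and `consumer` are hypothetical. The `data_ready` event is an added assumption used to model the second task waiting on the data as a prerequisite.

```python
import threading


def run_demo(n=5):
    """Run one producer and one consumer that share a guarded buffer."""
    # Hypothetical names throughout; threads stand in for tasks.
    buffer_guard = threading.Semaphore(1)  # binary semaphore for the buffer
    data_ready = threading.Event()         # assumed signal: data was produced
    shared_buffer = []
    consumed = []

    def producer():
        # First task: hold the semaphore for the whole time it fills the buffer.
        with buffer_guard:
            for i in range(n):
                shared_buffer.append(i)
        data_ready.set()

    def consumer():
        # Second task: wait for its prerequisite, then check the semaphore
        # before touching the shared buffer.
        data_ready.wait()
        with buffer_guard:
            consumed.extend(shared_buffer)
            shared_buffer.clear()

    # Start the consumer first to show that it waits for the producer.
    threads = [threading.Thread(target=consumer),
               threading.Thread(target=producer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed
```

Calling `run_demo(5)` returns the items the consumer drained from the buffer. Every access to the buffer happens while the accessing task holds `buffer_guard`, which is the check-before-access discipline described above.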
General-purpose software (e.g., operating systems and applications) typically relies on semaphores to guard memory resources. Conventional processors may support semaphores by providing atomic instructions and/or a coherent memory hierarchy. Atomic instructions allow multiple tasks, which may be executing in parallel on different processors, to attempt to set a given semaphore simultaneously but only allow one of the tasks to actually succeed in setting the semaphore. The task that succeeds may then use the corresponding resource guarded by the semaphore while the other tasks wait. A coherent memory hierarchy ensures that only a single copy of the semaphore exists when it is written.
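The behavior of an atomic set operation can be illustrated with a small simulation. The sketch below is an assumption-laden model rather than real hardware: the hypothetical class `AtomicFlag` uses an internal `threading.Lock` to stand in for the indivisibility that a processor's atomic instruction would provide, and the hypothetical function `race` lets several threads attempt to set the same semaphore at roughly the same moment.

```python
import threading


class AtomicFlag:
    """Software stand-in (hypothetical) for a hardware test-and-set flag."""

    def __init__(self):
        # The internal lock only models the atomicity of the instruction.
        self._atomic = threading.Lock()
        self._value = 0

    def test_and_set(self):
        # Read the old value and write 1 as one indivisible step.
        with self._atomic:
            old = self._value
            self._value = 1
            return old

    def clear(self):
        # Release the semaphore so another task may set it.
        with self._atomic:
            self._value = 0


def race(num_tasks=8):
    """Let num_tasks threads try to set one flag; return the winners' ids."""
    flag = AtomicFlag()
    start = threading.Barrier(num_tasks)  # release all tasks together
    winners = []

    def task(task_id):
        start.wait()
        # Only the task that observes the old value 0 has set the semaphore.
        if flag.test_and_set() == 0:
            winners.append(task_id)

    threads = [threading.Thread(target=task, args=(i,))
               for i in range(num_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners
```

However many tasks take part, `race` returns a list with exactly one task id, since only one caller can observe the flag as clear. Which task wins depends on scheduling and is not deterministic.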
In many embedded systems, the memory hierarchy may not be coherent, and software may handle coherency. Mutual exclusion with conventional semaphore constructs is difficult when the processors are different and the memory is non-coherent. Embedded systems may use mailboxes, message passing, and ad-hoc methods to achieve mutual exclusion; however, these methods do not scale well. Conventional computer systems may include a mixture of subsystems that are coherent (e.g., intellectual property (IP) cores supplied by external vendors and equipped with coherent caches) and subsystems that are not coherent. In such systems with mixed coherency schemes, making the entire system memory hierarchy coherent would typically require complex and/or expensive hardware. As the number of processors in conventional multi-processor computer systems continues to grow, adding complicated and/or expensive hardware to create a coherent memory hierarchy is typically not a scalable solution. Similarly, as the trend to combine both externally-supplied IP cores and other processors continues to increase, the costs of integrating the various subsystems in a coherent manner will also rise.