1. Field
The described embodiments relate to computing devices. More specifically, the described embodiments relate to a conditional notification mechanism in a computing device.
2. Related Art
Many modern computing devices include two or more entities such as central processing unit (CPU) or graphics processing unit (GPU) cores, hardware thread contexts, etc. In some cases, two or more entities in a computing device communicate with one another to determine if a given event has occurred. For example, a first CPU core may reach a synchronization point at which the first CPU core communicates with a second CPU core to determine if the second CPU core has reached a corresponding synchronization point. Several techniques have been proposed to enable entities in a computing device to communicate with one another to determine if a given event has occurred, as described below.
A first technique for communicating between entities is a “polling” technique, in which a first entity repeatedly reads a shared memory location and determines if the value in the shared memory location meets a condition, continuing until the condition is met. For this technique, a second (and perhaps third, fourth, etc.) entity updates the shared memory location when a designated event has occurred (e.g., when the second entity has reached a synchronization point). This technique is inefficient in terms of power consumption because the first entity is obligated to fetch and execute instructions for performing the reading and determining operations. Additionally, this technique is inefficient in terms of cache traffic because the reading of the shared memory location can require invalidation of a cached copy of the shared memory location. Moreover, this technique is inefficient because the polling entity uses computational resources that could otherwise be used for performing other computational operations.
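The polling technique described above can be illustrated with a minimal software sketch. The sketch uses Python threads as stand-ins for the first and second entities; the names `poll_until` and `run_polling_demo`, the threshold of 25, and the update delay are assumptions chosen for illustration and are not part of the described embodiments.

```python
import threading
import time


def poll_until(shared, condition, stats):
    """First entity: re-read the shared location until the condition holds."""
    while True:
        stats["reads"] += 1            # each pass costs an instruction fetch and a read
        value = shared["value"]        # read the shared memory location
        if condition(value):           # determine if the value meets the condition
            return value


def run_polling_demo(threshold=25, step=1, delay=0.0005):
    """Run one poller and one updater; return (final value, number of reads)."""
    shared = {"value": 0}              # models the shared memory location
    stats = {"reads": 0}

    def updater():
        # Second entity: update the location each time an "event" occurs.
        while shared["value"] <= threshold:
            time.sleep(delay)          # hypothetical time between events
            shared["value"] += step

    worker = threading.Thread(target=updater)
    worker.start()
    final = poll_until(shared, lambda v: v > threshold, stats)
    worker.join()
    return final, stats["reads"]


if __name__ == "__main__":
    final, reads = run_polling_demo()
    print("final value:", final, "reads performed:", reads)
```

In this sketch the read count is typically far larger than the number of updates, which is the wasted work the paragraph above attributes to polling.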
A second technique for communicating between entities is an interrupt scheme, in which an interrupt is triggered by a first entity in order to communicate with a second (and perhaps third, fourth, etc.) entity. This technique is inefficient because processing interrupts in the computing device requires that numerous operations be performed. For example, in some computing devices, it is necessary to flush instructions from one or more pipelines and save state before an interrupt handler can process the interrupt. In addition, in some computing devices, processing an interrupt requires communicating the interrupt to an operating system on the computing device for prioritization and may require invoking scheduling mechanisms (e.g., a thread scheduler, etc.).
A third technique for communicating between entities is the use of instructions such as the MONITOR and MWAIT instructions. For this technique, upon executing a MONITOR instruction, the first entity configures a cache coherency mechanism in the computing device to monitor for updates to a designated memory location. Upon subsequently executing the MWAIT instruction, the first entity signals the coherency mechanism (and the computing device generally) that it is transitioning to a wait (idle) state until an update (e.g., a write) is made to the memory location. When a second entity updates the memory location by writing to the memory location, the coherency mechanism recognizes that the update has occurred and forwards a wake-up signal to the first entity, causing the first entity to exit the idle state. This technique is useful for simple cases where a single update is made to the memory location. However, when a value in the memory location is to meet a condition, the technique is inefficient because the first entity is awakened by every update, whether or not the update causes the condition to be met. For example, assume that the condition is that the value in the memory location, which starts at 0, is to be greater than 25, and that the second entity increases the value in the memory location by at least one each time an event occurs. In this case, the first entity may be obligated to execute the MONITOR/MWAIT instructions and conditional checking instructions as many as 26 times before the value in the memory location meets the condition.
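The wake-up count in the example above can be checked with a small software model. The model does not execute the MONITOR or MWAIT instructions; it assumes only that every write to the monitored location wakes the first entity, which then checks the condition and, if the condition is not met, re-arms the monitor and waits again. The function name `count_wakeups` and its parameters are hypothetical.

```python
def count_wakeups(threshold, increments, start=0):
    """Model a waiter that is awakened by every write to a monitored location.

    Returns (wakeups, final_value), where wakeups is the number of times the
    waiter leaves the idle state and evaluates the condition value > threshold.
    """
    value = start
    wakeups = 0
    for increment in increments:
        value += increment      # second entity writes the monitored location
        wakeups += 1            # assumed: any write wakes the first entity
        if value > threshold:   # conditional check performed after each wake-up
            break
        # Condition not met: the waiter would re-arm the monitor and wait again.
    return wakeups, value


if __name__ == "__main__":
    # Worst case from the text: start at 0, increase by one per event.
    print(count_wakeups(25, [1] * 26))   # (26, 26)
    # Larger increments reach the condition after fewer wake-ups.
    print(count_wakeups(25, [5] * 10))   # (6, 30)
```

With increments of exactly one, the model reports 26 wake-ups, matching the "as many as 26 times" figure; only the last wake-up is useful, and the preceding 25 find the condition unmet.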
A fourth technique for communicating between entities employs a user-level interrupt mechanism where a first entity specifies the address of a memory location (“flag”). When a second entity subsequently updates/sets the flag, the first entity is signaled to execute an interrupt handler. For this technique, much of the control for handling the communication between the entities is passed to software and thus to the programmer. Because software is used for handling the communication between the entities, this technique is inefficient and error-prone.
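The flag-based scheme above can be pictured with the following sketch, which models only the control flow: a handler is registered for a flag address and is run whenever that address is written. The class `FlagMemory`, the function `run_flag_demo`, and the address `0x1000` are hypothetical names introduced for illustration; they do not correspond to any particular user-level interrupt interface.

```python
class FlagMemory:
    """Hypothetical model of memory in which writes to a flag signal a handler."""

    def __init__(self):
        self.cells = {}       # address -> value
        self.handlers = {}    # flag address -> handler registered by the first entity

    def register_handler(self, address, handler):
        """First entity specifies the address of the flag and its handler."""
        self.handlers[address] = handler

    def write(self, address, value):
        """Second entity updates memory; a write to a flag runs the handler."""
        self.cells[address] = value
        handler = self.handlers.get(address)
        if handler is not None:
            handler(address, value)   # models signaling the first entity


def run_flag_demo():
    memory = FlagMemory()
    observed = []
    flag_address = 0x1000             # hypothetical flag address

    def handler(address, value):
        # Any condition checking or follow-up work is left to this software.
        observed.append(value)

    memory.register_handler(flag_address, handler)
    memory.write(0x2000, 7)           # unrelated write: handler is not run
    memory.write(flag_address, 1)     # second entity sets the flag
    return observed


if __name__ == "__main__":
    print(run_flag_demo())            # [1]
```

In the sketch, everything that happens after the signal is determined by the handler body, which reflects the point above that control of the communication passes to software and thus to the programmer.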
As described above, the various techniques that have been proposed to enable entities to communicate with one another to determine if a given event has occurred are inefficient in one way or another.