This application is related to our copending patent applications assigned to the assignee hereof:
"GATE CLOSE FAILURE NOTIFICATION FOR FAIR GATING IN A NONUNIFORM MEMORY ARCHITECTURE DATA PROCESSING SYSTEM" by William A. Shelly et al., filed Sep. 30, 1999, with Ser. No. 09/409,456; and
"GATE CLOSE BALKING FOR FAIR GATING IN A NONUNIFORM MEMORY ARCHITECTURE DATA PROCESSING SYSTEM" by David A. Egolf et al., filed Sep. 30, 1999, with Ser. No. 09/409,811.
1. Field of the Invention
The present invention generally relates to data processing systems, and more specifically to techniques for one processor to detect changes made to memory by another processor.
2. Background of the Invention
Data processing systems invariably require that resources be shared among different processes, activities, or tasks in the case of multiprogrammed systems and among different processors in the case of multiprocessor systems. Such sharing is often not obvious within user programs. However, it is a necessity in operating systems, and is quite common in utility programs such as database and communications managers. For example, a dispatch queue is typically shared among multiple processors in a multiprocessor system. This provides a mechanism that allows each processor to select the highest priority task in the dispatch queue to execute. Numerous other operating system tables are typically shared among different processes, activities, tasks, and processors.
Serialization of access to shared resources in a multiprocessor system is controlled through mutual exclusion. This is typically implemented utilizing some sort of hardware gating or semaphores. Gating works by having a process, activity, or task "close" or "lock" a "gate" or "lock" before accessing the shared resource. Then, the "gate" or "lock" is "opened" or "unlocked" after the process, activity, or task is done accessing the shared resource. Both the gate closing and opening are typically atomic memory operations on multiprocessor systems.
There are typically two different types of gates: queued gates and spin gates. Semaphores are examples of queued gates. When a process, activity, or task attempts to "close" a queued gate that is already closed, that process, activity, or task is placed on a queue for that gate, and is dequeued and activated when the gate is subsequently opened by some other process, activity, or task. Queued gates are typically found in situations where the exclusive resource time is quite lengthy, especially in comparison with the time required to dispatch another process, activity, or task.
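The queued-gate behavior described above can be illustrated with a minimal, single-threaded model in C. This is only a sketch under stated assumptions, not any particular operating system's semaphore: the type `queued_gate`, the functions `qgate_init`, `qgate_close`, and `qgate_open`, the capacity `MAX_WAITERS`, and the choice to hand the gate directly to the dequeued task are all hypothetical. A real implementation would also need to make each operation atomic and would dispatch another task while the caller waits.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 8      /* hypothetical queue capacity */
#define NO_TASK     (-1)   /* hypothetical "no owner" marker */

/* Hypothetical queued gate: the current owner plus a FIFO of waiting tasks. */
typedef struct {
    int    owner;                 /* task holding the gate, or NO_TASK */
    int    waiters[MAX_WAITERS];  /* circular FIFO of queued task ids */
    size_t head;                  /* index of the oldest waiter */
    size_t count;                 /* number of queued waiters */
} queued_gate;

typedef enum { QG_ACQUIRED, QG_QUEUED, QG_QUEUE_FULL } qgate_result;

static void qgate_init(queued_gate *g) {
    g->owner = NO_TASK;
    g->head  = 0;
    g->count = 0;
}

/* Attempt to close the gate on behalf of `task`.
 * If the gate is already closed, the task is placed on the gate's queue. */
static qgate_result qgate_close(queued_gate *g, int task) {
    if (g->owner == NO_TASK) {
        g->owner = task;
        return QG_ACQUIRED;
    }
    if (g->count == MAX_WAITERS)
        return QG_QUEUE_FULL;
    g->waiters[(g->head + g->count) % MAX_WAITERS] = task;
    g->count++;
    return QG_QUEUED;
}

/* Open the gate. If a task is queued, it is dequeued and becomes the new
 * owner (an assumed hand-off policy). Returns the activated task id, or
 * NO_TASK if nobody was waiting and the gate is now open. */
static int qgate_open(queued_gate *g) {
    assert(g->owner != NO_TASK);  /* opening a gate nobody closed is a bug */
    if (g->count == 0) {
        g->owner = NO_TASK;
        return NO_TASK;
    }
    int next = g->waiters[g->head];
    g->head = (g->head + 1) % MAX_WAITERS;
    g->count--;
    g->owner = next;
    return next;
}
```

In this model, a second task that calls `qgate_close` while the gate is held receives `QG_QUEUED`, and is returned by the owner's later call to `qgate_open`.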
The second type of gate is a "spin" gate. When a process, activity, or task attempts to "close" a spin gate that is already closed, a tight loop is entered where the processor attempting to close the spin gate keeps executing the "close" instruction until the gate is ultimately opened by another processor or the processor decides to quit trying. Note that "spin" gates assume a multiprocessor system since the processor "spinning" trying to "close" the spin gate is depending on another processor to "open" the gate. Spin gates are typically found in situations where the exclusive resource time is fairly short, especially in comparison with the time required to dispatch another process, activity, or task. They are especially prevalent in time critical situations.
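The spin loop described above, including the option to quit trying, might be sketched in C11 as follows. This is an illustrative sketch only: the names `spin_gate`, `SPIN_GATE_INIT`, `spin_gate_close`, and `spin_gate_open`, and the `max_attempts` bound, are hypothetical, and the standard `atomic_flag` test-and-set stands in for whatever atomic "close" instruction a given architecture provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin gate built on the C11 atomic_flag type. */
typedef struct { atomic_flag closed; } spin_gate;

#define SPIN_GATE_INIT { ATOMIC_FLAG_INIT }   /* gate starts open */

/* Keep executing the atomic "close" until it succeeds or until
 * max_attempts tries have failed. Returns true if this caller closed
 * the gate, false if it gave up. */
static bool spin_gate_close(spin_gate *g, unsigned long max_attempts) {
    assert(max_attempts > 0);
    for (unsigned long i = 0; i < max_attempts; i++) {
        /* test_and_set returns the previous state: false means the gate
         * was open and this call closed it. */
        if (!atomic_flag_test_and_set_explicit(&g->closed,
                                               memory_order_acquire))
            return true;
    }
    return false;   /* gate stayed closed; the caller quits trying */
}

/* Open the gate so that a spinning processor can close it. */
static void spin_gate_open(spin_gate *g) {
    atomic_flag_clear_explicit(&g->closed, memory_order_release);
}
```

Note that every failed iteration is itself an atomic write attempt on the gate, which is the source of the bus and cache traffic discussed later in this background.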
As noted above, the instructions utilized to open and close gates, in particular spin gates, typically execute utilizing atomic memory operations. Such atomic memory modification instructions are found in almost every architecture supporting multiple processors, especially when the processors share memory. Some architectures utilize compare-and-swap or compare-and-exchange instructions (see FIGS. 10 and 11) to "close" gates. The Unisys 1100/2200 series of computers utilizes Test Set and Skip (TSS) and Test Clear and Skip (TCS) to close and open spin gates.
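As a rough illustration of closing a gate with a compare-and-exchange operation, the following C11 sketch uses `atomic_compare_exchange_strong`. It is not a rendering of the figures referenced above; the type `cas_gate`, the function names, and the convention of storing the closing processor's identifier in the gate word are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { GATE_OPEN = 0 };   /* hypothetical encoding: 0 means open */

/* Hypothetical gate word: GATE_OPEN, or the id of the closing processor. */
typedef struct { atomic_int owner; } cas_gate;

static void cas_gate_init(cas_gate *g) {
    atomic_init(&g->owner, GATE_OPEN);
}

/* One close attempt: atomically replace GATE_OPEN with cpu_id.
 * Succeeds only if the gate was open at the instant of the exchange. */
static bool cas_gate_try_close(cas_gate *g, int cpu_id) {
    assert(cpu_id != GATE_OPEN);
    int expected = GATE_OPEN;
    return atomic_compare_exchange_strong(&g->owner, &expected, cpu_id);
}

/* Open the gate, but only if cpu_id is the processor that closed it. */
static bool cas_gate_open(cas_gate *g, int cpu_id) {
    int expected = cpu_id;
    return atomic_compare_exchange_strong(&g->owner, &expected, GATE_OPEN);
}
```

Recording the owner in the gate word is a design choice of this sketch; it lets `cas_gate_open` reject an open request from a processor that does not hold the gate.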
The GCOS 8 architecture produced by the assignee herein utilizes a Set Zero and Negative Indicators and Clear (SZNC) instruction to "close" a spin gate and a Store Instruction Counter plus 2 (STC2) instruction to subsequently "open" the spin gate. The SZNC sets the Zero and Negative indicators based on the current value of the gate being "closed". It then clears (or zeros) the gate. The next instruction executed is typically a branch instruction that repeats executing the SZNC instruction if the gate being closed was already clear (or contained zero). Thus, the SZNC instruction will be executed repeatedly as long as the spin gate is closed, as indicated by having a zero value. The gate is opened by another processor by storing some non-zero value in the gate cell. In the GCOS 8 architecture, execution of the STC2 instruction to "open" a gate guarantees that the "opened" gate will contain a non-zero value.
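The SZNC/STC2 convention described above (zero means closed, non-zero means open) can be approximated in portable C11 with an atomic exchange. The sketch below is an emulation under stated assumptions rather than GCOS 8 code: `gate_word`, `sznc_like`, `close_gate`, `open_gate`, and `GATE_OPEN_VALUE` are hypothetical names, the constant 1 stands in for the non-zero value that STC2 would store, and the `max_spins` bound is added only so the loop can terminate when no other processor opens the gate.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef atomic_uint gate_word;   /* hypothetical gate cell */

#define GATE_CLOSED     0u       /* zero: gate is closed */
#define GATE_OPEN_VALUE 1u       /* any non-zero value: gate is open */

/* Emulates one SZNC: fetch the old gate value and clear the gate in a
 * single atomic step. Returns true if the old value was non-zero, i.e.
 * the gate was open and this call closed it. */
static bool sznc_like(gate_word *g) {
    return atomic_exchange(g, GATE_CLOSED) != 0u;
}

/* Emulates the SZNC-plus-branch loop: repeat while the gate was already
 * clear. Gives up after max_spins attempts (an added assumption). */
static bool close_gate(gate_word *g, unsigned long max_spins) {
    assert(max_spins > 0);
    for (unsigned long i = 0; i < max_spins; i++) {
        if (sznc_like(g))
            return true;
    }
    return false;
}

/* Emulates STC2's effect on the gate: store a guaranteed non-zero value. */
static void open_gate(gate_word *g) {
    atomic_store(g, GATE_OPEN_VALUE);
}
```

Because clearing an already-clear gate leaves it unchanged, a failed attempt does not disturb the current holder, yet it is still a write to the gate cell.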
One problem that occurs whenever resources are shared between and among processors is that of cache ownership of directly shared data, including locks.
A cache siphon occurs when the cache copy of a block of memory is moved from one cache memory to another. When more than one processor is trying to get write access to the same word or block of memory containing a gate at the same time to close the gate, the block of memory can "ping pong" back and forth between the processors as each processor siphons the block of memory containing the gate into its own cache memory in order to try to close the gate.
Another problem that arises when directly sharing resources is that in the typical processor architecture, a processor repeatedly attempts to close a gate or otherwise modify directly shared data until that processor can change that shared data as required. For example, in the case of gates, one processor will bang on the gate until it is opened by another processor.
At first glance this may not seem like a problem since the processor "banging" at a lock cannot do anything else anyway until it succeeds in getting the gate locked. However, this constant "banging" on the gate does introduce significant overhead in bus and cache traffic. It would thus be advantageous to reduce this bus and cache traffic when one processor is waiting for another processor to modify a shared location in memory.