The present invention relates in general to data processing systems, and in particular, to the execution of machine context synchronization operations.
Within certain processor architectures, memory synchronization instructions control the order in which memory operations are completed with respect to asynchronous events, and the order in which memory operations are seen by other processors or memory access mechanisms. An instruction synchronize (ISYNC) instruction causes the processor to discard all prefetched instructions, wait for any preceding instructions to complete, and then branch to the next sequential instruction (which has the effect of clearing the pipeline behind the ISYNC instruction). For further discussion of the ISYNC instruction, refer to "PowerPC 603 RISC Microprocessor User's Manual, Copyright 1994, Motorola, Inc. and International Business Machines Corp.", pp. 4-71, 4-72, 5-24 and 11-83, and "PowerPC System Architecture, Tom Shanley, Copyright 1995 by MindShare, Inc.", p. 113, which are hereby incorporated by reference herein.
Prior art processors unconditionally flushed all speculative instructions that were dispatched or fetched after an ISYNC instruction. "Flush" refers to clearing instructions from the processor. This unconditional flush action resulted in flushing and restarting the instruction stream several times for every one thousand instructions. The more speculation a design allows, the greater the performance loss caused by this flush action. Furthermore, such flush actions can incur an average penalty of over a dozen cycles for subsequently refetching the instructions.
The ISYNC instruction is often used as a barrier in lock sequences. One example is illustrated below using PowerPC instructions.
In this example, a critical section is protected and a process can enter only if it obtains a lock. The larwx instruction establishes a reservation (the lock to the critical section), the stcwx instruction checks whether the reservation was obtained, and the bne instruction branches back to the larwx instruction if it was not. If the reservation is obtained, the branch is not taken and the process proceeds by executing the isync and subsequent instructions, which include a ld instruction. The isync is needed to ensure that the ld instruction executes only after the larwx instruction has executed completely (its data has been fetched and returned). Prior implementations of the isync instruction cause a flush and refetch of the instructions after the isync, thus ensuring that the ld instruction executes after the larwx instruction.
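The retry loop described above can be modeled in software. The following Python sketch is an illustration only and is not the PowerPC listing the text refers to: the `ReservationMemory` class, its method names, the `interfere` hook, and the check that the lock word is free are all assumptions introduced here, and the single-threaded model ignores real pipeline timing. It imitates a load-and-reserve followed by a store-conditional, with a retry when the reservation is lost.

```python
class ReservationMemory:
    """Toy single-reservation memory; a hypothetical model, not real hardware."""

    def __init__(self):
        self.words = {}          # address -> value
        self.reservation = None  # address currently reserved, if any

    def load_reserve(self, addr):
        # Models the load-and-reserve step: read the word and record a reservation.
        self.reservation = addr
        return self.words.get(addr, 0)

    def store_conditional(self, addr, value):
        # Models the store-conditional step: succeeds only if the reservation holds.
        if self.reservation != addr:
            return False
        self.words[addr] = value
        self.reservation = None
        return True

    def store(self, addr, value):
        # An ordinary store (e.g. from another processor) clears a matching reservation.
        self.words[addr] = value
        if self.reservation == addr:
            self.reservation = None


def acquire_lock(mem, lock_addr, interfere=None, max_tries=10):
    """Retry until the store-conditional succeeds; returns the number of attempts."""
    for attempt in range(1, max_tries + 1):
        current = mem.load_reserve(lock_addr)
        if interfere is not None:
            interfere(attempt)   # assumed hook that simulates another processor
        if current != 0:
            continue             # lock word is already held; branch back and retry
        if mem.store_conditional(lock_addr, 1):
            # This is where the isync barrier would sit, before any load
            # from the critical section is allowed to execute.
            return attempt
    raise RuntimeError("lock not acquired")
```

In this model an uncontended call such as `acquire_lock(ReservationMemory(), 0x100)` succeeds on the first attempt, while a competing store between the two steps forces one more trip around the loop.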
This type of usage is common, and removing the flush penalty can improve the performance of the processor. Therefore, there is a need in the art for reducing the flush penalty associated with the execution of ISYNC instructions.
In the present invention, an ISYNC instruction does not unconditionally cause a flush of speculatively dispatched or fetched instructions (instructions that are dispatched or fetched after the ISYNC instruction). The present invention detects the occurrence of any instruction that changes the state of the machine and requires context synchronization; these instructions are called context-synchronizing-required instructions. When a context-synchronizing-required instruction completes, the present invention sets a flag to note the occurrence of that condition; this flag is referred to as a context-synchronizing-required flag. When an ISYNC instruction subsequently completes, the present invention causes a flush and refetches the instructions after the ISYNC if the context-synchronizing-required flag is active, and then resets the flag. If the context-synchronizing-required flag is not active, the present invention does not generate a flush operation. The present invention thus significantly reduces the frequency of the flush action, improving the performance of the processor.
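The flag-based policy described above can be contrasted with the unconditional flush in a small simulation. This Python sketch is a behavioral illustration under stated assumptions, not a description of the actual hardware: the `Instr` and `CompletionUnit` names, the `needs_context_sync` marker, and the `state_change` pseudo-instruction are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    # Hypothetical marker for a context-synchronizing-required instruction.
    needs_context_sync: bool = False


class CompletionUnit:
    """Toy model of the completion-time policy; all names are illustrative."""

    def __init__(self, conditional=True):
        self.conditional = conditional  # False models the unconditional flush
        self.csr_flag = False           # context-synchronizing-required flag
        self.flushes = 0

    def complete(self, instr):
        if instr.name == "isync":
            if self.csr_flag or not self.conditional:
                self.flushes += 1       # flush and refetch everything after the isync
            self.csr_flag = False       # flag is reset once the isync has completed
        elif instr.needs_context_sync:
            self.csr_flag = True        # remember that machine state has changed


def count_flushes(stream, conditional):
    """Complete every instruction in order and report how many flushes occurred."""
    unit = CompletionUnit(conditional)
    for instr in stream:
        unit.complete(instr)
    return unit.flushes


stream = [
    Instr("add"), Instr("isync"),
    Instr("state_change", needs_context_sync=True), Instr("isync"),
    Instr("ld"), Instr("isync"),
]
```

For this stream, `count_flushes(stream, conditional=False)` flushes at every isync, whereas `count_flushes(stream, conditional=True)` flushes only at the isync that follows the state-changing instruction.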
Furthermore, the context-synchronizing-required flag is not set when the context-synchronizing-required instruction is dispatched. Instead, the flag is set when the corresponding instruction completes. This avoids the problem of a speculative context-synchronizing-required instruction setting the context-synchronizing-required flag and then being aborted, which would leave the flag set for an instruction that never took effect.
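The difference between setting the flag at dispatch and at completion can be sketched as follows. This is a minimal Python model under assumptions: the `FlagTracker` class and its `dispatch`, `complete`, and `abort` methods are hypothetical names, and the model tracks only the flag, not a real pipeline.

```python
class FlagTracker:
    """Toy model comparing when the flag is set; names are hypothetical."""

    def __init__(self, set_at_dispatch):
        self.set_at_dispatch = set_at_dispatch
        self.flag = False

    def dispatch(self, needs_context_sync):
        if self.set_at_dispatch and needs_context_sync:
            self.flag = True  # set speculatively, before the outcome is known

    def complete(self, needs_context_sync):
        if needs_context_sync:
            self.flag = True  # set only once the instruction has really executed

    def abort(self):
        pass                  # a cancelled speculative instruction never completes


def flag_after_aborted_instruction(set_at_dispatch):
    """Dispatch a context-synchronizing-required instruction, then abort it."""
    tracker = FlagTracker(set_at_dispatch)
    tracker.dispatch(needs_context_sync=True)  # dispatched down a speculative path
    tracker.abort()                            # the path turns out to be wrong
    return tracker.flag
```

With the flag set at dispatch, the aborted instruction leaves the flag active, so the next ISYNC would flush needlessly; with the flag set at completion, it stays clear.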
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention.