Self-modifying code presents a special problem to modern microprocessors that utilize separate instruction and data caches. In this design, the data cache lacks the information needed to determine whether modified data affects instruction storage, and thus whether prefetched instructions need to be discarded. Special interlocks between the data cache and the instruction cache can be designed to detect whether prefetched instructions should be discarded because a store has modified them.
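As a rough illustration of such an interlock, the following sketch compares the cache line of each store against the set of lines currently held as prefetched instructions. The class name, method names, and the 256-byte line size are assumptions made only for this example and do not describe any particular processor.

```python
LINE_SIZE = 256  # assumed cache line size in bytes, for illustration only


def line_of(addr: int) -> int:
    """Map a byte address to its cache line number."""
    return addr // LINE_SIZE


class PrefetchInterlock:
    """Toy model of an interlock between the data side and the instruction side."""

    def __init__(self) -> None:
        # Lines that currently hold prefetched instructions.
        self.prefetched_lines: set[int] = set()

    def prefetch(self, addr: int) -> None:
        """Record that the instruction side has prefetched the line containing addr."""
        self.prefetched_lines.add(line_of(addr))

    def store_hits_instruction_stream(self, store_addr: int) -> bool:
        """Return True if a store to store_addr touches a prefetched instruction line."""
        return line_of(store_addr) in self.prefetched_lines

    def flush(self) -> None:
        """Discard all prefetched instructions."""
        self.prefetched_lines.clear()
```

In this model the comparison is made at line granularity, so a store anywhere in a prefetched line is treated as a possible modification of the instruction stream.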
The interlocks utilized by processors to detect when a program is storing into its instruction stream are known as “program-store-compare” (PSC). In an architecture that allows self-modifying code, such as the IBM System/Z processors, this can be a very important logic path, as any store could cause the processor to discard prefetched instructions.
Conventional designs have implemented PSC by assuming that most speculative store executions to a line are resolved in a timely manner by a flush that follows a store completion to that line. As a result, if no store completion to that line occurs within some relatively short, arbitrary time period, the processor assumes that a store completion to that line could still occur in the future and flushes the pipe at that time to free PSC tracking resources.
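The conventional timeout behavior described above can be sketched as a single tracking entry that is freed either by a store completion to the tracked line or by a timeout. This is a simplified model under stated assumptions: the class `TimeoutPscTracker`, its single-entry capacity, and the cycle-based `tick` interface are hypothetical and are not taken from any actual design.

```python
from typing import Optional


class TimeoutPscTracker:
    """Toy single-entry PSC tracker freed by store completion or by timeout."""

    def __init__(self, timeout_cycles: int) -> None:
        self.timeout_cycles = timeout_cycles
        self.line: Optional[int] = None  # line being tracked, or None if free
        self.age = 0                     # cycles since tracking began

    def on_speculative_store(self, line: int) -> bool:
        """Begin tracking a line. Returns False if the entry is busy with
        another line, which in this model stands for a pipe stall."""
        if self.line is not None and self.line != line:
            return False
        if self.line is None:
            self.line = line
            self.age = 0
        return True

    def on_store_completion(self, line: int) -> bool:
        """Returns True if this completion flushes the pipe and frees the entry."""
        if self.line == line:
            self.line = None
            return True
        return False

    def tick(self) -> bool:
        """Advance one cycle. Returns True if the timeout forces a flush."""
        if self.line is None:
            return False
        self.age += 1
        if self.age >= self.timeout_cycles:
            self.line = None  # entry is freed by the forced flush
            return True
        return False
```

Note that the tracker has no way to learn that the store it is tracking was discarded, so in that case only the timeout ever frees the entry.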
In an out-of-order execution design, there may in many cases be a long delay between a speculative store execution to a line and the first store completion to that line. With out-of-order branch resolution, it is also possible for a storing instruction that initiates PSC tracking to be down an incorrect branch path and to be flushed without the knowledge of the tracking logic, in which case the tracking logic may continue to track without ever seeing a store completion to that line. Flushing the pipe after a relatively short, arbitrary amount of time in order to free PSC tracking resources can cause multiple unnecessary, premature pipe flushes for a single PSC event and result in poor performance. Flushing the pipe after a relatively long amount of time (or not forcing a flush at all) can cause PSC tracking resources to be reserved longer than necessary. This results in poor performance due to pipe stalls while waiting for PSC tracking resources, and can even lead to processor hang scenarios in which PSC tracking resources are never freed.
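The trade-off can be made concrete with a small cycle-count model. The function `simulate` below is a hypothetical sketch: it assumes that after each timeout-forced flush a correct-path store re-executes immediately and restarts tracking, and that a wrong-path store disappears after the first forced flush. Neither assumption comes from a specific design.

```python
from typing import Optional, Tuple


def simulate(timeout: Optional[int],
             completion_delay: Optional[int],
             horizon: int = 1000) -> Tuple[int, int]:
    """Return (pipe_flushes, cycles_tracker_busy) for one speculative store.

    timeout:          cycles before a forced flush, or None to never force one.
    completion_delay: cycle at which the store completes, or None for a
                      wrong-path store that never completes.
    horizon:          cycles to simulate before giving up.
    """
    flushes = 0
    busy = 0
    age = 0
    for cycle in range(1, horizon + 1):
        busy += 1
        age += 1
        if completion_delay is not None and cycle == completion_delay:
            flushes += 1          # the one flush that is actually required
            return flushes, busy
        if timeout is not None and age >= timeout:
            flushes += 1          # forced flush to free the tracker
            if completion_delay is None:
                return flushes, busy  # wrong-path store is gone; tracker freed
            age = 0               # assumed: store re-executes, tracking restarts
    return flushes, busy          # tracker still busy at the end of the horizon


print(simulate(timeout=10, completion_delay=35))      # (4, 35)
print(simulate(timeout=100, completion_delay=35))     # (1, 35)
print(simulate(timeout=100, completion_delay=None))   # (1, 100)
print(simulate(timeout=None, completion_delay=None))  # (0, 1000)
```

Under these assumptions, a short timeout turns one PSC event into four flushes, a long timeout holds the tracker for 100 cycles on a wrong-path store, and no timeout leaves the tracker reserved for the entire simulated horizon.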