In modern processors, execution pipelines are often used. Instructions are provided to the front end of the pipeline by various arrays, buffers, and caches. Such front-end arrays that contain instruction lines may also include self-modifying code (SMC) bits to detect which instruction lines may have been overwritten by self-modifying or cross-modifying code.
It will be appreciated that, for correct-functionality considerations such as processor inclusion, any instruction line that has been delivered into the execution pipeline may later need to be re-delivered in an unmodified state. Therefore, deallocation or eviction of the line, in particular from an instruction cache, cannot take place until no instructions from that line are still being processed in the execution pipeline.
One technique to protect such instruction lines from being evicted is to employ a victim cache to hold evicted lines until it can be determined that no instructions from that line are being processed in the execution pipeline. One way to make such a determination is to insert a special micro-operation into the pipeline when an entry is allocated into the victim cache. When that micro-operation retires in sequential order, any instructions from that line that were ahead of the micro-operation will have been retired as well, and the corresponding entry can be deallocated from the victim cache.
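The marker-based deallocation described above can be illustrated with a minimal sketch. It assumes a toy pipeline that retires micro-operations strictly in program order; the names `Pipeline`, `VictimCache`, `allocate`, and `on_retire`, as well as the tuple encoding of micro-operations, are hypothetical and chosen only for illustration, not taken from any actual design.

```python
from collections import deque


class Pipeline:
    """Toy pipeline that retires micro-ops strictly in program order (assumption)."""

    def __init__(self):
        self.in_flight = deque()  # oldest micro-op at the left

    def issue(self, uop):
        self.in_flight.append(uop)

    def retire_next(self):
        """Retire and return the oldest in-flight micro-op (None if empty)."""
        return self.in_flight.popleft() if self.in_flight else None


class VictimCache:
    """Holds evicted instruction lines until their marker micro-op retires."""

    def __init__(self, pipeline):
        self.pipeline = pipeline
        self.entries = {}  # line address -> line bytes

    def allocate(self, line_addr, line_bytes):
        # Keep the evicted line and insert a special marker micro-op behind
        # every micro-op already in flight.
        self.entries[line_addr] = line_bytes
        self.pipeline.issue(("marker", line_addr))

    def on_retire(self, uop):
        # In-order retirement means that once the marker retires, every older
        # micro-op from the same line has retired too, so the entry is free.
        kind, line_addr = uop
        if kind == "marker":
            self.entries.pop(line_addr, None)


pipe = Pipeline()
victims = VictimCache(pipe)

# Two instructions from line 0x40 are in flight when the line is evicted.
pipe.issue(("inst", 0x40))
pipe.issue(("inst", 0x40))
victims.allocate(0x40, b"\x90\x90")

# Retiring the two older instructions does not release the entry.
victims.on_retire(pipe.retire_next())
victims.on_retire(pipe.retire_next())
held_before_marker = 0x40 in victims.entries

# Retiring the marker releases it.
victims.on_retire(pipe.retire_next())
held_after_marker = 0x40 in victims.entries
```

In this sketch the entry for line 0x40 is still held after both older instructions retire and is released only when the marker itself retires, which is the ordering guarantee the technique relies on.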
Design constraints may limit a victim cache to only a few entries (e.g., four or eight). If too many instruction lines are evicted from the instruction cache before a victim cache entry is deallocated, the victim cache can fill up, resulting in unwanted stalls for the execution pipeline. Furthermore, insertion of numerous special micro-operations into the execution pipeline may further degrade overall performance. The degradation may be especially significant for heavy workloads of new instructions in which poorly predicted branching may occur.
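The interaction between victim cache capacity and stalls can be illustrated with a small cycle-level sketch. It rests on simplifying assumptions that are not part of the description above: each marker retires a fixed number of cycles after insertion (`marker_latency`), at most one eviction is accepted per cycle, and an eviction that finds the victim cache full simply waits. The function name `simulate_evictions` and its parameters are hypothetical.

```python
from collections import deque


def simulate_evictions(eviction_cycles, capacity, marker_latency):
    """Return (stall_cycles, markers_inserted) for a toy victim cache.

    eviction_cycles: cycles at which the instruction cache wants to evict a line.
    capacity:        number of victim cache entries.
    marker_latency:  assumed fixed delay from marker insertion to retirement.
    """
    pending = deque(sorted(eviction_cycles))
    occupied = deque()  # retirement cycle of each entry's marker, oldest first
    stall_cycles = 0
    markers_inserted = 0
    cycle = 0
    while pending:
        # Deallocate entries whose marker has retired by this cycle.
        while occupied and occupied[0] <= cycle:
            occupied.popleft()
        if pending[0] <= cycle:
            if len(occupied) < capacity:
                # Room available: allocate an entry and insert its marker.
                pending.popleft()
                occupied.append(cycle + marker_latency)
                markers_inserted += 1
            else:
                # Victim cache full: the eviction waits and the pipeline stalls.
                stall_cycles += 1
        cycle += 1
    return stall_cycles, markers_inserted


# Six back-to-back evictions with a ten-cycle marker latency.
small = simulate_evictions(range(6), capacity=4, marker_latency=10)
large = simulate_evictions(range(6), capacity=8, marker_latency=10)
print(small)  # (6, 6)
print(large)  # (0, 6)
```

Under these assumptions the four-entry configuration stalls for six cycles while the eight-entry configuration does not stall, yet both insert the same six markers into the pipeline.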
Increasing the victim cache size may reduce the number of stalls, but it does not reduce the number of special micro-operations inserted into the execution pipeline. Moreover, an increased victim cache size comes at the cost of reducing the area available for other circuitry and potentially lengthening critical timing paths. To date, alternative solutions have not been adequately explored.