A data processing apparatus will typically include processing logic for executing sequences of instructions in order to perform processing operations on data items. The instructions and data items required by the processing logic will generally be stored in memory. Due to the long latency typically incurred when accessing memory, it is known to provide one or more levels of cache within the data processing apparatus for storing some of the instructions and data items required by the processing logic, thereby allowing quicker access to those instructions and data items. For the purposes of the following description, the instructions and data items will collectively be referred to as data values. Accordingly, when referring to a cache storing data values, that cache may be storing instructions, data items to be processed by those instructions, or both. Further, the term data value is used herein to refer either to a single instruction or data item, or to a block of instructions or data items, as is the case, for example, when referring to a linefill process to the cache.
Significant latencies can also be incurred when cache misses occur within a cache. In the article entitled “Just Say No: Benefits of Early Cache Miss Determination” by G Memik et al, Proceedings of the Ninth International Symposium on High Performance Computer Architecture, 2003, a number of techniques are described for reducing data access times and power consumption in a data processing apparatus with multi-level caches. In particular, the article describes a piece of logic called a “Mostly No Machine” (MNM) which, using information about the blocks placed into and replaced from the caches, can quickly determine whether an access at any cache level will result in a cache miss. Accesses that are identified as misses are then aborted at that cache level. Since the MNM structures used to recognise misses are significantly smaller than the cache structures, both data access time and power consumption are reduced.
The article entitled “Bloom Filtering Cache Misses for Accurate Data Speculation and Prefetching” by J Peir et al, Proceedings of the Sixteenth International Conference on Supercomputing, Pages 189 to 198, 2002, describes a particular form of logic used to detect whether an access to a cache will cause a cache miss to occur, this particular logic being referred to as a Bloom filter. In particular, the paper uses a Bloom filter to identify cache misses early in the pipeline of the processor. This early identification of cache misses is then used to allow the processor to schedule more accurately those instructions that are dependent on load instructions identified as resulting in a cache miss, and to prefetch data into the cache more precisely. Dependent instructions are those which require as a source operand the data produced by the instruction on which they depend, in this example the data being loaded by a load instruction.
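By way of illustration only, the following Python sketch models one common form of such logic, a counting Bloom filter, in software. It is not taken from the cited article and may differ from the filter organisation described there; the class name, method names, table size and hash multipliers are all assumptions chosen for the example. The property it demonstrates is that the filter never reports a miss for a line that is actually resident, although it may fail to report some genuine misses.

```python
# Hypothetical odd multipliers used to derive several hash indexes
# from one line address; any reasonable hash functions would do.
_MULTIPLIERS = (0x9E3779B1, 0x85EBCA6B, 0xC2B2AE35)


class CountingBloomFilter:
    """Software model of a miss-indication filter (illustrative only)."""

    def __init__(self, num_counters=1024, num_hashes=2):
        if not 1 <= num_hashes <= len(_MULTIPLIERS):
            raise ValueError("unsupported number of hash functions")
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _indexes(self, line_addr):
        # Map the line address to one counter index per hash function.
        n = len(self.counters)
        return [(((line_addr * m) & 0xFFFFFFFF) >> 16) % n
                for m in _MULTIPLIERS[:self.num_hashes]]

    def on_linefill(self, line_addr):
        # A line placed into the cache increments its counters.
        for i in self._indexes(line_addr):
            self.counters[i] += 1

    def on_eviction(self, line_addr):
        # A line replaced from the cache decrements the same counters.
        for i in self._indexes(line_addr):
            self.counters[i] -= 1

    def definitely_misses(self, line_addr):
        # A zero counter proves the line is absent. All counters being
        # non-zero proves nothing: the line may or may not be present.
        return any(self.counters[i] == 0 for i in self._indexes(line_addr))
```

In such a model, `on_linefill` and `on_eviction` would be called as lines enter and leave the cache, and `definitely_misses` would be consulted before the cache lookup itself. Counters rather than single bits are used so that evictions can be reflected in the filter.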
The article entitled “Fetch Halting on Critical Load Misses” by N Mehta et al, Proceedings of the 22nd International Conference on Computer Design, 2004, also makes use of the Bloom filtering technique described in the above article by Peir et al. This article describes an approach where software profiling is used to identify long latency instructions having many output dependencies, such instructions being referred to as “critical” instructions. More specifically, the software profiling techniques are used to identify load instructions that will be critical instructions. When any such load instruction is encountered in the processor pipeline, the Bloom filter technique is used to detect whether the cache lookup based on that load instruction will cause a cache miss to occur, and if so a fetch halting technique is invoked. The fetch halting technique suspends instruction fetching during the period when the processor is stalled by the critical load instruction, which allows a power saving to be achieved in the issue logic of the processor.
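The decision described above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the mechanism of the cited article: `FetchUnit`, `on_load_issue` and `on_load_complete` are hypothetical names, the set of critical load addresses is assumed to have been produced by profiling beforehand, and `predicts_miss` is a caller-supplied function standing in for the Bloom filter lookup.

```python
from dataclasses import dataclass


@dataclass
class FetchUnit:
    """Hypothetical fetch stage that counts active and halted cycles."""
    halted: bool = False
    halted_cycles: int = 0
    fetched: int = 0

    def tick(self):
        # One cycle: fetch an instruction unless fetching is suspended.
        if self.halted:
            self.halted_cycles += 1
        else:
            self.fetched += 1


def on_load_issue(fetch, load_pc, line_addr, critical_pcs, predicts_miss):
    """Halt fetch only for a profiled-critical load predicted to miss.

    Returns True if fetching was suspended for this load.
    """
    if load_pc in critical_pcs and predicts_miss(line_addr):
        fetch.halted = True
        return True
    return False


def on_load_complete(fetch):
    # The missing data has returned, so fetching resumes.
    fetch.halted = False
```

For example, with `critical_pcs = {0x400}` and a `predicts_miss` function that returns true for a given line, a call to `on_load_issue` for the load at `0x400` suspends fetching until `on_load_complete` is called, whereas a load at any other address leaves fetching unaffected regardless of the prediction.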
From the above discussion, it will be appreciated that indication logic has been developed which can be used to provide an early indication of a cache miss when accessing data, with that indication then being used to abort a cache access, and optionally also to perform particular scheduling or power saving activities in situations where there are dependent instructions, i.e. instructions that require the data being accessed.