The present application relates generally to an improved data processing apparatus and method and more specifically to mechanisms for predicting which instruction cache ways are going to be utilized in a high power application and powering up only those instruction cache ways.
An instruction cache is a memory associated with a processor, which is smaller and faster than main memory, and is used to increase the speed at which the processor fetches instructions for execution. Instructions are typically retrieved from main memory and placed in the instruction cache so that they may be more quickly fetched by the processor when needed. Instruction caches may be direct mapped caches, fully associative caches, or set associative caches. A direct mapped cache is one in which each location in main memory can be cached in only one cache location. A fully associative cache is one in which each location in main memory can be cached in any cache location. A set associative cache is one in which each location in main memory can be cached in any one of a subset of the cache locations in the cache, e.g., a 2-way set associative cache allows a location in main memory to be cached in one of 2 cache locations within the cache. When searching a cache for a particular instruction, in a direct mapped cache the processor only needs to search one cache location. In a fully associative cache, the processor would need to search all of the cache locations. In a set associative cache, the processor only needs to search a subset of cache locations, the size of which is the number of “ways,” or cache lines, in which the memory location could be stored in the cache. Many modern instruction cache structures use a set associative cache approach.
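By way of illustration only, the three cache organizations described above may be sketched as a single function that returns the cache locations in which a given memory address could reside. The function name `candidate_locations`, the 64-byte line size, and the modulo-based set selection are assumptions made for this sketch and are not taken from any particular architecture.

```python
def candidate_locations(address, num_lines, ways, line_size=64):
    """Return the cache line slots that may hold `address`.

    ways == 1          -> direct mapped (exactly one candidate)
    ways == num_lines  -> fully associative (every slot is a candidate)
    otherwise          -> set associative (`ways` candidates in one set)

    Assumption: the set is selected by the memory block number modulo
    the number of sets, and the slots of a set are numbered contiguously.
    """
    if num_lines % ways != 0:
        raise ValueError("num_lines must be a multiple of ways")
    num_sets = num_lines // ways
    block = address // line_size       # memory block containing the address
    set_index = block % num_sets       # set selected by the block number
    first = set_index * ways           # first slot belonging to that set
    return list(range(first, first + ways))


if __name__ == "__main__":
    addr = 0x1040
    print(candidate_locations(addr, num_lines=8, ways=1))  # direct mapped
    print(candidate_locations(addr, num_lines=8, ways=2))  # 2-way set associative
    print(candidate_locations(addr, num_lines=8, ways=8))  # fully associative
```

In this sketch the number of locations to be searched grows from one (direct mapped) to the number of ways (set associative) to the entire cache (fully associative), which mirrors the search cost described above.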
With modern computing systems, the instruction cache can represent a significant amount of power consumption in the processor. Taking a set associative cache as an example, because it cannot be known ahead of time which ways, i.e., cache lines, of the cache are going to be needed for a particular instruction fetch, all of the possible ways must be powered up. Thus, for example, in a 4-way set associative cache, i.e., a cache in which a memory location may be in any one of 4 possible cache locations, all 4 ways must be powered up in order to determine whether the desired instruction is present in any one of the 4 ways. Because of this, as well as other timing and performance reasons, the instruction cache accounts for a substantial share of the power consumed by the instruction fetch unit of the processor. For example, in some architectures, the instruction cache can represent approximately 30% of the power consumption of the instruction fetch unit.
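The cost of powering up every way on each fetch may be illustrated with a minimal software model. This is a sketch under stated assumptions, not a description of any actual hardware: the class name `SetAssociativeICache`, the `way_activations` counter, the 64-set and 64-byte-line geometry, and the `fill` helper are all hypothetical and exist only to make the accounting visible.

```python
class SetAssociativeICache:
    """Toy model of a set associative instruction cache (hypothetical)."""

    def __init__(self, num_sets=64, ways=4, line_size=64):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        # tags[set][way] holds the tag cached in that way, or None if empty.
        self.tags = [[None] * ways for _ in range(num_sets)]
        # Total number of way activations across all fetches.
        self.way_activations = 0

    def _split(self, address):
        """Split an address into (set index, tag)."""
        block = address // self.line_size
        return block % self.num_sets, block // self.num_sets

    def fill(self, address, way):
        """Place the line containing `address` into the given way."""
        set_index, tag = self._split(address)
        self.tags[set_index][way] = tag

    def fetch(self, address):
        """Return the way that hits, or None on a miss.

        The hit way is not known in advance, so every way of the
        selected set is counted as powered up for this fetch.
        """
        set_index, tag = self._split(address)
        self.way_activations += self.ways
        for way in range(self.ways):
            if self.tags[set_index][way] == tag:
                return way
        return None


if __name__ == "__main__":
    cache = SetAssociativeICache()
    cache.fill(0x4000, way=2)
    for _ in range(10):            # the same instruction fetched repeatedly
        cache.fetch(0x4000)
    print(cache.way_activations)   # ways powered up across the 10 fetches
```

In this model, ten fetches of the same address activate forty ways in total even though the instruction resides in the same single way each time, which is the inefficiency described above.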
For many modern processor chips, the frequency at which the processor can operate is dictated by how much power is consumed by worst-case applications. These worst-case applications are generally high performance scientific applications. To simulate these worst-case applications, a power benchmark is typically used. These power benchmarks, as well as many other applications used during actual operation of the processor, are essentially code comprising a loop that executes over and over again to stress the resources of the processor.