Achieving high performance in microprocessors through hardware techniques usually requires complex structures that consume relatively high power. The first-level data cache is one such structure, and it epitomizes the tension in processor design between power and performance. For instance, adding an extra read port to a data cache allows multiple loads to be issued in parallel, enhancing performance. However, the extra port increases power consumption and requires more complex circuitry.
Alternatively, instead of providing more data cache bandwidth by brute force, the total number of data cache accesses may be reduced to conserve power. In some implementations, if the number of data cache accesses is substantially reduced, a processor with a single-read-port data cache may perform as well as, or close to, a processor with a two-read-port data cache without dissipating the extra power.
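The bandwidth argument can be illustrated with a toy model. The following sketch is illustrative only and does not describe any particular processor: the function name `cycles_to_issue`, the grouping of loads by cycle, and the `skip_fraction` parameter are all assumptions introduced here. It counts how many cycles are needed to serve groups of ready loads when each read port can serve one cache access per cycle.

```python
from math import ceil

def cycles_to_issue(loads_per_group, ports, skip_fraction=0.0):
    """Toy model (hypothetical): cycles needed to serve groups of ready loads.

    loads_per_group: number of loads that become ready together, per group.
    ports:           number of data cache read ports (one access per port per cycle).
    skip_fraction:   assumed fraction of loads in each group that need no cache access.
    """
    total_cycles = 0
    for n in loads_per_group:
        # Loads identified as not needing the cache never occupy a read port.
        needing_cache = n - int(n * skip_fraction)
        # Every group takes at least one cycle, even if no load reaches the cache.
        total_cycles += max(1, ceil(needing_cache / ports))
    return total_cycles

groups = [2, 2, 2, 2]  # assumed workload: two loads ready in each of four groups
print(cycles_to_issue(groups, ports=1))                     # 8
print(cycles_to_issue(groups, ports=2))                     # 4
print(cycles_to_issue(groups, ports=1, skip_fraction=0.5))  # 4
```

Under these assumptions, removing half of the cache accesses lets the single-port configuration match the cycle count of the two-port configuration. Real workloads and pipelines are considerably more complex than this model.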
One way to reduce cache accesses is to identify loads that do not require a cache access. As a first example, a hardware predictor may be utilized to identify these loads. However, the complex circuitry of a hardware predictor, and the power that such logic consumes, may offset the savings and potentially result in more power consumption overall.