The present invention relates to an architecture for cache hierarchies in a computing system.
Contemporary processor designs often address bandwidth constraints imposed by system buses by providing on-chip or on-package cache memory. Various caching schemes are known. Some schemes provide a single unified cache that stores both instruction data and variable data. Other schemes provide multi-level cache hierarchies in which the lowest-level cache, the cache closest to the processor core, is very fast but very small when measured against higher-level caches. The higher-level caches may be progressively larger and, therefore, slower than their lower-level counterparts, but their greater capacity tends to reduce data eviction rates and extend the useful life of cached data. To improve hit rates and to avoid cache pollution, still other caching schemes provide separate instruction and data caches, which achieve higher hit rates because of the greater locality found within the code stream and within the data stream of typical applications.
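By way of illustration only, the behavior of a two-level hierarchy of the kind described above may be sketched as follows. The sketch assumes fully associative caches with least-recently-used replacement, and the names `LRUCache` and `load` and the capacities shown are hypothetical choices made for the example; none of them is drawn from any particular known design.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative cache with LRU replacement (an assumed, simplified model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # line address -> placeholder, oldest first
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Return True on a hit. On a miss, fill the line, evicting the LRU line if full."""
        if addr in self.lines:
            self.lines.move_to_end(addr)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[addr] = True
        return False


def load(l1, l2, addr):
    """Consult the small L1 first and the larger L2 only on an L1 miss."""
    if l1.access(addr):
        return "L1"
    if l2.access(addr):
        return "L2"
    return "memory"


# Hypothetical sizes: a 2-line L1 backed by an 8-line L2.
l1 = LRUCache(capacity=2)
l2 = LRUCache(capacity=8)

# A working set of three lines cycles through and overflows the L1.
trace = [0, 1, 2, 0, 1, 2]
results = [load(l1, l2, a) for a in trace]
print(results)
```

In this trace the working set of three lines exceeds the capacity of the small cache, so every line is evicted from it before being reused, while the larger cache retains all three lines and satisfies each repeated request.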
Future microprocessors may employ superscalar designs in which multiple instructions are executed in a single cycle. In this domain, a cache hierarchy that responds to only a single data request per cycle may become a bandwidth-limiting element within the processor. Although some known cache schemes, such as multi-ported or multi-banked caches, can accommodate multiple load requests in a single cycle, these known schemes are difficult to implement. Multi-ported architectures are complex and difficult to build because separate circuitry is required to accommodate each additional port. Multi-banked architectures introduce complex circuitry to recognize and manage bank conflicts. Accordingly, the inventors perceive a need in the art for a low-complexity cache hierarchy in which a single level of cache may respond efficiently to multiple load requests per cycle.
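By way of illustration only, the bank-conflict problem of a known multi-banked cache may be sketched as follows. The sketch assumes that cache lines are interleaved across banks by line address and that each bank services at most one load per cycle. The constants `LINE_BYTES` and `NUM_BANKS` and the functions `bank_of` and `schedule` are hypothetical and are chosen only for the example.

```python
LINE_BYTES = 64  # assumed cache line size in bytes
NUM_BANKS = 4    # assumed number of banks


def bank_of(addr):
    """Map a byte address to a bank, assuming line-interleaved banking."""
    return (addr // LINE_BYTES) % NUM_BANKS


def schedule(loads):
    """Split same-cycle loads into those serviced now and those deferred.

    A load is deferred when an earlier load in the same cycle has already
    claimed its bank, which is the bank conflict the hardware must detect.
    """
    busy = set()
    serviced, deferred = [], []
    for addr in loads:
        bank = bank_of(addr)
        if bank in busy:
            deferred.append(addr)
        else:
            busy.add(bank)
            serviced.append(addr)
    return serviced, deferred


# Addresses 0x000 and 0x100 both map to bank 0; 0x040 maps to bank 1.
serviced, deferred = schedule([0x000, 0x040, 0x100])
print([hex(a) for a in serviced], [hex(a) for a in deferred])
```

Even in this simplified form, every load issued in a cycle must be compared against the banks already claimed in that cycle, which suggests why conflict recognition and management add circuitry as the number of simultaneous requests grows.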