In general, cache is used to duplicate a certain part of main memory, so that the duplicated part in the cache can be accessed by a processor core or a central processing unit (CPU) core in a short amount of time and thus to ensure continued pipeline operation of the processor core.
Currently, cache addressing is based on the following ways. First, an index part of an address is used to read out a tag from a tag memory. At the same time, the index and an offset part of the address are used to read out contents from the cache. Further, the tag from the tag memory is compared with a tag part of the address. If the tag from the tag memory is the same as the tag part of the address, called a cache hit, the contents read out from the cache are valid. Otherwise, if the tag from the tag memory is not the same as the tag part of the address, called a cache miss, the contents read out from the cache are invalid. For a multi-way set associative cache, the above operations are performed in parallel on each set to detect which way has a cache hit. Contents read out from the set with the cache hit are valid. If all sets experience cache misses, contents read out from any set are invalid. After a cache miss, cache control logic fills the cache with contents from lower level storage medium.
Cache miss can be divided into three types: compulsory miss, conflict miss, and capacity miss. Under existing cache structures, except a small amount of pre-fetched contents, compulsory miss is inevitable. But, the current prefetching operation carries a not-so-small penalty. Further, while a multi-way set associative cache may help reduce conflict misses, the number of way set associative cannot exceed a certain number due to power and speed limitations (e.g., the set-associative cache structure requires that contents and tags from all cache sets addressed by the same index are read out and compared at the same time). Further, with the goal for cache memories to match the speed of the processor core, it is difficult to increase cache capacity. Thus, multiple layers of cache are created, with a lower layer cache having a larger capacity but a slower speed than a higher layer cache.