Providing design flexibility in a cache by allowing a variety of size and associativity choices, while maintaining the speed with which the cache locates or stores a requested element, may be highly advantageous for architectures that utilize a cache. Traditionally, three types of cache organization have been used: fully associative, k-way set associative, and direct mapped.
In a fully associative cache organization, each item of information from main system memory is stored as a unique cache entry. There is usually no relationship between the location of the information in the cache and its original location in main system memory. Since each storage location can hold information from any location in main system memory, complex and expensive cache comparison logic may be required to map the complete main system memory space. Furthermore, whenever a processor makes a memory request, each entry in a fully associative cache must be checked to see whether the requested information is present (a hit), which forces a fully associative cache to stay extremely small so as not to induce extremely long wait states in processing cycles.
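The lookup cost described above can be illustrated with a minimal sketch. The class name FullyAssociativeCache, its methods, and the naive replacement choice are hypothetical and introduced only for illustration; the model treats each entry as holding one memory location and simply reports how many entries must be compared per request.

```python
class FullyAssociativeCache:
    """Toy model: any memory location may occupy any entry."""

    def __init__(self, entries: int):
        self.slots = [None] * entries  # None marks an empty entry

    def lookup(self, location: int):
        """Return (hit, comparisons); every entry takes part in the check."""
        hit = any(slot == location for slot in self.slots)
        return hit, len(self.slots)

    def fill(self, location: int) -> None:
        """Store a location in the first free entry (assumed policy)."""
        if location in self.slots:
            return
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = location
                return
        self.slots[0] = location  # naive replacement, for illustration only


cache = FullyAssociativeCache(entries=4)
cache.fill(1000)
print(cache.lookup(1000))  # (True, 4)
print(cache.lookup(7))     # (False, 4)
```

The comparison count grows with the number of entries, which is the cost that keeps such a cache small.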
The k-way set associative caches allow larger caches to be designed at a lower cost than fully associative caches, because less complex and less expensive comparison logic is needed. Typically, a set associative cache divides the cache memory into k banks of memory, which are also known as k ways. To give a simplified example, if a 128 KB set associative cache has 4 ways, then each way may be 32 KB in size. Usually, a set associative cache sees memory as logically broken up into pages, each of which may be the size of one way. Continuing the example from above, a 256 KB main system memory may be logically viewed by the cache as 8 pages, each having a size of 32 KB.
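The arithmetic in this example can be checked with a short sketch. The function names way_size and page_count are hypothetical, and the sketch assumes the cache is split evenly among its ways.

```python
KB = 1024


def way_size(cache_bytes: int, ways: int) -> int:
    """Size of one way when the cache is split evenly into `ways` banks."""
    if cache_bytes % ways != 0:
        raise ValueError("cache size must divide evenly among the ways")
    return cache_bytes // ways


def page_count(memory_bytes: int, page_bytes: int) -> int:
    """Number of way-sized pages the cache logically sees in main memory."""
    return -(-memory_bytes // page_bytes)  # ceiling division


way = way_size(128 * KB, 4)        # 128 KB cache, 4 ways
pages = page_count(256 * KB, way)  # 256 KB main system memory
print(way // KB, pages)            # 32 8
```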
Each location within a page of memory may be stored only at the corresponding location within each of the k ways; for instance, the first location of a page may be stored only in the first location of a way. Therefore, in the example above, the first memory location in all 8 pages may be stored in only the first entry of any of the 4 ways. When a memory request is made, the set associative cache compares the request against only the one location, in each of the ways, at which the requested information could be stored. Since the set associative cache need only compare a single location within each of the ways, lookup times for memory requests may be much shorter than those of a fully associative cache. These faster lookup times allow larger set associative caches to be designed. However, the ability to compare locations in multiple ways still requires complex and expensive comparison logic. For example, a 19 k low level cache with an associativity of 19 may require 19 tag comparators and 19 ECC detectors. Because of the high cost of this circuitry, typically only a 4 way set associative cache with a cache size of 16 k may be designed for a low level cache. The smaller number of ways may limit flexibility in the total size of the cache and may forfeit the extra 3 k of room in the low level cache.
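A minimal sketch of this lookup rule follows. The class SetAssociativeCache, its method names, and the replacement choice (always evicting way 0) are hypothetical simplifications; each cache entry is assumed to hold exactly one memory location, and the page number serves as the tag.

```python
class SetAssociativeCache:
    """Toy k-way model: a location may live only at its own index, in any way."""

    def __init__(self, ways: int, entries_per_way: int):
        self.ways = ways
        self.entries_per_way = entries_per_way
        # tags[w][i] is the page number held at entry i of way w, or None
        self.tags = [[None] * entries_per_way for _ in range(ways)]

    def split(self, location: int):
        """Return (page number, index within the page)."""
        return divmod(location, self.entries_per_way)

    def lookup(self, location: int) -> bool:
        """Compare only one entry per way: k comparisons in total."""
        page, index = self.split(location)
        return any(self.tags[w][index] == page for w in range(self.ways))

    def fill(self, location: int) -> None:
        page, index = self.split(location)
        if self.lookup(location):
            return
        for w in range(self.ways):
            if self.tags[w][index] is None:
                self.tags[w][index] = page
                return
        self.tags[0][index] = page  # naive replacement, for illustration only


cache = SetAssociativeCache(ways=4, entries_per_way=8)
for page in range(4):
    cache.fill(page * 8)   # first location of pages 0..3
print(cache.lookup(0))     # True
cache.fill(4 * 8)          # a fifth page competes for the same index
print(cache.lookup(0))     # False
```

Four pages can share the first index at once, one per way; a fifth forces a replacement.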
In addition to requiring complex and expensive comparison logic, the traditional set associative cache limits the choices of associativity available for a cache of a given size. As an example, a traditional 16 k cache may typically be designed only with an associativity of 1, 2, 4, 8, or 16.
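One way to see where this restriction comes from is to assume that each way must contain a power-of-two number of sets and that the cache size is counted in entries; both are assumptions of this sketch, as is the cap on the number of ways. The function name power_of_two_set_options is hypothetical.

```python
def power_of_two_set_options(total_entries: int, max_ways: int):
    """Associativities k <= max_ways for which total_entries / k is a power of two."""
    options = []
    for k in range(1, max_ways + 1):
        if total_entries % k:
            continue  # the entries do not divide evenly among k ways
        sets = total_entries // k
        if sets & (sets - 1) == 0:  # true only when sets is a power of two
            options.append(k)
    return options


print(power_of_two_set_options(16 * 1024, 16))  # [1, 2, 4, 8, 16]
print(power_of_two_set_options(19 * 1024, 19))  # [19]
```

Under these assumptions a 16 k cache admits only power-of-two associativities, while a 19 k cache admits no associativity below 19.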
One variation of the set associative cache that may reduce the complexity and cost of the comparison logic is a direct mapped cache, which is effectively a one way set associative cache. Similar to the set associative cache, the direct mapped cache may view memory as broken into pages the size of the single way. From the example above, a 128 KB cache may have a single 128 KB way and may logically view a 256 KB memory as broken into two pages of 128 KB.
Yet, direct mapped caches may have limitations for some applications. First, when a program alternately accesses two memory locations that fall at the same position within separate logically viewed pages, the direct mapped cache may have to replace that entry on every memory request (also known as thrashing). Thrashing can eliminate the benefit of having a cache for those accesses. Second, a direct mapped cache may be limited to a power of two in size, since it contains only 1 way of 2^s sets. As an example, if a processor has room on the die for a six megabyte (6M) cache and a direct mapped organization is used, only a 4M cache may be implemented.
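Both limitations can be illustrated with a small sketch. The function names are hypothetical, and the model assumes each cache entry holds exactly one memory location, with the page number used as the tag.

```python
MB = 1024 * 1024


def largest_power_of_two_at_most(n: int) -> int:
    """Largest power of two that does not exceed n."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n.bit_length() - 1)


def direct_mapped_misses(accesses, sets: int) -> int:
    """Count misses in a one-way cache with `sets` entries."""
    tags = [None] * sets  # page number currently held at each index
    misses = 0
    for location in accesses:
        page, index = divmod(location, sets)
        if tags[index] != page:
            tags[index] = page  # replace whatever was stored there
            misses += 1
    return misses


# Room for 6M on the die, but only a power-of-two size fits the organization.
print(largest_power_of_two_at_most(6 * MB) // MB)  # 4

# Locations 0 and 8 share index 0 in an 8-entry cache, so every access misses.
print(direct_mapped_misses([0, 8] * 5, sets=8))    # 10
# Locations 0 and 1 use different indices, so only the first two accesses miss.
print(direct_mapped_misses([0, 1] * 5, sets=8))    # 2
```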