Many portable products, such as cell phones, laptop computers, personal digital assistants (PDAs), or the like, utilize a processor executing programs, such as communication and multimedia programs. The processing system for such products includes a processor and a memory complex for storing instructions and data. Large capacity main memory commonly has slow access times as compared to the processor cycle time. As a consequence, the memory complex is conventionally organized in a hierarchy of cache memories based on capacity and performance, with the highest performance and lowest capacity cache located closest to the processor. For example, a level 1 (L1) instruction cache and a level 1 data cache would generally be directly attached to the processor, while a level 2 (L2) unified cache is connected to the L1 instruction and data caches. Further, a system memory is connected to the L2 unified cache. The L1 instruction cache commonly operates at the processor speed, and the L2 unified cache operates slower than the L1 cache but has a faster access time than that of the system memory. Alternative memory organizations abound, for example, memory hierarchies having a level 3 cache in addition to an L1 and an L2 cache. Another memory organization may use only a level 1 cache and a system memory.
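By way of illustration only, the cost of an access through such a hierarchy may be sketched as follows. The function name `access_cycles`, the representation of each level as a set of line addresses, and all latency figures are assumptions made for this sketch and are not taken from any particular processor.

```python
def access_cycles(line, levels, memory_cycles):
    """Return the cycles spent locating `line`.

    `levels` is a list of (contents, latency_cycles) pairs ordered from the
    cache closest to the processor outward. Each level is assumed to be
    probed in turn, so a miss at one level adds that level's latency before
    the next level is tried. All latencies are hypothetical.
    """
    total = 0
    for contents, latency in levels:
        total += latency
        if line in contents:
            return total  # hit at this level
    # Missed in every cache level: fall through to system memory.
    return total + memory_cycles


# Hypothetical two-level hierarchy in front of a system memory.
l1 = ({1}, 1)        # small and fast, closest to the processor
l2 = ({1, 2}, 10)    # larger and slower
print(access_cycles(1, [l1, l2], 100))
print(access_cycles(2, [l1, l2], 100))
print(access_cycles(3, [l1, l2], 100))
```

Under these assumed latencies, a line found only in system memory costs far more than a line found in the L1 cache, which is the motivation for placing the fastest cache closest to the processor.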
A memory organization may be made up of a hierarchy of caches operating as inclusive caches, strictly inclusive caches, exclusive caches, or a combination of these cache types. By definition herein, any two levels of cache that are exclusive to each other cannot contain the same cache line. Any two levels of cache that are inclusive of each other may contain the same cache line. For any two levels of cache that are strictly inclusive of each other, the larger cache, usually a higher level cache, must contain all lines that are in the smaller cache, usually a lower level cache. In a multi-level cache memory organization having three or more levels, any two or more cache levels may operate as one type of cache, such as exclusive, and the remaining cache levels may operate as one of the alternative types of cache, such as inclusive.
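The three definitions above may be expressed as constraints on the sets of line addresses held by two cache levels. The following is a minimal sketch; the names `Policy` and `contents_allowed` are hypothetical and are introduced only for illustration.

```python
from enum import Enum


class Policy(Enum):
    EXCLUSIVE = "exclusive"
    INCLUSIVE = "inclusive"
    STRICTLY_INCLUSIVE = "strictly inclusive"


def contents_allowed(policy, lower, higher):
    """Check whether two levels' contents are consistent with `policy`.

    `lower` and `higher` are sets of line addresses held by the smaller
    (lower level) cache and the larger (higher level) cache, respectively.
    """
    if policy is Policy.EXCLUSIVE:
        # No line may be present in both levels.
        return lower.isdisjoint(higher)
    if policy is Policy.STRICTLY_INCLUSIVE:
        # Every line in the lower level must also be in the higher level.
        return lower <= higher
    # Inclusive: the same line may appear in both levels, but need not.
    return True


print(contents_allowed(Policy.EXCLUSIVE, {1, 2}, {2, 3}))
print(contents_allowed(Policy.STRICTLY_INCLUSIVE, {1, 2}, {1, 2, 3}))
```

Note that, under these definitions, the inclusive case places no constraint on the contents, whereas the exclusive and strictly inclusive cases each impose one.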
An instruction cache is generally constructed to support a plurality of instructions located at a single address in the instruction cache. A data cache is generally constructed to support a plurality of data units located at a single address in the data cache, where a data unit may be a variable number of bytes depending on the processor. This plurality of instructions or data units is generally called a cache line or simply a line. For example, a processor fetches an instruction or a data unit from an L1 cache, and if the instruction or data unit is present in the cache, a “hit” occurs and the instruction or data unit is provided to the processor. If the instruction or data unit is not present in the L1 cache, a “miss” occurs. A miss may occur on an instruction or data unit access anywhere in a cache line. When a miss occurs, a line in the cache is replaced with a new line containing the missed instruction or data unit. A replacement policy is used to determine which cache line to replace. For example, selecting or victimizing the cache line that has gone the longest without being accessed represents a least recently used (LRU) policy. The cache line selected to be replaced is the victim cache line.
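The hit, miss, and victim selection behavior described above may be sketched as follows. For brevity the sketch assumes a fully associative cache and a 32-byte line; the class name `LruCache` and the constant `LINE_BYTES` are hypothetical, and the victim is taken to be the line whose most recent access is the oldest.

```python
from collections import OrderedDict

LINE_BYTES = 32  # assumed line size; actual sizes vary by design


class LruCache:
    """Fully associative cache that tracks line addresses in LRU order."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()  # line address -> None, oldest access first

    def access(self, byte_address):
        """Return (hit, victim), where victim is a line address or None."""
        line = byte_address // LINE_BYTES  # any byte in the line maps here
        if line in self.lines:
            self.lines.move_to_end(line)  # mark as most recently used
            return True, None
        victim = None
        if len(self.lines) >= self.num_lines:
            # Cache is full: victimize the least recently used line.
            victim, _ = self.lines.popitem(last=False)
        self.lines[line] = None  # allocate the new line
        return False, victim


cache = LruCache(num_lines=2)
for address in (0, 4, 32, 0, 64):
    print(address, cache.access(address))
```

In the usage shown, addresses 0 and 4 fall in the same line, so the second access hits. The final access misses in a full cache and victimizes the line holding address 32, because the line holding address 0 was accessed more recently.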
A cache line may also have associated with it a number of status bits, such as a valid bit and a dirty bit. The valid bit indicates that valid instructions or data reside in the cache line. The dirty bit indicates whether a modification to the cache line has occurred. In a write-back cache, a set dirty bit indicates that, when the cache line is to be replaced, the modifications need to be written back to the next higher memory level in the memory system hierarchy.
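The role of the valid and dirty bits in a write-back cache may be illustrated with the following minimal sketch. The names `CacheLine`, `write`, and `replace`, and the use of a dictionary to stand in for the next higher memory level, are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CacheLine:
    tag: int = 0
    valid: bool = False  # line holds valid instructions or data
    dirty: bool = False  # line has been modified since it was allocated
    data: bytes = b""


def write(line, data):
    """Modify a line in place and record the modification."""
    line.data = data
    line.dirty = True


def replace(line, new_tag, new_data, next_level):
    """Replace a line's contents, writing back first if required.

    `next_level` is a dict standing in for the next higher memory level.
    Returns True if a write-back was performed.
    """
    wrote_back = bool(line.valid and line.dirty)
    if wrote_back:
        next_level[line.tag] = line.data  # preserve the modifications
    line.tag, line.data = new_tag, new_data
    line.valid, line.dirty = True, False  # the new line starts out clean
    return wrote_back


l2 = {}
line = CacheLine()
print(replace(line, 7, b"old", l2))  # invalid line: nothing to write back
write(line, b"new")
print(replace(line, 9, b"other", l2), l2)
```

A line that is valid but not dirty is simply overwritten in this sketch, since the next higher memory level is assumed to already hold an identical copy.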
A victim cache may be a separate buffer connected to a cache, such as a level 1 cache, or may be integrated in an adjacent higher level cache. Victim cache lines may be allocated in the victim cache under the assumptions that a victim line may be needed relatively shortly after being evicted and that accessing the victim line when needed from a victim cache is faster than accessing the victim line from a higher level of the memory hierarchy. With a victim cache integrated in an adjacent higher level cache, a castout occurs when a line is displaced from the lower level cache and is allocated in the higher level cache, thus caching the lower level cache's victims. The lower level cache sends all displaced lines, both dirty and non-dirty, to the higher level cache. In some cases, the victim line may already exist in the victim cache, and rewriting already existing lines wastes power and consumes bandwidth to the victim cache.
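The cost of sending every displaced line to the higher level cache may be illustrated with the following sketch, which counts how many castout writes rewrite a line that the higher level cache already holds. The function name `cast_out_all` is hypothetical, and the sketch assumes that a non-dirty victim already present in the higher level cache is identical to the copy held there, so that only such writes are counted as redundant.

```python
def cast_out_all(victims, higher_level):
    """Cast out every displaced line and count the redundant writes.

    `victims` is an iterable of (line_address, dirty) pairs displaced from
    the lower level cache. `higher_level` is the set of line addresses held
    by the higher level cache and is updated in place.
    Returns (writes, redundant_writes).
    """
    writes = 0
    redundant = 0
    for address, dirty in victims:
        writes += 1  # every victim is sent, dirty or not
        if address in higher_level and not dirty:
            # The higher level already holds an identical copy.
            redundant += 1
        higher_level.add(address)
    return writes, redundant


l2_lines = {0x10, 0x20}
victims = [(0x10, False), (0x20, True), (0x30, False)]
print(cast_out_all(victims, l2_lines))
```

In this usage, three lines are cast out and one of the three writes is redundant: the non-dirty line at 0x10 was already present. The dirty line at 0x20 carries modifications, and the line at 0x30 was not previously held.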