Many portable products, such as cell phones, laptop computers, personal digital assistants (PDAs) and the like, utilize a processing system that executes programs, such as communication and multimedia programs. A processing system for such products may include multiple processors, complex memory systems including multiple levels of caches and memory for storing instructions and data, controllers, peripheral devices such as communication interfaces, and fixed function logic blocks configured, for example, on a single chip. At the same time, portable products have a limited energy source in the form of batteries that are often required to support high performance operations by the processing system and increasingly large memory capacities as functionality increases. To improve battery life, it is desirable to perform these operations as efficiently as possible. However, the scaling of common memory platforms, such as static random access memory (SRAM) and embedded dynamic random access memory (eDRAM), is increasingly constrained by leakage power and cell density. Such concerns extend to personal computer products, which are also being developed with efficient designs to operate with reduced overall energy consumption.
A number of memory technologies, such as flash memory, magnetoresistive random access memory (MRAM), phase change memory (PCM), resistive RAM (ReRAM), and others, have various limits on the number of write operations that can be performed to the device before memory cells begin to wear out and fail. Memories such as caches, which operate based on principles of spatial and temporal locality and at high data rates, show a wide variability in cache line accesses from program to program and may have cells that experience a very high rate of write accesses. For example, set associative caches, which have a plurality of sets of data, each set divided into a plurality of selectable cache ways and each way in a set holding a cache line, experience different write access patterns within each set depending on the program in execution. Also, the write access variability from line to line in a set may be very large and may vary dynamically during system operations.
For example, an 8-way set associative 64 kbyte cache may be constructed having 256 sets of eight 32-byte cache lines per set, with one cache line accessed for each way. Such a cache may be used as a level 1 data cache in a portable device, such as a cell phone, tablet, laptop and the like. In the eight-way set associative cache, for a program X, write access to a line of data in way 2 may occur multiple orders of magnitude more often than write access to a line of data in a different cache way, such as way 7. Thus, the line of memory internal to the cache for way 2 may fail much earlier than the memory line in way 7 and most other lines in the cache having write access rates lower than the write access rate of way 2. Memory wear endurance affects each level of a memory hierarchy, such as level 1, 2, and 3 caches, flash memory, and system memory, though to different degrees in each level and each device. With processors running at gigahertz (GHz) frequencies, caches experience a large number of write accesses, which may be concentrated on specific lines in the cache, and such locality of accesses may cause a system to approach the write limits of the cache memory too rapidly. Since any cache line may experience a high rate of write operations depending on the program in execution, and such a cache line hotspot is not known in advance, the cost of monitoring every cache line to determine which cache line in a cache set is affected and should be relocated to reduce wear may be prohibitive.
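The geometry of the example cache, and a rough sense of why per-line monitoring may be costly, can be checked with a short sketch. The address split into tag, set index, and byte offset is the conventional one for a set associative cache and is assumed here rather than taken from the text. The function names and the 32-bit counter width are likewise hypothetical, chosen only for illustration.

```python
# Example cache from the text: 64 kbyte, 8-way set associative, 32-byte lines.
CACHE_BYTES = 64 * 1024
WAYS = 8
LINE_BYTES = 32

NUM_LINES = CACHE_BYTES // LINE_BYTES  # total cache lines in the cache
NUM_SETS = NUM_LINES // WAYS           # sets, each holding one line per way

# Bit widths of the address fields (both sizes are powers of two).
OFFSET_BITS = LINE_BYTES.bit_length() - 1
INDEX_BITS = NUM_SETS.bit_length() - 1


def decompose(addr):
    """Split an address into (tag, set_index, byte_offset).

    Assumes the conventional layout: low bits select the byte in the
    line, the next bits select the set, and the remaining bits are the tag.
    """
    offset = addr & (LINE_BYTES - 1)
    set_index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, set_index, offset


def counter_overhead_bits(bits_per_counter):
    """Storage needed for one write counter per cache line.

    The counter width is a hypothetical parameter, not a value from the text.
    """
    return NUM_LINES * bits_per_counter


if __name__ == "__main__":
    print("sets:", NUM_SETS, "lines:", NUM_LINES)
    print("decompose(0x12345):", decompose(0x12345))
    print("counter bytes at 32 bits each:", counter_overhead_bits(32) // 8)
```

Under these assumptions, one 32-bit write counter per line would occupy 8 kbytes, or one eighth of the data capacity of the 64 kbyte cache, before accounting for the logic needed to compare and act on the counts.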