Computer systems operate by executing instruction sequences that form a computer program. These instruction sequences are stored in a memory subsystem, along with any data operated on by the instructions, and both are retrieved as necessary by a processor, such as a central processing unit (CPU). The speed of CPUs has increased at a much faster rate than that of the memory subsystems upon which they rely for data and instruction code, and as such, memory subsystems can be a significant performance bottleneck. While one solution to this bottleneck would be to use only very fast memory, such as static random-access memory (SRAM), throughout a computer system, the cost of such memory would be prohibitive. In order to balance cost with system performance, memory subsystem architecture is typically organized in a hierarchical structure, with faster, more expensive memory operating near the processor at the top, slower, less expensive memory operating as storage memory at the bottom, and memory having an intermediate speed and cost operating in the middle of the memory hierarchy.
Additional techniques can be implemented to further improve the efficiency of a memory hierarchy. For example, cache buffering of data between memory levels can reduce the frequency with which lower-speed memory is accessed. Additionally, differences in the granularity of a line of data between memory levels of the hierarchy can increase the efficiency of data movement between these levels by increasing the amount of data that is written to, or read from, the slower memory tier per memory request.
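As a rough illustration of the two techniques just described, the following minimal sketch counts how many requests reach a slower memory tier when a small buffer sits in front of it. It is a simplified model and does not describe any particular hardware: the direct-mapped organization, the function name `count_slow_requests`, the 1024-byte sequential access pattern, and the line sizes of 16 and 64 bytes are all assumptions chosen for illustration.

```python
def count_slow_requests(addresses, line_size, num_lines):
    """Count the requests that reach the slower tier.

    Hypothetical model: a direct-mapped buffer with `num_lines` slots,
    each holding one line of `line_size` bytes fetched from slower memory.
    """
    cache = [None] * num_lines  # each slot records which line it currently holds
    slow_requests = 0
    for addr in addresses:
        line = addr // line_size   # line of the slower tier containing this byte
        slot = line % num_lines    # direct-mapped placement (an assumption)
        if cache[slot] != line:    # miss: the whole line is fetched from the slower tier
            slow_requests += 1
            cache[slot] = line
    return slow_requests


# Assumed workload: read 1024 consecutive byte addresses once.
sequential = list(range(1024))

unbuffered = len(sequential)  # without buffering, every access reaches the slower tier
small_lines = count_slow_requests(sequential, line_size=16, num_lines=8)
large_lines = count_slow_requests(sequential, line_size=64, num_lines=8)

print(unbuffered, small_lines, large_lines)
```

Under these assumptions, buffering alone reduces the number of slow-tier requests, and the coarser line reduces it further because each request moves more data. The benefit depends on the access pattern: a workload with little spatial locality could see much smaller gains from a larger line.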