Processors and memories are key components of a computer system, which performs operations based on the instructions and data it is given. Because a processor is usually faster than its storage memory, the processor spends a substantial amount of time waiting for the memory to respond to each memory request. System performance can degrade as the gap between the operating speeds of the processor and the memory widens. Fast memory is crucial to the performance of computer systems, but it is expensive to manufacture. A trade-off solution to this problem is to place layers of fast local storage memory, namely cache memory, with different speeds and capacities between the processors and the main storage devices.
Cache memory is built with fast memory technology. Because it is expensive, it is usually built with a small capacity relative to the main memory to save cost. A cache mirrors several segments of the main memory so that the processor can retrieve data from the cache, which has a faster cycle time.
In general, a cache nearer to a processor is built to operate faster and is more costly than a cache further down the memory hierarchy. The cache closest to a processor is called a level 1 (L1) cache. It is followed by a level 2 (L2) cache, and the level number increases moving down the memory hierarchy. For a cache at any level, the adjacent cache located closer to the processor end of the hierarchy is referred to as its upstream cache. A downstream cache is the adjacent cache located closer to the main memory end of the hierarchy. For example, an L1 cache is the upstream cache with respect to an L2 cache, and the L2 cache is the downstream cache with respect to the L1 cache.
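The upstream/downstream relationship described above can be illustrated with a small sketch. This is only an illustration under simplifying assumptions: the class name CacheLevel, its fields, and the dictionary standing in for main memory are hypothetical, and capacity limits and eviction are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CacheLevel:
    """Hypothetical model of one cache level (unbounded, no eviction)."""
    name: str
    downstream: Optional["CacheLevel"] = None  # next level toward main memory
    lines: dict = field(default_factory=dict)  # address -> cached value

    def read(self, address, main_memory):
        """Return (value, name of the level that supplied it)."""
        if address in self.lines:
            return self.lines[address], self.name
        # Miss: forward the request to the downstream level, or to main
        # memory when this is the last cache in the hierarchy.
        if self.downstream is not None:
            value, source = self.downstream.read(address, main_memory)
        else:
            value, source = main_memory[address], "main memory"
        self.lines[address] = value  # keep a copy for later requests
        return value, source


main_memory = {0x10: 42}                  # assumed stand-in for main memory
l2 = CacheLevel("L2")
l1 = CacheLevel("L1", downstream=l2)      # L1 is upstream of L2

first = l1.read(0x10, main_memory)        # misses in L1 and L2
second = l1.read(0x10, main_memory)       # now served by L1
l1.lines.clear()                          # pretend L1 lost the line
third = l1.read(0x10, main_memory)        # served by the downstream L2
```

In this sketch the first read is supplied by main memory, the second by L1, and the third by L2, which mirrors how a request travels downstream only as far as it needs to.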
A cache is generally smaller than its downstream caches. During normal operation, contents of a cache are evicted according to a replacement policy to free up space for newly fetched data. To increase the performance of a cache, it is important to retain data that are frequently accessed and to remove data that will not be required in the near future (e.g., data that are required only once). In some cases, conflicts over cache space are inevitable, as the data access pattern is mostly random. On the other hand, some classes of access patterns can trigger a high miss rate, depending on the cache size, the data size, and how often the data are reused.
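The dependence of miss rate on cache size and data size can be seen in a small simulation. The sketch below assumes a least-recently-used (LRU) replacement policy, which is only one of many possible policies and is not named in the text above; the function name simulate_lru and the block numbers are hypothetical.

```python
from collections import OrderedDict


def simulate_lru(accesses, capacity):
    """Return (hits, misses) for an assumed LRU cache of `capacity` blocks."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = misses = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used
            cache[block] = True
    return hits, misses


# A loop over 4 blocks fits in a 4-block cache; a loop over 5 does not.
fits = [0, 1, 2, 3] * 10
too_big = [0, 1, 2, 3, 4] * 10

fits_result = simulate_lru(fits, capacity=4)
too_big_result = simulate_lru(too_big, capacity=4)
```

With these inputs the first loop misses only on its 4 initial fetches and hits on the remaining 36 accesses, while the second loop misses on all 50 accesses, because under LRU each block is evicted just before it is needed again. Adding a single block to the working set is enough to move from a high hit rate to none.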
Streaming data refers to one or more chunks of related data that, when combined, are larger than the capacity of a cache storing a portion of the data. The chunks of data can be stored either contiguously or non-contiguously in a memory space. Streaming data can take the form of various data structures and can contain information for different types of content and applications. In most cases, these data are required only once and will be evicted without being reused. If this type of data is treated like any other data in a cache, it will cause the eviction of other important data that would otherwise have stayed in the cache. The eviction of frequently used data in favor of data that will not be reused is known as cache pollution. A programmer is unlikely to know the configurations of all the different caches in various computer systems at the time of writing a program, and hence it is impossible to tailor the program to each system configuration to prevent cache pollution.
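Cache pollution as described above can be reproduced in a small simulation. The sketch assumes an LRU replacement policy and an 8-block cache, neither of which comes from the text; the function name count_hot_hits, the block labels, and the trace sizes are hypothetical values chosen for illustration.

```python
from collections import OrderedDict


def count_hot_hits(accesses, capacity, hot_blocks):
    """Count cache hits on `hot_blocks` in an assumed LRU cache."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hot_hits = 0
    for block in accesses:
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
            if block in hot_blocks:
                hot_hits += 1
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used
            cache[block] = True
    return hot_hits


HOT = {"h0", "h1", "h2", "h3"}  # frequently reused data
ROUNDS = 5
STREAM_LEN = 8                  # streaming blocks touched once per round
CAPACITY = 8

hot_only = []      # trace that reuses only the hot blocks
with_stream = []   # same trace with use-once streaming blocks interleaved
next_stream_id = 0
for _ in range(ROUNDS):
    hot_round = sorted(HOT)
    hot_only += hot_round
    with_stream += hot_round
    for _ in range(STREAM_LEN):
        with_stream.append(f"s{next_stream_id}")  # never reused
        next_stream_id += 1

hits_without_stream = count_hot_hits(hot_only, CAPACITY, HOT)
hits_with_stream = count_hot_hits(with_stream, CAPACITY, HOT)
```

Without the stream, the hot blocks hit on every access after the first round, giving 16 hits. With the stream interleaved, each burst of use-once blocks pushes all four hot blocks out before they are reused, so the hot blocks get 0 hits even though they would fit in the cache with room to spare.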