Memory devices are addressed through address buses. An address bus can be as wide as the addresses used to locate data in the memory device. To reduce the total number of pins and to simplify the external bus of the memory device, however, the address bus is typically made narrower than the address size. For example, a memory device utilizing 32-bit addresses may have an address bus that is 13 bits wide. Accordingly, the narrow bus requires multiple address cycles to send the 32-bit address to the memory device (at least three cycles in this example, since two 13-bit cycles carry only 26 bits). Address patterns to memory devices typically exhibit locality. Because of this locality, subsequent memory accesses often target a region of memory that shares the same high-order address bits. As a result, redundant information may be sent, increasing latency and decreasing performance of memory devices.
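The cycle count and the redundancy described above can be illustrated with a minimal sketch. It assumes the address is sent most significant chunk first in fixed bus-width pieces, which the text does not specify; the function names and the sample addresses are hypothetical and chosen only for illustration.

```python
from typing import List


def address_cycles(address_bits: int, bus_bits: int) -> int:
    """Number of bus cycles needed to transfer one full address."""
    return -(-address_bits // bus_bits)  # ceiling division


def split_address(address: int, address_bits: int = 32, bus_bits: int = 13) -> List[int]:
    """Split an address into bus-width chunks, most significant chunk first.

    Assumption: fixed-size chunks, one per address cycle.
    """
    mask = (1 << bus_bits) - 1
    cycles = address_cycles(address_bits, bus_bits)
    return [(address >> (i * bus_bits)) & mask for i in reversed(range(cycles))]


def redundant_cycles(addresses: List[int], address_bits: int = 32, bus_bits: int = 13) -> int:
    """Count cycles that resend a chunk identical to the same chunk of the previous access."""
    redundant = 0
    previous = None
    for address in addresses:
        chunks = split_address(address, address_bits, bus_bits)
        if previous is not None:
            redundant += sum(1 for old, new in zip(previous, chunks) if old == new)
        previous = chunks
    return redundant


if __name__ == "__main__":
    # Hypothetical accesses that fall in the same region of memory.
    trace = [0x12345678, 0x1234567C, 0x12345700]
    print(address_cycles(32, 13))   # 3 cycles per 32-bit address
    print(redundant_cycles(trace))  # 4 of the 9 cycles repeat high-order chunks
```

In this hypothetical trace, only the lowest chunk changes between accesses, so two of the three cycles for each subsequent access carry information the memory device has already received.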
Some memory devices reduce the overhead associated with redundant information resulting from multiple address cycles by providing a buffer or cache that is faster than a primary memory array. Portions of the primary memory array are loaded into the cache, and subsequent memory accesses that hit a loaded portion are served from the faster cache. However, context switching among software threads results in flushing and reloading data that may already have been loaded. For example, a first software thread loads a first portion of memory into the cache. Due to context switching, a second software thread gains control. The second software thread accesses a second portion of the memory and, accordingly, the cache is flushed and loaded with the second portion. When the first software thread regains context, it needs to reload the first portion into the cache. Therefore, context switching increases latency and reduces performance of memory devices.
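The effect of context switching on such a cache can be sketched with a toy model. It assumes a cache that holds exactly one portion of the primary array at a time and is flushed and reloaded on every miss; the class name, the 1024-address portion size, and the access traces are hypothetical and are not taken from any particular device.

```python
from typing import Iterable, Tuple


class SingleRegionCache:
    """Toy cache holding one portion (region) of the primary array at a time."""

    def __init__(self, region_size: int) -> None:
        self.region_size = region_size
        self.loaded_region = None  # nothing loaded initially
        self.hits = 0
        self.reloads = 0

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, flush and load the new region."""
        region = address // self.region_size
        if region == self.loaded_region:
            self.hits += 1
            return True
        self.loaded_region = region  # flush the old portion, load the new one
        self.reloads += 1
        return False


def replay(trace: Iterable[int], region_size: int = 1024) -> Tuple[int, int]:
    """Replay a trace of addresses and return (hits, reloads)."""
    cache = SingleRegionCache(region_size)
    for address in trace:
        cache.access(address)
    return cache.hits, cache.reloads


if __name__ == "__main__":
    thread_a = [0, 4, 8, 12]              # first thread: first portion
    thread_b = [1024, 1028, 1032, 1036]   # second thread: second portion

    # No context switch in the middle of either thread's work.
    print(replay(thread_a + thread_b))    # (6, 2)

    # A context switch after every two accesses.
    interleaved = thread_a[:2] + thread_b[:2] + thread_a[2:] + thread_b[2:]
    print(replay(interleaved))            # (4, 4)
```

Under these assumptions, the same eight accesses cause twice as many reloads once the two threads alternate, because each thread finds its portion evicted when it regains context.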