A typical central processing unit (CPU) uses cache memory to reduce the average memory access time. A cache memory is a smaller, faster memory that is disposed between the CPU and main memory. The cache memory stores copies of data from frequently used main memory locations. When the CPU needs to access main memory, it first checks whether a copy of the required data is in the cache. If so, the CPU immediately reads from, or writes to, the cache. If not, the data is fetched from main memory and a copy is typically placed in the cache. This reduces average latency because cache memory access times are shorter than main memory access times.
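The lookup described above can be illustrated with a minimal sketch of a direct-mapped cache. The class name, the cache size, and the cycle counts below are hypothetical values chosen for illustration and are not taken from any particular CPU; a real cache also tracks data, valid bits, and write policy, which this toy model omits.

```python
# Toy direct-mapped cache. All sizes and latencies are assumed values.
CACHE_LINES = 4      # number of cache lines (assumption)
HIT_CYCLES = 1       # assumed cost of a cache access
MISS_CYCLES = 100    # assumed extra cost of going to main memory


class DirectMappedCache:
    def __init__(self, num_lines=CACHE_LINES):
        self.num_lines = num_lines
        self.tags = [None] * num_lines  # tag held by each line, None if empty
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return the cost in cycles of accessing `address`."""
        index = address % self.num_lines   # which line the address maps to
        tag = address // self.num_lines    # identifies the address within that line
        if self.tags[index] == tag:
            self.hits += 1
            return HIT_CYCLES
        # Miss: go to main memory and keep a copy in the cache.
        self.misses += 1
        self.tags[index] = tag
        return HIT_CYCLES + MISS_CYCLES


def average_access_time(cache, addresses):
    """Average cost in cycles over a sequence of accesses."""
    total = sum(cache.access(a) for a in addresses)
    return total / len(addresses)


cache = DirectMappedCache()
trace = [0, 1, 2, 3] * 5   # a loop that reuses four addresses
amat = average_access_time(cache, trace)
print(cache.hits, cache.misses, amat)
```

With these assumed numbers, only the first pass over the four addresses misses, so the average cost per access falls well below the assumed main memory cost of 100 cycles.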
Modern CPUs can have at least three independent caches: an instruction cache to speed up executable instruction fetches, a data cache to speed up data fetches and stores, and a translation lookaside buffer (TLB) used to speed up virtual-to-physical address translation for both executable instructions and data. Caches and TLBs can also be arranged in multiple levels.
Following a context switch, it is typically necessary to flush the contents of the instruction or data TLB. In systems running multiple operating systems, a complete TLB flush is performed when any guest or host operating system requires a TLB flush. This can occur fairly often as a result of typical virtual memory management, and it leads to increased latency because the discarded translations must be rebuilt from the page tables on subsequent accesses.
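The cost of a complete flush can be sketched with a toy TLB shared by two operating systems. Everything here is hypothetical: the `ToyTLB` class, the `guest_a` and `guest_b` identifiers, and the page table contents are illustrative assumptions, and the model counts misses rather than measuring real latency.

```python
class ToyTLB:
    """Toy TLB shared by several operating systems (illustrative only)."""

    def __init__(self):
        self.entries = {}   # (os_id, virtual_page) -> physical_frame
        self.misses = 0

    def translate(self, os_id, vpage, page_tables):
        key = (os_id, vpage)
        if key not in self.entries:
            # Miss: stands in for a page table walk.
            self.misses += 1
            self.entries[key] = page_tables[os_id][vpage]
        return self.entries[key]

    def flush_all(self):
        """Complete flush: discards the entries of every operating system."""
        self.entries.clear()


# Hypothetical page tables for two guests.
page_tables = {
    "guest_a": {0: 10, 1: 11},
    "guest_b": {0: 20, 1: 21},
}

tlb = ToyTLB()

# Both guests warm up the TLB.
for os_id in page_tables:
    for vpage in (0, 1):
        tlb.translate(os_id, vpage, page_tables)
warm_misses = tlb.misses

# guest_b requires a flush, and the whole TLB is flushed as a result.
tlb.flush_all()

# guest_a repeats accesses that would otherwise have hit.
for vpage in (0, 1):
    tlb.translate("guest_a", vpage, page_tables)
misses_after_flush = tlb.misses - warm_misses
print(warm_misses, misses_after_flush)
```

In this sketch, the flush requested on behalf of `guest_b` also forces `guest_a` to miss again on translations it had already cached, which is the source of the added latency described above.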