In a multi-processor computing system, each processor has its own cache to store a copy of data that is also stored in the system memory (i.e., the main memory). A cache is a smaller, faster memory than the system memory, and is generally located on the same chip as the processors. Caches enhance system performance by reducing off-chip memory accesses. Most processors have separate, independent caches for instructions and data. The data cache is usually organized as a hierarchy of multiple levels, with smaller and faster caches backed by larger and slower caches. In general, multi-level caches are accessed by checking the fastest, level-1 (L1) cache first; if there is a miss in L1, then the next fastest, level-2 (L2) cache is checked, and so on, before the external system memory is accessed.
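The multi-level lookup described above can be illustrated with a minimal sketch. All names here (`lookup`, the dictionaries standing in for cache levels) are hypothetical, chosen only to show the check-L1-then-L2-then-memory order; real caches operate on lines and tags rather than individual addresses.

```python
def lookup(address, l1, l2, memory):
    """Return (value, source), where source names the level that served the access.

    Caches are modeled as plain dicts mapping address -> value; this is a
    simplified sketch, not a model of real tag/index/line organization.
    """
    if address in l1:            # fastest: level-1 cache
        return l1[address], "L1"
    if address in l2:            # next fastest: level-2 cache
        value = l2[address]
        l1[address] = value      # fill L1 so a future access hits sooner
        return value, "L2"
    value = memory[address]      # slowest: off-chip system memory
    l2[address] = value          # fill both cache levels on the way back
    l1[address] = value
    return value, "memory"
```

For example, the first access to an address misses both caches and is served by memory; a repeated access to the same address then hits in L1.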
One of the commonly used cache writing policies is called the “write-back” policy. With the write-back policy, a processor writes a data item only to its local cache. The write to the system memory is postponed until the cache line containing the data item is about to be replaced by another cache line. Before the write-back operation, the cache content may be newer than, and therefore inconsistent with, the system memory content, which still holds the old data. To ensure that the system memory stores the most up-to-date data, the cache content may be flushed (i.e., written back) into the system memory. Cache flushing may occur when a block of data is requested by a direct-memory access (DMA) request, such as when a multimedia application running on a video processor needs to read the latest data from the system memory.
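The write-back behavior above can be sketched with a dirty-bit model: writes update only the local cache and mark the line dirty, and memory is updated only when the line is flushed. The class and method names here are hypothetical, used only to make the staleness-then-flush sequence concrete.

```python
class WriteBackCache:
    """Simplified sketch of a write-back cache with per-line dirty bits."""

    def __init__(self, memory):
        self.memory = memory      # backing system memory, modeled as addr -> value
        self.lines = {}           # addr -> (value, dirty)

    def write(self, addr, value):
        # Write only to the local cache and mark the line dirty;
        # the system memory is NOT updated yet (write-back policy).
        self.lines[addr] = (value, True)

    def read(self, addr):
        # Serve from the cache if present, else from memory.
        if addr in self.lines:
            return self.lines[addr][0]
        return self.memory[addr]

    def flush(self):
        # Write every dirty line back so the system memory holds the
        # most up-to-date data (e.g., before a DMA device reads it).
        for addr, (value, dirty) in self.lines.items():
            if dirty:
                self.memory[addr] = value
                self.lines[addr] = (value, False)
```

In this sketch, after a `write` the memory still holds the stale value until `flush` propagates the dirty line, which mirrors the inconsistency window the write-back policy creates.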
However, applications that need the memory data may be blocked until the cache flushing operation completes. Thus, the latency of cache flushing is critical to the user experience. Therefore, there is a need for improving the performance of cache flushing.