1. Field
The present disclosure relates generally to memory access in a computing system and, more specifically, to initiating CPU data prefetches and preprocessing by an external agent to increase the performance of memory-dependent operations.
2. Description
Central Processing Units (CPUs) typically implement prefetching in hardware to fetch data into the CPU caches before it is needed. This reduces the latency of a memory access when the program executing on the CPU actually requires the data: because of the prefetch, the data can be found in cache with a latency that is usually much smaller than the system memory access latency. Modern prefetching hardware tracks spatial and temporal patterns of memory accesses and issues anticipatory requests to system memory on behalf of the CPU. However, prefetching hardware associated with a CPU normally cannot be invoked by an external agent such as another CPU, a chipset, or an Input/Output (I/O) device. When an external agent has new data, a typical protocol requires that all CPU caches (except that of the CPU that has the new data, if the external agent is a CPU) invalidate their copies and read the new data from memory when they need it. In other words, whenever a CPU other than the external agent needs the new data, that CPU must read it from memory (or possibly from the external agent) and thus incurs a much higher latency than it would reading directly from its own cache. As a result, data processing may be slowed by memory access.
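The invalidate-then-reload behavior described above can be illustrated with a toy model. The following is a minimal sketch only, not a description of any actual coherence protocol or hardware: the `Cpu` class, the `external_write` helper, and both latency constants are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical latencies in cycles; illustrative values, not measurements.
CACHE_HIT_CYCLES = 4
MEMORY_ACCESS_CYCLES = 200


@dataclass
class Cpu:
    """Toy CPU with a private cache mapping address -> value."""
    name: str
    cache: dict = field(default_factory=dict)

    def read(self, memory, addr):
        """Return (value, latency). A miss fills the cache from memory."""
        if addr in self.cache:
            return self.cache[addr], CACHE_HIT_CYCLES
        value = memory[addr]
        self.cache[addr] = value
        return value, MEMORY_ACCESS_CYCLES


def external_write(memory, cpus, addr, value):
    """External agent writes new data; every listed CPU cache drops its copy."""
    memory[addr] = value
    for cpu in cpus:
        cpu.cache.pop(addr, None)


memory = {0x100: "old"}
cpu0 = Cpu("cpu0")

cpu0.read(memory, 0x100)                         # first read warms the cache
_, warm_latency = cpu0.read(memory, 0x100)       # cache hit

external_write(memory, [cpu0], 0x100, "new")     # invalidates cpu0's copy
value, first_latency = cpu0.read(memory, 0x100)  # miss: must go to memory
_, second_latency = cpu0.read(memory, 0x100)     # hit again after the refill

print(warm_latency, first_latency, second_latency, value)
```

In this model the first read after the external write pays the full memory latency, because the external agent has no way to place the new data in the CPU's cache ahead of time.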