In a multiprocessor design, a direct memory access (DMA) mechanism such as a DMA engine or DMA controller is used to move information from one type of memory to another type of memory that is not inclusive of the first memory type (for example, a cache), or from one memory location to another. In particular, a DMA mechanism moves information from a system memory to a local store of a processor. When a DMA controller moves information from the system memory to the local store, there can be a delay in fetching the information from the system memory and loading it into the local store, and the transfer can consume multiple processor cycles. The delay is an accumulation of many factors, including memory latencies and coherency actions within the multiprocessor system. Even in a single-processor system, a memory access can consume a large number of cycles. In a multiprocessor system with multiple types of memory and relatively large distances between some of the memories and the processors, the problem of a processor or DMA controller waiting for memory access is even worse.
A processor can be provided with a cache to help reduce the delay in its access to memory, thus improving the performance of software running on the processor. The processor may also provide instructions for managing the cache to further improve performance.
Therefore, a need exists in a multiprocessor system for software management of caches through the use of a direct memory access (DMA) mechanism, in order to reduce the latency of memory access on DMA transfers.