A data processor to which the present invention is applicable employs a two-level memory subsystem. The level one memories include an instruction cache (L1I) and a data cache (L1D). The level two memory contains directly addressable memory (SRAM), level two cache, or both. The level two SRAM can be cached within level one. Direct memory access (DMA) units can directly access the level two SRAM. Keeping CPU and DMA accesses to the level two memory coherent is important to the programmability of the device. Making this coherence efficient is important to the performance of the device.
L1D cache misses are asynchronous to DMA writes to L2 SRAM. In the prior art, on a DMA write the L2 controller sends snoop writes to L1D, forcing the DMA data into the L1D cache. There is a window between the time L2 updates its L1D cache shadow tag and the time L1D actually caches the data. While processing a DMA write, the L2 controller may therefore determine from its shadow tag that the line is cached in L1D and send snoop writes, while the fetch data from the L1D cache miss has still not landed in the L1D cache. This hazard can cause cache coherency problems.
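The hazard window described above can be illustrated with a small behavioral model. The following is a minimal sketch and not a description of any actual controller: the class names, the `run` function, the address `LINE` and the string data values are all hypothetical. It further assumes that a snoop write which finds no matching line in L1D is simply dropped, a behavior the text above does not specify.

```python
from dataclasses import dataclass, field

LINE = 0x100  # hypothetical address of one cache line in L2 SRAM


@dataclass
class L1D:
    lines: dict = field(default_factory=dict)  # address -> data actually cached

    def snoop_write(self, addr, data):
        # ASSUMPTION: a snoop write updates only a line already present in
        # L1D; otherwise it is dropped. Returns True when the line was hit.
        if addr in self.lines:
            self.lines[addr] = data
            return True
        return False

    def fill(self, addr, data):
        # Fetch data from an earlier miss lands in the cache.
        self.lines[addr] = data


@dataclass
class L2:
    sram: dict = field(default_factory=dict)       # address -> data in L2 SRAM
    shadow_tags: set = field(default_factory=set)  # lines L2 believes L1D holds

    def read_for_l1d_miss(self, addr):
        # The shadow tag is updated when the miss is serviced, before the
        # returned data has landed in L1D.
        self.shadow_tags.add(addr)
        return self.sram[addr]

    def dma_write(self, addr, data, l1d):
        self.sram[addr] = data
        if addr in self.shadow_tags:
            return l1d.snoop_write(addr, data)
        return None  # no snoop sent


def run(fill_lands_first):
    """Return (snoop_hit, L1D data, L2 data) for one ordering of events."""
    l1d, l2 = L1D(), L2(sram={LINE: "old"})
    in_flight = l2.read_for_l1d_miss(LINE)  # shadow tag set, data in flight
    if fill_lands_first:
        l1d.fill(LINE, in_flight)
    snoop_hit = l2.dma_write(LINE, "new", l1d)
    if not fill_lands_first:
        l1d.fill(LINE, in_flight)           # stale fill lands after the snoop
    return snoop_hit, l1d.lines[LINE], l2.sram[LINE]


if __name__ == "__main__":
    print(run(fill_lands_first=True))   # (True, 'new', 'new')
    print(run(fill_lands_first=False))  # (False, 'old', 'new')
```

Under these assumptions, when the DMA write falls inside the window the snoop write misses in L1D and the fill that lands afterwards carries the pre-DMA data, so L1D and L2 SRAM hold different values for the same line.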