A data processor to which the present invention is applicable employs a two-level memory subsystem. The level one memory includes an instruction cache (L1I) and a data cache (L1D). The level two memory contains directly addressable memory (SRAM), a level two cache, or both. The SRAM at level two can be cached within level one, and direct memory access (DMA) units can also access the SRAM at level two directly. As a result, the same level two address may be held in a level one cache while a DMA unit reads or writes it. Keeping central processing unit (CPU) and DMA data transfers to the level two memory coherent is therefore important to the programmability of the device, and maintaining that coherence efficiently is important to the performance of the device.
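
By way of illustration only, the coherence problem described above can be sketched in a toy software model. The sketch is not the mechanism of the invention. The class `ToyMemory`, its methods, the `coherent` flag, and the invalidate-on-DMA-write policy are all hypothetical names and assumptions introduced here, and word-granular caching stands in for real cache lines.

```python
class ToyMemory:
    """Toy model (assumption): L2 SRAM as a list, L1D as a dict of cached words."""

    def __init__(self, size, coherent):
        self.l2_sram = [0] * size   # directly addressable level two memory
        self.l1d = {}               # level one data cache: address -> cached value
        self.coherent = coherent    # hypothetical switch: invalidate L1D on DMA writes

    def cpu_read(self, addr):
        # On a miss, fill L1D from L2 SRAM; on a hit, return the cached copy.
        if addr not in self.l1d:
            self.l1d[addr] = self.l2_sram[addr]
        return self.l1d[addr]

    def dma_write(self, addr, value):
        # The DMA unit writes L2 SRAM directly, bypassing the CPU and L1D.
        self.l2_sram[addr] = value
        if self.coherent:
            # Assumed policy: drop any stale L1D copy of the written address.
            self.l1d.pop(addr, None)


def read_after_dma(coherent):
    """CPU caches an address, DMA overwrites it in L2, then the CPU reads it again."""
    mem = ToyMemory(size=16, coherent=coherent)
    mem.cpu_read(4)          # brings address 4 (value 0) into L1D
    mem.dma_write(4, 99)     # DMA updates L2 SRAM behind the cache
    return mem.cpu_read(4)
```

In this model, `read_after_dma(False)` returns the stale value 0 from L1D, while `read_after_dma(True)` returns 99, the value the DMA unit wrote. The extra work done on every DMA write in the coherent case is the kind of cost that makes efficiency a concern.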