1. Field of the Invention
The present application relates generally to an improved data processing apparatus and method, and more specifically to an apparatus and method for reducing runtime coherency checking with global data flow analysis.
2. Background of the Invention
In heterogeneous multi-core systems, reducing hardware complexity and minimizing power consumption are important design considerations. Providing each of the accelerator cores in such systems with its own fast local memory is one means of accomplishing these goals. Typically, such systems do not provide hardware-supported coherence between these local memories and the global system memory. When an application (both code and data) fits within the local memory, good performance can be guaranteed. Such a feature is critical for real-time applications. The Cell Broadband Engine Architecture (CBEA) is one example of such a heterogeneous multi-core system. On a single chip, the CBEA includes a PPE core and eight SPE cores, each with 256 KB of fast local memory, as well as a globally coherent direct memory access (DMA) engine for transferring data between the local memories and the shared system memory. Scratchpad memory in embedded computing systems is another example of this type of memory hierarchy. This memory design requires careful programming to use the fast local memory efficiently and to reduce long-latency accesses to the global memory so as to obtain top performance.