1. Field of the Invention
The present application relates generally to an improved data processing apparatus and method and more specifically to mechanisms for generating optimized code that targets a high locality software cache.
2. Background of the Invention
Multi-core systems are becoming more prevalent in today's computing environments. A multi-core system combines two or more independent cores, or processors, into a single package composed of either a single integrated circuit (IC) or multiple ICs packaged together. For example, a dual-core system contains two cores while a quad-core processor contains four cores. Cores in such a multi-core system may share a single coherent cache at the highest on-device cache level (e.g., an L2 cache for the Intel Core 2) or may have separate caches (e.g., the current AMD dual-core processors). The processors, or cores, also share the same interconnect to the rest of the system. Each core independently implements optimizations such as superscalar execution, pipelining, and multithreading. The most commercially significant multi-core processors are those used in personal computers and gaming consoles, e.g., the Cell Broadband Engine (CBE) available from International Business Machines Corporation of Armonk, N.Y., which is presently used in the PlayStation 3 gaming console available from Sony Corporation.
The amount of performance gained by the use of a multi-core system depends on the problems being solved and the algorithms used, as well as their implementation in software. For example, for some parallel problems, a dual-core processor with two cores running at 2 GHz may perform very nearly as fast as a single core running at 4 GHz. However, other problems may not yield as much of a speed-up from the use of multiple cores. Even if such a speed-up is not achieved, the system will typically perform multitasking more efficiently since it can run two or more programs at once, one on each core.
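The dependence of speed-up on the problem being solved can be illustrated with Amdahl's law. This law is not named in the present application and is introduced here only as an illustrative assumption; the function name `amdahl_speedup` and its parameters are hypothetical. The sketch below estimates the speed-up of a program in which a fraction `p` of the work can be spread evenly across `n` cores, ignoring clock frequency differences, memory contention, and synchronization costs.

```c
#include <assert.h>

/*
 * Illustrative sketch only (Amdahl's law, an assumption not taken from
 * the application): estimated speed-up over one core when a fraction p
 * of the work is perfectly parallel across n identical cores.
 */
static double amdahl_speedup(double p, unsigned n)
{
    assert(p >= 0.0 && p <= 1.0);
    assert(n > 0);
    /* Serial part (1 - p) is unchanged; parallel part p is divided by n. */
    return 1.0 / ((1.0 - p) + p / (double)n);
}
```

Under this simplified model, a fully parallel problem (`p` equal to 1) on two cores yields a speed-up of 2, which corresponds to the case in which two 2 GHz cores perform nearly as fast as a single 4 GHz core, while a problem that is only half parallel yields a speed-up of roughly 1.33.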
Difficulty of programming is one of the main impediments to the broad acceptance of multi-core systems. This is because present multi-core systems often do not have hardware support for transparent data transfer between local and global memories. To address this issue, software caches have been used as a robust approach to provide the user with a transparent view of the memory architecture. A software cache is a cache that is implemented and managed by software, typically in a core's local memory, rather than by dedicated cache hardware. While software caches allow local and global memories to be viewed together as a single memory device, software cache approaches can suffer from poor performance for a variety of reasons.
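By way of illustration only, the general idea of a software cache may be sketched as follows. This is a minimal, read-only, direct-mapped sketch under stated assumptions and is not the mechanism of the illustrative embodiments: the names `sw_cache`, `sw_cache_init`, and `sw_cache_read`, the line size, and the number of lines are all hypothetical, global memory is modeled as an ordinary byte array whose size is a multiple of the line size, and `memcpy` stands in for whatever explicit transfer (e.g., a DMA operation) a real system would use to move data from global to local memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES 128u  /* hypothetical cache line size in bytes */
#define NUM_LINES  64u   /* hypothetical number of lines held in local memory */

/* Cache state kept in local memory and managed entirely by software. */
typedef struct {
    size_t        tag[NUM_LINES];    /* global line index held by each slot */
    int           valid[NUM_LINES];  /* nonzero once a slot holds data */
    uint8_t       data[NUM_LINES][LINE_BYTES];
    unsigned long hits;
    unsigned long misses;
} sw_cache;

static void sw_cache_init(sw_cache *c)
{
    assert(c != NULL);
    memset(c, 0, sizeof *c);
}

/*
 * Read one byte at offset addr of global_mem through the cache.
 * The software performs the tag check that a hardware cache would
 * otherwise do transparently.
 */
static uint8_t sw_cache_read(sw_cache *c, const uint8_t *global_mem, size_t addr)
{
    assert(c != NULL && global_mem != NULL);

    size_t line   = addr / LINE_BYTES;  /* which global line is wanted */
    size_t slot   = line % NUM_LINES;   /* direct-mapped placement */
    size_t offset = addr % LINE_BYTES;  /* byte within the line */

    if (!c->valid[slot] || c->tag[slot] != line) {
        /* Miss: copy the whole line from global to local memory. */
        memcpy(c->data[slot], global_mem + line * LINE_BYTES, LINE_BYTES);
        c->tag[slot]   = line;
        c->valid[slot] = 1;
        c->misses++;
    } else {
        c->hits++;
    }
    return c->data[slot][offset];
}
```

In this sketch every access pays for the index computation and tag comparison in software, and two global lines that map to the same slot evict each other repeatedly. Both effects are examples of the overheads that can cause a software cache to perform poorly.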