The increasing gap between processor and main memory speeds has forced computer designers to exploit cache memories. A cache memory is usually smaller than the main memory and, if properly managed, can hold a major part of the working set of a program. Here, the working set is the set of instructions of the program that are currently being executed.
The goal of memory subsystem designers is to improve the average memory access time. Reducing the cache miss rate is one way to improve memory access performance. Cache misses occur for a number of reasons: cold start, lack of capacity, and collisions. A number of cache line replacement algorithms have been proposed to reduce the number of cache misses.
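The relationship between miss rate and average memory access time can be illustrated with the standard textbook formula, in which the average access time equals the hit time plus the miss rate multiplied by the miss penalty. The following is a minimal sketch only; the function name and the cycle counts are hypothetical values chosen for illustration and do not come from any measured system.

```python
def average_memory_access_time(hit_time, miss_rate, miss_penalty):
    """Return hit_time + miss_rate * miss_penalty.

    hit_time and miss_penalty must use the same unit (e.g. cycles);
    miss_rate is the fraction of accesses that miss, between 0 and 1.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Hypothetical numbers: 1-cycle hit, 100-cycle miss penalty.
baseline = average_memory_access_time(1.0, 0.05, 100.0)
improved = average_memory_access_time(1.0, 0.02, 100.0)
print(f"baseline: {baseline:.2f} cycles, improved: {improved:.2f} cycles")
```

Under these assumed numbers, lowering the miss rate from 5% to 2% roughly halves the average access time, which is why the miss rate is a natural target for optimization.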
Some prior art methods have concentrated on how the instructions of the program are laid out onto the addresses of the cache memory. For example, a dynamic remapping of cache addresses has been suggested to avoid conflicts in large direct-mapped caches. In an alternative approach, instructions are repositioned at compile-time or link-time. There, the idea is to place frequently executed portions of the program at adjacent addresses of the cache memory. This reduces the chance of cache conflicts while increasing spatial locality within the program.
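The effect of instruction placement on conflicts in a direct-mapped cache can be sketched with a small simulation. This is an illustrative model only, not any particular prior art method: the cache geometry, the procedure names "A" and "B", their sizes, and the helper functions count_misses and fetch_trace are all assumptions made for the example.

```python
def count_misses(trace, num_lines, line_size):
    """Count misses in a direct-mapped cache for a list of byte addresses."""
    resident = [None] * num_lines      # block number currently held by each line
    misses = 0
    for addr in trace:
        block = addr // line_size      # memory block containing this address
        index = block % num_lines      # the single line this block can occupy
        if resident[index] != block:
            resident[index] = block    # evict whatever was there
            misses += 1
    return misses


def fetch_trace(layout, call_sequence, proc_size, instr_size=4):
    """Build the instruction-fetch address trace for a sequence of calls.

    layout maps a procedure name to its (hypothetical) base address; each
    call fetches every instruction of the procedure once, in order.
    """
    trace = []
    for name in call_sequence:
        base = layout[name]
        trace.extend(range(base, base + proc_size, instr_size))
    return trace


# Assumed geometry: 64 lines of 32 bytes (2 KiB), 256-byte procedures.
NUM_LINES, LINE_SIZE, PROC_SIZE = 64, 32, 256
calls = ["A", "B"] * 100               # two hot procedures called alternately

# B placed exactly one cache size after A: both map to the same lines.
conflicting = {"A": 0, "B": NUM_LINES * LINE_SIZE}
# B placed immediately after A: the two occupy disjoint lines.
adjacent = {"A": 0, "B": PROC_SIZE}

conflict_misses = count_misses(
    fetch_trace(conflicting, calls, PROC_SIZE), NUM_LINES, LINE_SIZE)
adjacent_misses = count_misses(
    fetch_trace(adjacent, calls, PROC_SIZE), NUM_LINES, LINE_SIZE)
print(conflict_misses, adjacent_misses)
```

In the conflicting layout each call evicts the lines of the other procedure, so every call misses on all of its lines. In the adjacent layout only the initial cold-start misses remain.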
Code reordering algorithms for improved memory performance can span several different levels of granularity, from basic blocks to loops, procedures, and entire programs. It is desirable to reorder the instructions of programs so as to significantly improve a program's execution performance.