1. Technical Field
The present invention relates generally to information processing systems and, more specifically, to layout of data for an application program in order to optimize data cache performance.
2. Background Art
Most programming languages support the use of local procedure variables as well as global program variables. Moreover, several programming languages, such as the Java™ programming language, are object-oriented programming languages that support the notion of objects. These objects may contain one or more fields. Similarly, many programming languages, such as the C/C++™ languages, support the notion of structures that contain one or more fields. The fields of an object or structure may themselves include other objects or structures, respectively.
As a software application processes data, it often pulls data from a data cache. If the desired data is not present in the cache, then a time-consuming memory fetch is performed. For instance, if a local or global variable is not in the data cache when needed by an application, then the variable is fetched from memory. Of course, data cache performance is enhanced when a single data cache-line fetch pulls in multiple variables needed by the program, thereby decreasing the number of necessary cache fetches.
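The benefit of one cache-line fetch bringing in several needed variables can be illustrated with a minimal C sketch. The 64-byte line size, the structure names, and the field names below are assumptions chosen for illustration and do not come from the description above. The sketch also assumes that each structure starts on a cache-line boundary and that `int` occupies four bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache-line size in bytes; the real value is hardware-specific. */
#define CACHE_LINE_SIZE 64

/* Hypothetical record: three fields used together ("hot"),
   separated by rarely used bulk data ("cold"). */
struct scattered {
    int  hot_a;
    char cold_1[CACHE_LINE_SIZE];
    int  hot_b;
    char cold_2[CACHE_LINE_SIZE];
    int  hot_c;
};

/* Same fields, with the hot fields placed next to each other. */
struct grouped {
    int  hot_a;
    int  hot_b;
    int  hot_c;
    char cold_1[CACHE_LINE_SIZE];
    char cold_2[CACHE_LINE_SIZE];
};

/* Cache line holding a byte offset, assuming the structure itself
   begins on a cache-line boundary. */
static size_t line_of(size_t offset) {
    return offset / CACHE_LINE_SIZE;
}

/* Number of distinct cache lines covered by three ascending offsets. */
static size_t lines_touched(size_t a, size_t b, size_t c) {
    assert(a <= b && b <= c);
    size_t n = 1;
    if (line_of(b) != line_of(a)) n++;
    if (line_of(c) != line_of(b)) n++;
    return n;
}

/* Lines fetched to read all three hot fields of each layout. */
size_t scattered_lines(void) {
    return lines_touched(offsetof(struct scattered, hot_a),
                         offsetof(struct scattered, hot_b),
                         offsetof(struct scattered, hot_c));
}

size_t grouped_lines(void) {
    return lines_touched(offsetof(struct grouped, hot_a),
                         offsetof(struct grouped, hot_b),
                         offsetof(struct grouped, hot_c));
}
```

Under these assumptions, reading the three hot fields of the scattered layout spans three cache lines, while the grouped layout needs only one.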
Similarly, the performance of a system running an application that processes large amounts of data in objects or structures depends critically on the performance of its data cache. Many applications, such as database servers and compilers, process large volumes of data that are typically organized into many different types of records, including objects or data structures. Known efforts to improve the performance of, or “optimize”, the data cache nearly always focus on loop transformations that improve the performance of numerical or scientific code. Numerical code most often manipulates large arrays of data and thus has opportunities to benefit from temporal and spatial locality. Loop transformations use dependence analysis to increase data locality while preserving the application's program semantics.
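One familiar loop transformation of this kind is loop interchange, sketched below in C. The function names and the row-major matrix representation are hypothetical choices for illustration. The interchange is legal here because the only dependence is an integer sum, which gives the same result in either visiting order.

```c
#include <assert.h>
#include <stddef.h>

/* Sums a rows x cols matrix stored row-major in m, visiting it column
   by column. Consecutive accesses are cols elements apart, so each
   access may land on a different cache line. */
long sum_column_order(const int *m, size_t rows, size_t cols) {
    assert(m != NULL);
    long total = 0;
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            total += m[i * cols + j];
    return total;
}

/* Same computation after interchanging the two loops. Consecutive
   accesses are now adjacent in memory, so each fetched cache line is
   fully used before the next one is needed. */
long sum_row_order(const int *m, size_t rows, size_t cols) {
    assert(m != NULL);
    long total = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            total += m[i * cols + j];
    return total;
}
```

Both functions return the same sum for the same input; only the memory access pattern differs.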
However, these data cache optimization efforts are usually not effective for integer code, for instance, or for other code that includes a large number of hard-to-predict branches. Also, there are currently very few known techniques for improving the data locality of integer applications that make heavy use of pointers and structures. The few known techniques strive to align structure fields based on the fields' types, but do not choose a layout based on the application's temporal behavior.
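A type-based field ordering of the kind mentioned above might look like the following C sketch. The structure and field names are hypothetical, and exact sizes depend on the target ABI. Ordering fields by decreasing alignment typically removes padding, but it takes no account of which fields the program actually uses together.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record in its declared order. On common ABIs the compiler
   inserts padding after flag and after kind to satisfy alignment. */
struct declared_order {
    uint8_t  flag;
    uint64_t id;
    uint8_t  kind;
    uint32_t count;
};

/* Same fields ordered by decreasing type size. The order is driven
   purely by type, not by which fields are accessed close together
   in time. */
struct sorted_by_type {
    uint64_t id;
    uint32_t count;
    uint8_t  flag;
    uint8_t  kind;
};

/* Sorting by type never enlarges this particular record. */
static_assert(sizeof(struct sorted_by_type) <= sizeof(struct declared_order),
              "type-sorted layout should not be larger");

/* Bytes of padding removed by the type-based ordering on this target. */
size_t padding_saved(void) {
    return sizeof(struct declared_order) - sizeof(struct sorted_by_type);
}
```

If, say, `flag` and `id` were always read together, the type-sorted layout would be no better, and could be worse, at keeping them on one cache line. That is the gap the paragraph above points out.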
Embodiments of the method and apparatus disclosed herein address these and other concerns related to enhancing data layout for an application in order to improve data cache performance.