As programs, such as database applications, are required to process more and more data, there is an increasing need for those programs to execute faster. For example, database applications may run for days as they process data in a database. Decreasing the execution time of such applications by even a few percent is highly beneficial.
One technique for improving the performance of programs is to reorder the variables associated with the program to improve the pattern in which the variables are accessed. Typically, the source code files associated with a particular program have an area for code and another area for data, such as read-only variables (also known as “constants”) and writeable variables (hereinafter, the term “variables” shall be used to refer to read-only and/or writeable variables). The source code files are compiled to produce object files. Typically there is one object file per source code file. The object files are linked together to produce an executable. When the executable is loaded and executed, the variables of the program reside within a region of computer memory known as the “data segment”.
Frequently, the variables in the “data segment” are cached as the executable is executed to decrease the time it takes the executable to access the variables. Since cache memory is relatively small in comparison to the size of the “data segment”, choices have to be made as to which variables reside in the cache memory and for how long.
The order of the variables within the “data segment” impacts the utilization of cache memory. For example, if a frequently accessed variable X is next to an infrequently accessed variable Y in the “data segment”, the infrequently accessed variable Y may be loaded into the cache memory (an operation also known as a “fetch”) as a result of the frequently accessed variable X being loaded into the cache memory. First, loading Y as a part of loading X leaves less space in the cache memory for other variables that may be accessed more frequently than Y. Second, it may result in another frequently accessed variable Z being removed from the cache memory (also known as “evicting” Z). Third, if Z is needed again after it has been evicted, the access results in a “cache miss” and Z has to be fetched into the cache memory again. The increase in fetches and cache misses, due to poor variable layout in the “data segment”, increases the execution time of the program.
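The effect of variable order on cache misses can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not a model of any particular processor: the 64-byte line size, the 32-byte variable size, the single-line cache with least-recently-used eviction, the extra cold variable W, and the names `assign_lines` and `count_misses` are all hypothetical and chosen only to keep the example small.

```python
from collections import OrderedDict

LINE_SIZE = 64   # assumed bytes per cache line
VAR_SIZE = 32    # assumed bytes per variable, so two neighbors share a line


def assign_lines(layout):
    """Map each variable to the index of the cache line holding its address.

    Variables are placed back to back in the order given by `layout`,
    mimicking their order in the data segment.
    """
    return {name: (i * VAR_SIZE) // LINE_SIZE for i, name in enumerate(layout)}


def count_misses(layout, trace, cache_lines=1):
    """Count cache misses for an access trace under a given variable layout.

    The cache holds `cache_lines` lines and evicts the least recently
    used line when it is full (an assumed, simplified policy).
    """
    line_of = assign_lines(layout)
    cache = OrderedDict()  # keys are cached line indices, oldest first
    misses = 0
    for name in trace:
        line = line_of[name]
        if line in cache:
            cache.move_to_end(line)        # hit: mark line as recently used
        else:
            misses += 1                    # miss: the whole line is fetched
            cache[line] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses


# X and Z are frequently accessed; Y and W are rarely accessed.
TRACE = ["X", "Z"] * 50 + ["Y"]

POOR_LAYOUT = ["X", "Y", "Z", "W"]  # each hot variable shares a line with a cold one
GOOD_LAYOUT = ["X", "Z", "Y", "W"]  # the hot variables share one line

if __name__ == "__main__":
    print("poor layout misses:", count_misses(POOR_LAYOUT, TRACE))
    print("good layout misses:", count_misses(GOOD_LAYOUT, TRACE))
```

Under these assumptions, the poor layout misses on every one of the 101 accesses, because fetching the line for X evicts the line for Z and vice versa, while the layout that places X and Z together misses only twice. Real caches are larger and set-associative, so the numbers are illustrative only.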
Most compilers do not reorder variables. Instead, the variables are laid out in the order in which the linker receives them. To date, reordering of variables has been done only in the research community, for scientific and/or numerical programs, where researchers analyze the source code to determine a better way of ordering the variables. However, it is impossible for people to understand the millions or billions of lines of code in large programs. Therefore, only small portions of such programs can be optimized. Moreover, the result of human analysis is frequently faulty. For example, a person may think, based on the portion of code they were able to comprehend, that variable C should be placed after variables A and B, when in reality it would be better to place variable D after A and B. Furthermore, the resulting poor variable layout increases the size of the executable and the amount of memory used while running the executable. Because human involvement has been needed, commercial applications, such as database applications, have not been able to utilize this technique.
For these and other reasons, a need exists for providing automatic reordering of variables. A further need exists for providing comprehensive reordering of variables. A further need exists for providing automatic reordering of variables in a manner that can be used in commercial applications.