1. Technical Field
The present invention relates to computer systems, and more particularly to a compiler system and method of local data alignment for stack memory for optimizing execution performance.
2. Prior Art
Modern data processing systems often utilize one or more so-called "stack" memories as temporary storage for return addresses, application parameters, local variables and other data which may be utilized during a data processing procedure. Stack memories are utilized in a last-in, first-out (LIFO) manner, and may be referenced either explicitly or implicitly by operating system or application procedures. Typically, an application within a data processing system places any required parameters on the stack and then invokes a procedure, which stores a return address on the stack. Next, local variables defined by specific work routines within the procedure are allocated on the stack. Thereafter, data required by the procedure may be placed on the stack and retrieved during selected operations.
Data placed within the stack memory in a state-of-the-art data processing system is generally retrieved ("fetched") utilizing multi-byte data fetch operations. Most modern microprocessors, such as the Intel Pentium (TM) processors, utilize a fetch operation which retrieves four bytes of data aligned on a 4-byte boundary, i.e., starting at an address equal to 0 mod 4. For floating point operations, modern microprocessors often utilize a fetch operation which retrieves eight bytes of data aligned on an 8-byte boundary, i.e., starting at an address equal to 0 mod 8. Eight bytes of data are typically used for a "double precision" floating point word. Similarly, modern processors also store data in memory utilizing multi-byte operations.
Those skilled in the art will appreciate that a single byte of data can always be retrieved from memory with a four-byte data fetch instruction, in which the data byte of interest is retained and the other three bytes are ignored. However, when four consecutive bytes of data are required, it is not always possible to retrieve them with a single multi-byte data fetch, because the four bytes may straddle a 4-byte boundary. Similarly, when eight consecutive bytes of data are required for a double precision floating point fetch, the processor may not be able to retrieve them with a single multi-byte data fetch if the bytes are not aligned in the stack memory.
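By way of illustration only, the boundary arithmetic described above may be sketched in C as follows. The function names is_aligned and fetches_needed are hypothetical and are introduced solely for this example. The sketch assumes that every fetch retrieves a block of fetch_width bytes beginning at an address that is a multiple of fetch_width, and that size is at least one; it counts the aligned fetches needed to cover a datum and does not model the cycle penalty of any particular processor.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if addr is 0 mod boundary (boundary must be nonzero). */
int is_aligned(uintptr_t addr, size_t boundary)
{
    return addr % boundary == 0;
}

/*
 * Counts the aligned fetches of fetch_width bytes needed to cover
 * `size` consecutive bytes starting at `addr`.
 * Assumes size >= 1 and fetch_width >= 1 (illustrative model only).
 */
size_t fetches_needed(uintptr_t addr, size_t size, size_t fetch_width)
{
    uintptr_t first_block = addr / fetch_width;              /* block holding the first byte */
    uintptr_t last_block = (addr + size - 1) / fetch_width;  /* block holding the last byte  */
    return (size_t)(last_block - first_block + 1);
}
```

Under these assumptions, a four-byte datum at address 0x1000 is covered by one four-byte fetch, whereas the same datum at address 0x1002 straddles a 4-byte boundary and requires two. Likewise, an eight-byte datum at address 0x1004 is aligned to 0 mod 4 but not to 0 mod 8, and requires two eight-byte fetches.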
Data which is not aligned on the proper boundary for the particular microprocessor on which the code is running will result in extra instruction cycles being executed to access the misaligned data. The penalty for misaligned data will vary depending on the type of processor, but it will be understood that the penalty in execution makes data alignment an important consideration in compiler design and performance.
As will be understood by those skilled in the art, the proper alignment of file scope data, i.e., global data, is relatively straightforward since the global data is statically allocated. However, it is much more difficult to achieve alignment on the stack for function arguments and for variables locally scoped to a function when these comprise data types with alignment requirements stricter than the default boundary for stack alignment.
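The difficulty may be illustrated with a minimal sketch in C. The function names align_down and alloc_aligned_local are hypothetical, and the sketch assumes a stack that grows toward lower addresses and an alignment that is a power of two. Neither assumption is drawn from the foregoing description, and the sketch is not a statement of how any particular compiler allocates locals.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Rounds addr down to the nearest multiple of alignment.
 * Assumes alignment is a power of two (illustrative only).
 */
uintptr_t align_down(uintptr_t addr, size_t alignment)
{
    return addr & ~((uintptr_t)alignment - 1);
}

/*
 * Reserves `size` bytes below the stack pointer `sp` on a
 * downward-growing stack and returns an address for the new
 * local that satisfies `alignment`.
 */
uintptr_t alloc_aligned_local(uintptr_t sp, size_t size, size_t alignment)
{
    return align_down(sp - size, alignment);
}
```

With a stack pointer of 0x7ffc, which is 0 mod 4 but not 0 mod 8, alloc_aligned_local(0x7ffc, 8, 8) yields 0x7ff0, so that four bytes of padding are left above the eight-byte local. With a stack pointer of 0x8000 the same call yields 0x7ff8 and no padding is needed. Because the padding depends on the value of the stack pointer at run time, an offset fixed at compile time cannot by itself guarantee the stricter alignment, whereas the address of a statically allocated global is fixed before execution begins.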
In the art, attempts have been made to solve the problem of aligning local variables by adjusting the amount of gross stack space allocated. However, this approach does not address the problem of aligning parameters in the parameter list of the function being called or the problem of aligning local variables with special alignment requirements.
Furthermore, the number of instructions required to maintain alignment of local data on the stack should be minimized. In other words, the cost of keeping function arguments and local variables aligned should be less than the loss in performance resulting from misaligned data.
Accordingly, there remains a need for a technique which provides optimal alignment for at least one selected argument having an alignment requirement that is stricter than the default alignment for the stack memory as provided by the operating system. Furthermore, such a technique should also accommodate functions that can accept a variable number of arguments.