Field of the Invention
The present invention generally relates to computer processing, and, more specifically, to a demand-driven algorithm to reduce sign-extension instructions included in loops of a 64-bit computer program.
Description of the Related Art
Developers use compilers to generate executable programs from high-level source code. Typically, a compiler is configured to receive high-level source code of a program (e.g., written in C++ or Java), determine a target hardware platform on which the program will execute (e.g., an x86 processor), and then translate the high-level source code into assembly-level code that can be executed on the target hardware platform. This configuration provides the benefit of enabling the developers to write a single high-level source code program and then target that program for execution across a variety of hardware platforms, such as mobile devices, personal computers, or servers.
In general, a compiler includes three components: a front-end, a middle-end, and a back-end. The front-end is configured to ensure that the high-level source code satisfies programming language syntax and semantics, whereupon the front-end generates a first intermediate representation (IR) of the high-level source code. The middle-end is configured to receive and optimize the first IR, which usually involves, for example, removing unreachable code, if any, included in the first IR. After optimizing the first IR, the middle-end generates a second IR for the back-end to process. In particular, the back-end receives the second IR and translates the second IR into assembly-level code. The assembly-level code includes low-level assembly instructions that are directly executable on a processor that is part of the target hardware platform.
In some cases, programs execute in “64-bit mode,” where base memory addresses are 64-bit values (e.g., int64 variables) and memory offsets are 32-bit values (e.g., int32 variables). As a result, a typical address computation, e.g., computing the address of a particular element of an array from its index, requires adding a 32-bit memory offset to a 64-bit base address. For the processor to perform this addition, the processor must first convert the 32-bit memory offset to a 64-bit memory offset so that the bit width of the memory offset matches the bit width of the 64-bit base address. Such conversion is referred to herein as “sign-extension,” which, in particular, involves increasing the number of bits of a binary number while preserving the number's sign (i.e., positive/negative) and value.
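By way of illustration only, the address computation described above can be sketched in C. The function names `widen_offset`, `zero_extend_offset`, and `element_address` are hypothetical and are introduced solely for this example; the zero-extension variant is shown only to contrast with sign-extension, which preserves the sign and value of a negative offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend a 32-bit offset to 64 bits.
 * The upper 32 bits are filled with copies of the sign bit, so the
 * numeric value is unchanged. */
static int64_t widen_offset(int32_t offset) {
    int64_t wide = (int64_t)offset;
    assert((int32_t)wide == offset);  /* value is preserved */
    return wide;
}

/* For contrast only: zero-extension fills the upper bits with zeros,
 * which changes the value of a negative offset. */
static int64_t zero_extend_offset(int32_t offset) {
    return (int64_t)(uint32_t)offset;
}

/* Hypothetical helper: form the address of an array element from a
 * (conceptually 64-bit) base address and a 32-bit index. The index is
 * sign-extended before it is added to the base. */
static const int32_t *element_address(const int32_t *base, int32_t index) {
    return base + widen_offset(index);
}
```

In this sketch, a negative index such as -1 remains -1 after `widen_offset`, so `element_address` correctly refers to the element preceding `base`, whereas zero-extending the same index would yield a large positive offset.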
Although a sign-extension operation is not, by itself, a substantially expensive operation, a sign-extension operation included in a loop inhibits an important loop optimization known as “loop strength reduction.” Notably, nearly all code that executes in “64-bit mode” includes a considerable number of loops, and many of these loops include sign-extension instructions. One technique for eliminating a sign-extension of a 32-bit variable to a 64-bit variable within a loop involves promoting the variable to a 64-bit variable outside the loop by performing a single sign-extension in a preheader of the loop and replacing all 32-bit operations on the original 32-bit variable with 64-bit operations on the promoted variable. This transformation, however, relies on the important assumption that none of the original 32-bit operations causes integer overflow. This assumption is valid for common programming languages such as C and C++, in which overflow of signed arithmetic operations results in undefined behavior, so that a compiler may assume a valid program does not rely on such overflow. On architectures where 64-bit registers and operations incur no extra cost, this optimization is always beneficial. However, on architectures where 64-bit registers and operations require additional resources, the optimization should be applied selectively on the basis of a cost-benefit analysis, since the use of 64-bit registers and operations can increase register pressure and consume more cycles.
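By way of illustration only, the promotion technique described above can be sketched at the source level in C, although in practice a compiler would perform the transformation on an intermediate representation. The function names `sum_before` and `sum_after` are hypothetical, and the sketch assumes, as stated above, that no 32-bit operation on the induction variable overflows.

```c
#include <assert.h>
#include <stdint.h>

/* Before the transformation: the 32-bit induction variable i is
 * sign-extended to 64 bits on every iteration in order to form the
 * address of a[i]. */
static int64_t sum_before(const int32_t *a, int32_t start, int32_t n) {
    assert(a != 0);
    int64_t total = 0;
    for (int32_t i = start; i < n; ++i) {
        int64_t wide_i = (int64_t)i;  /* sign-extension inside the loop */
        total += a[wide_i];
    }
    return total;
}

/* After the transformation: the sign-extensions are performed once,
 * in the loop preheader, and the loop operates only on the promoted
 * 64-bit variables. This is equivalent to sum_before only under the
 * assumption that the 32-bit increment never overflows. */
static int64_t sum_after(const int32_t *a, int32_t start, int32_t n) {
    assert(a != 0);
    int64_t wide_i = (int64_t)start;  /* preheader: extend once */
    int64_t wide_n = (int64_t)n;      /* preheader: extend the bound once */
    int64_t total = 0;
    for (; wide_i < wide_n; ++wide_i) {  /* 64-bit compare and increment */
        total += a[wide_i];
    }
    return total;
}
```

Because the loop body in `sum_after` contains no sign-extension, the address of `a[wide_i]` is a simple function of a 64-bit induction variable, which is the form that loop strength reduction can typically rewrite as a pointer increment. The sketch also makes the cost visible: `wide_i` and `wide_n` each occupy a 64-bit register for the duration of the loop.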
Accordingly, what is needed in the art is a technique for identifying, using cost-benefit analysis, sign-extension instructions for elimination, and a method for performing the corresponding transformation on a low-level intermediate representation (IR) of a program.