1. Field of the Invention
This invention relates to computer software compilers, and more specifically to a computer software compiler representing aliases and indirect memory operations in static single assignment form.
2. Related Art
Static Single Assignment (SSA) form is a well-known, popular and efficient representation used by compilers for performing analyses and optimizations involving scalar variables. Effective algorithms based on SSA have been developed to perform constant propagation, redundant computation detection, dead code elimination, induction variable recognition, and others.
Contemporary compilers, however, only use SSA for direct memory operations, i.e., accesses to scalar variables, in a program. When applied to indirect memory operations, including the use of arrays and accesses to memory locations through pointers, the representation is not straightforward, and results in added complexities in the optimization algorithms that operate on the representation. This has prevented SSA from being widely used in production compilers.
To better understand the invention, it is useful to describe some additional terminology relating to SSA. During compilation, a source program is translated to an intermediate representation in which expressions are represented in tree form. The expression trees are associated with statements that use their computed results.
In SSA form, each definition of a variable is given a unique version, and different versions of the same variable can be regarded as different program variables. Each use of a variable version can only refer to a single reaching definition. When several definitions of a variable, a.sub.1, a.sub.2, . . . , a.sub.m, reach a merging node in the control flow graph of the program, a .phi. function assignment statement, a.sub.n =.phi.(a.sub.1, a.sub.2, . . . , a.sub.m), is inserted to merge them into the definition of a new variable version a.sub.n. Thus, the semantics of single reaching definitions is maintained. This introduction of a new variable version as the result of .phi. factors use-def edges over merging nodes, thereby reducing the total number of use-def edges required. As a result, the use-def chain of each variable can be provided in a compact form by trivially allowing each variable to point to its single definition.
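The versioning and merging just described can be sketched for the simplest merge, an if/else diamond. This is only an illustrative sketch, not the compiler's algorithm: the function name `ssa_diamond` and the encoding of each block as a list of assigned variable names are assumptions made for the example, and general control flow graphs need a placement algorithm based on dominance frontiers.

```python
def ssa_diamond(entry, then_branch, else_branch):
    """Rename assignments in an if/else diamond and insert phi at the merge.

    Each argument is a list of variable names; every element stands for one
    assignment to that variable (hypothetical encoding for this sketch).
    Returns the renamed definitions per block and the phi statements.
    """
    counter = {}  # variable -> last version number handed out

    def define(var):
        # Every definition gets a fresh, unique version.
        counter[var] = counter.get(var, 0) + 1
        return f"{var}{counter[var]}"

    def rename(block, current):
        current = dict(current)  # versions reaching the end of this block
        out = []
        for var in block:
            current[var] = define(var)
            out.append(current[var])
        return out, current

    entry_out, at_entry = rename(entry, {})
    then_out, at_then = rename(then_branch, at_entry)
    else_out, at_else = rename(else_branch, at_entry)

    phis = []
    for var in sorted(set(at_then) | set(at_else)):
        a, b = at_then.get(var), at_else.get(var)
        # A phi is needed only when two different versions reach the merge.
        if a is not None and b is not None and a != b:
            phis.append(f"{define(var)} = phi({a}, {b})")
    return entry_out, then_out, else_out, phis
```

With `a` assigned at entry and again in both branches, the call `ssa_diamond(["a"], ["a"], ["a"])` yields versions `a1`, `a2`, `a3` and the single merge statement `a4 = phi(a2, a3)`, so every later use of `a` refers to exactly one definition.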
One important property in SSA form is that each definition must dominate all its uses in the control flow graph of the program. Another important property in SSA form is that identical versions of the same variable must have the same value.
Aliasing of a scalar variable occurs under one of five conditions: when its storage location partially overlaps another variable, when it is pointed to by a pointer used in indirect memory operations, when its address is passed in a procedure call, when it is a non-local variable that can be accessed from another procedure in a call or return, or when exceptions are raised. Techniques have been developed to analyze pointers both intra-procedurally and inter-procedurally to provide more accurate information on what is affected by them so as to limit their ill effects on program optimizations.
To characterize the effects of aliasing, SSA distinguishes between two types of definitions of a variable: MustDef and MayDef. Because a MustDef must define the variable, it blocks the references of previous definitions from that point on. A MayDef only potentially redefines the variable, and so does not prevent definitions of the same variable from being referenced later in the program. On the use side, in addition to real uses of the variable, there are places in the program where there are potential references to the variable that need to be taken into account in analyzing the program. We call these potential references MayUse.
To accommodate the MayDefs, SSA edges for the same variable are factored over its MayDefs. This is sometimes referred to as location-factored SSA form. This effect is modeled by introducing the .chi. assignment operator in the SSA representation. .chi. links up the use-def edges through MayDefs. The operand of .chi. is the last version of the variable, and its result is the version after the potential definition. Thus, if variable i may be modified, the code is annotated with i.sub.2 =.chi.(i.sub.1), where i.sub.1 is the current version of the variable.
To model MayUses, the .mu. operator is introduced in our SSA representation. .mu. takes as its operand the version of the variable that may be referenced, and produces no result. Thus, if variable i may be referenced, the code is annotated with .mu.(i.sub.1), where i.sub.1 is the current version of the variable.
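How the two operators interact with version numbers can be sketched over a straight-line sequence of events. This is a minimal illustration under stated assumptions: the function `annotate`, the event kinds `"use"`, `"mayuse"`, `"mustdef"` and `"maydef"`, and the convention that every variable enters at version 1 are all hypothetical choices for the example.

```python
def annotate(events):
    """Annotate a straight-line event list with SSA versions, chi and mu.

    events is a list of (kind, variable) pairs (hypothetical encoding).
    A MustDef and a MayDef both create a new version; only the MayDef
    keeps the previous version alive as the operand of chi.
    """
    version = {}  # variable -> current version number
    out = []
    for kind, var in events:
        cur = version.setdefault(var, 1)  # assumed entry version is 1
        if kind == "use":
            out.append(f"use {var}{cur}")
        elif kind == "mayuse":
            # mu takes the current version and produces no result.
            out.append(f"mu({var}{cur})")
        elif kind == "mustdef":
            version[var] = cur + 1
            out.append(f"{var}{cur + 1} = ...")
        elif kind == "maydef":
            # chi links the new version to the one it may overwrite.
            version[var] = cur + 1
            out.append(f"{var}{cur + 1} = chi({var}{cur})")
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return out
```

For the sequence MayDef, MayUse, MustDef, use on variable `i`, the sketch produces `i2 = chi(i1)`, `mu(i2)`, `i3 = ...`, `use i3`. The operand of the chi keeps `i1` referenced, whereas the MustDef simply replaces `i2`.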
In the compiler's internal representation, expressions cannot have side effects. Memory locations can only be modified by statements, which include indirect and direct store statements and calls. Thus, .chi. can only be associated with store and call statements. .mu. is associated with any dereferencing operation, like the unary operator * in C, which can happen within an expression. Thus, .mu. arises at both statements and expressions. Return statements can also be marked with .mu. for non-local variables to represent their liveness at function exits. Separating MayDef and MayUse allows precise modeling of the effects of calls. For example, a call that only references a variable will only cause a .mu. but no .chi.. The .mu. takes effect just before the call, and the .chi. takes effect right after the call.
The following example illustrates the use of .mu. and .chi., together with .phi., in SSA representation. In the example, function func uses but does not modify variable i. ##STR1##
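The ordering of annotations around a call can be sketched as follows. The function `annotate_call` and its `ref` and `mod` arguments are assumptions for illustration: they stand for the sets of variables that some alias or side-effect analysis reports the callee may reference and may modify, and are not names taken from the compiler described here.

```python
def annotate_call(callee, ref, mod, version):
    """Emit mu operators before a call and chi operators after it.

    ref / mod: sets of variables the callee may reference / may modify
    (hypothetical inputs from an alias analysis).
    version: dict mapping each variable to its current version number;
    it is updated in place for every variable that receives a chi.
    """
    # MayUses take effect just before the call and create no new version.
    before = [f"mu({v}{version.get(v, 1)})" for v in sorted(ref)]

    # MayDefs take effect right after the call and create a new version.
    after = []
    for v in sorted(mod):
        old = version.get(v, 1)
        version[v] = old + 1
        after.append(f"{v}{old + 1} = chi({v}{old})")

    return before + [f"call {callee}()"] + after
```

A callee that only reads `i`, as in `annotate_call("func", {"i"}, set(), {"i": 1})`, yields `mu(i1)` followed by the call and no chi, so the version of `i` is unchanged afterwards. A callee that may also write `i` would add `i2 = chi(i1)` after the call.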