Solutions to information problems are needed in most optimizing and parallelizing compilers and software development environments. Compiler optimization problems are typically formulated as data flow frameworks, in which the solution of a given problem at a given program point is related to the solution at other points (Rosen, B. K. JACM 26(2):322-344 (April 1979); Tarjan, R. Journal of the Association for Computing Machinery 28(3):594-614 (1981)). The quality and speed of evaluating these frameworks are well understood, and data flow methods are understandably prevalent in most optimizing compilers. Unfortunately, the propagation methods commonly used in data flow evaluation are unduly inefficient in time, space, or both.
Static Single Assignment (SSA) Form has recently yielded more efficient and powerful solutions for some data flow problems (Cytron et al., Sixteenth Annual ACM Symposium on Principles of Programming Languages:25-35 (January 1989), which is hereby incorporated by reference herein in its entirety). Problems characteristically solved in this way include constant propagation (Wegman et al., Conf. Rec. Twelfth ACM Symposium on Principles of Programming Languages:291-299 (January 1985)), global value numbering (Bowen et al., Fifteenth ACM Principles of Programming Languages Symposium:1-11 San Diego, Calif., (January 1988)), and invariance detection (Cytron et al., Conf. Rec. of the ACM Symp. on Principles of Compiler Construction (1986)). Once programs are cast into SSA form, some data flow solutions for these problems have the following advantages: (1) information is combined as early as possible, (2) information is forwarded directly to where it is needed, and (3) useless information is not represented. These advantages follow from the way definitions are connected to uses in a program. The extant SSA-based data flow solutions essentially use SSA form as a sparse evaluation graph that embodies this connection. Unfortunately, SSA form is not sufficiently general to afford an efficient solution for problems that are not based on this connection, such as Live Variables.
Previous methods for solving data flow problems fall into one of two categories. Traditional bit-vectoring methods propagate the solution at a given node to the control flow graph successors or predecessors of that node (Kildall, G., Conference Record of First ACM Symposium on Principles of Programming Languages:194-206 (January 1973)). Compiler writers generally acknowledge that bit vectors consume excessive space. Moreover, propagation occurs throughout the graph, sometimes in regions that neither affect nor need the global solution.
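The bit-vectoring approach described above can be illustrated with a minimal sketch of an iterative Live Variables solver. This is not the method of any cited reference; the function name `live_variables`, the three-node graph, and the encoding of variables as bits of a Python integer are assumptions made only for illustration.

```python
def live_variables(succ, use, defs):
    """Iterate live-in/live-out bit vectors to a fixed point.

    succ: node -> list of control flow graph successors
    use:  node -> bitmask of variables read before being written in the node
    defs: node -> bitmask of variables written in the node
    (All three mappings are hypothetical inputs for this sketch.)
    """
    live_in = {n: 0 for n in succ}
    live_out = {n: 0 for n in succ}
    changed = True
    while changed:
        changed = False
        for n in succ:
            # Meet (bitwise OR) over the solutions at all successors.
            out = 0
            for s in succ[n]:
                out |= live_in[s]
            # Transfer function: uses, plus whatever passes through unkilled.
            new_in = use[n] | (out & ~defs[n])
            if out != live_out[n] or new_in != live_in[n]:
                live_out[n], live_in[n] = out, new_in
                changed = True
    return live_in, live_out


# Hypothetical program: entry defines x and y, loop increments x, exit reads y.
X, Y = 0b01, 0b10
succ = {"entry": ["loop"], "loop": ["loop", "exit"], "exit": []}
use = {"entry": 0, "loop": X, "exit": Y}
defs = {"entry": X | Y, "loop": X, "exit": 0}
live_in, live_out = live_variables(succ, use, defs)
```

In this sketch every node carries a full vector, and the bit for `y` is carried through the `loop` node even though that node neither reads nor writes `y`, which is the kind of unnecessary propagation noted above.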
The other prevalent solution method uses direct connections that shorten the propagation distance between nodes that generate and nodes that use data flow information. Such solutions are typically based on def-use chains (Aho et al., Compilers: Principles, Techniques, and Tools Addison-Wesley (1986)). Def-use chains omit from the flow graph those nodes that need not participate in the evaluation. Unfortunately, direct connections often require combining the same information at each use of a particular variable, rather than just once. In the worst case, a quadratic number of such "meets" can occur where a linear number suffices. Experiments show that such quadratic behavior is noticeable especially for arrays, aliased variables, and variables modified at procedure call sites. Once established, direct connections allow propagation directly from sites that generate information to sites that use information. Although information does not propagate unnecessarily through the graph, the same information may be combined many times where combining it earlier would be cheaper.
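The difference between a quadratic and a linear number of meets can be sketched as follows. The sketch assumes a hypothetical situation in which every definition of a variable reaches every use through a common merge point, and it uses a simplified constant-propagation meet. The names `CountingMeet`, `meet_at_each_use`, and `meet_once_at_join` are illustrative only and do not come from the cited references.

```python
from functools import reduce


class CountingMeet:
    """Simplified constant-propagation meet that counts its invocations."""

    def __init__(self):
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        # Equal values stay constant; anything else falls to "not-constant".
        return a if a == b else "not-constant"


def meet_at_each_use(def_values, num_uses, meet):
    # Def-use chains: every use combines all reaching definitions itself.
    return [reduce(meet, def_values) for _ in range(num_uses)]


def meet_once_at_join(def_values, num_uses, meet):
    # Early combining: merge once at the join, then forward the result.
    merged = reduce(meet, def_values)
    return [merged] * num_uses


# Hypothetical case: four definitions of the same constant reach four uses.
def_values = [5, 5, 5, 5]
per_use, joined = CountingMeet(), CountingMeet()
result_per_use = meet_at_each_use(def_values, 4, per_use)
result_joined = meet_once_at_join(def_values, 4, joined)
```

Under these assumptions both strategies deliver the same value to every use, but the per-use strategy performs (number of uses) × (number of definitions − 1) meets, whereas the single join performs only (number of definitions − 1).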
SSA form is a direct connection structure that combines the best of both mechanisms. Moreover, its creation does not require the costly traditional methods. However, it can only be used to solve a limited class of data flow problems, such as constant propagation, code motion, and value numbering problems; these problems all require connections from definitions to uses. Def-use propagation based on SSA form (Cytron et al., Sixteenth Annual ACM Symposium on Principles of Programming Languages:25-35 (January 1989)) or its precursors (Reif et al., Conf. Rec. Fourth ACM Symposium on Principles of Programming Languages (1977); Reif et al., SIAM Journal on Computing 11(1):81-93 (February 1982)) can usually avoid the expense of combining the same information repeatedly at each use by combining information as early as possible. However, if def-use chains are explicitly computed by solving Reaching Definitions and Live Variables, then bit vectors are still required.