A compiler is a mechanism that translates a source program written in a high-level, human-readable programming language into an equivalent intermediate representation of the program and, ultimately, into a machine language that can be executed by a computer. An example of a compiler is an IBM XL compiler, and an example of a generated intermediate representation is W-code, both of which are available from the IBM (International Business Machines) Corporation. The intermediate representation provides a stack-based representation of expressions in the program.
Note that the intermediate representation is often not provided in a form that can be executed by a machine. That is, the intermediate representation is more abstract than machine language. The intermediate representation is usually defined over an abstract machine (e.g., a “stack machine”), so its semantics are easier to manage and maintain during compiler optimizations than those of an abstract syntax tree representation (which is much closer to how the source program appears). The intermediate representation is also convenient for the final translation into a machine language that can be executed by a computer.
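As a rough illustration of a stack-based representation of an expression, the following sketch lowers a small expression tree into postfix instructions for a hypothetical stack machine and then evaluates them. The opcode names (PUSH, LOAD, ADD, MUL) and the tuple encoding are assumptions made for this example and are not W-code.

```python
# Hypothetical stack-machine lowering; opcode names are illustrative, not W-code.

def lower(expr):
    """Flatten a nested tuple expression into postfix stack instructions."""
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]          # literal constant
    if isinstance(expr, str):
        return [("LOAD", expr)]          # named symbol
    op, left, right = expr               # binary node: (op, left, right)
    return lower(left) + lower(right) + [(op,)]


def run(code, env):
    """Evaluate the instructions on an operand stack."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        elif instr[0] == "LOAD":
            stack.append(env[instr[1]])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if instr[0] == "ADD" else left * right)
    return stack.pop()


# The expression (s + 1) * 2 as a tree, then as stack code.
code = lower(("MUL", ("ADD", "s", 1), 2))
```

The flat instruction list is the kind of form over which an optimizer can reason without referring back to the shape of the source text.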
A pattern is a set of events or objects that recurs in a periodic fashion. Pattern matching is the act of checking for the presence of the constituents of a given pattern. Pattern matching is used to test whether objects or applications have a desired structure. Pattern matching is also utilized to determine the relevant structure, to retrieve the aligning parts, and to substitute the matching part with something else. Pattern matching of an intermediate representation is a common technique for locating predictable statements and expressions and retrieving specific elements in order to create derived expressions. For example, this technique assists in identifying a loop that includes an induction variable, that is, a variable to which a constant value is added on each loop iteration. Therefore, pattern matching of an intermediate representation is a useful technique in loop optimization.
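The induction-variable case can be sketched as a hand-written matcher over a small expression tree. The node classes (Sym, Const, Add, Store) and the function name are hypothetical and stand in for whatever node types a real intermediate representation provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    name: str


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class Store:
    target: Sym
    value: object


def match_induction_update(stmt):
    """Return (variable name, step) if stmt has the form v = v + c, else None."""
    if not isinstance(stmt, Store) or not isinstance(stmt.value, Add):
        return None
    add = stmt.value
    # Accept both operand orders: v + c and c + v.
    for var, step in ((add.left, add.right), (add.right, add.left)):
        if var == stmt.target and isinstance(step, Const):
            return stmt.target.name, step.value
    return None


# i = i + 4 is an induction update; i = j + 4 is not.
hit = match_induction_update(Store(Sym("i"), Add(Sym("i"), Const(4))))
miss = match_induction_update(Store(Sym("i"), Add(Sym("j"), Const(4))))
```

The matcher both tests the structure and retrieves the parts of interest (the variable and the step), which are the two roles of pattern matching described above.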
Most existing pattern matching code, however, is hand-crafted for specific patterns. While generalized pattern matching code, or a pattern matcher, for a given pattern may be written, doing so is cumbersome for the programmer. In addition, existing pattern matching code does not provide the capability of easily constructing complex pattern matchers and pattern transformers as objects using grammatical building blocks.
The Expression Matching and Transformation Framework (EMTF) can be utilized to easily define pattern matchers for inputs in the W-code intermediate language. The advantages of EMTF are that the patterns defined in the framework can be easily embedded within the compiler code and are similar to the abstract representation of W-code expressions. Unification can also be utilized as a main tool for matching and retrieval. In EMTF, operators can be defined to enhance patterns with logic.
For example, consider patterns p1 and p2. The expression formed by utilizing the ‘or’ operator, indicated by the symbol “∥” (i.e., p1∥p2), matches the input expression tree if and only if p1 matches the input expression tree or p2 matches the input expression tree. EMTF also enables creating patterns dynamically (e.g., creating patterns in a memory pool). This feature can be utilized to create, for example, an “∥ . . . ∥ . . . ∥ . . . expression” (i.e., an ‘or’ expression of multiple patterns) dynamically without having to specify the entire combined pattern in advance.
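The semantics of the ‘or’ operator and the dynamic construction of a multi-way ‘or’ can be sketched as follows. This is not the EMTF interface; the Pattern class, the add_const helper, and the use of Python's `|` operator in place of “∥” are assumptions made for illustration.

```python
from functools import reduce


class Pattern:
    """Hypothetical pattern object wrapping a predicate over expression trees."""

    def __init__(self, predicate, label):
        self.predicate = predicate
        self.label = label

    def matches(self, tree):
        return self.predicate(tree)

    def __or__(self, other):
        # p1 | p2 matches if and only if p1 matches or p2 matches.
        return Pattern(
            lambda tree: self.matches(tree) or other.matches(tree),
            f"({self.label} || {other.label})",
        )


def add_const(symbol, constant):
    """Pattern for the tuple tree ('add', symbol, constant)."""
    return Pattern(lambda tree: tree == ("add", symbol, constant),
                   f"{symbol}+{constant}")


p1 = add_const("s", 1)
p2 = add_const("s", 2)
either = p1 | p2

# An 'or' of many patterns built at run time, without spelling it out in advance.
many = reduce(lambda a, b: a | b, (add_const("s", k) for k in range(1, 6)))
```

Note that the combined pattern built this way tries its alternatives one after another, so its matching cost grows with the number of alternatives.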
FIG. 3 illustrates a prior art program execution sequence 300 for creating patterns utilizing the expression matching and transformation framework (EMTF). As depicted in FIG. 3, a locally controlled memory pool 310 can be declared, and the content of the local pool 310 can be cleared from memory at the end of the current scope. A first pattern 320 can be declared for adding 1 to a symbol s, and the first pattern 320 can be copied to the local pool 310. A second pattern 330 can be declared for adding 1 to the symbol s, and the second pattern 330 can also be copied to the local pool 310. A combined pattern 340 of the first pattern 320 and the second pattern 330 can also be created. Pattern elements that are already in the local pool 310 are not copied again; that is, each pattern element has a single instance in a given memory pool. A large ‘or’ pattern of many patterns, such as the first pattern 320 and the second pattern 330, can also be created in the program execution sequence 300, in which case the matching process sequentially tries one pattern after the other.
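The pool behavior described above (one instance per pattern element, content cleared at the end of the scope) can be approximated in a short sketch. The PatternPool class and its copy_in method are hypothetical names, and patterns are represented as plain tuples purely for illustration.

```python
class PatternPool:
    """Hypothetical pool: one instance per distinct pattern, cleared at scope exit."""

    def __init__(self):
        self._items = {}

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self._items.clear()      # content is released when the scope ends
        return False

    def copy_in(self, pattern):
        # setdefault returns the existing instance if an equal pattern is
        # already in the pool, so no second copy is made.
        return self._items.setdefault(pattern, pattern)

    def __len__(self):
        return len(self._items)


with PatternPool() as pool:
    first = pool.copy_in(("add", "s", 1))
    second = pool.copy_in(("add", "s", 1))
    combined = pool.copy_in(("or", first, second))
    size_in_scope = len(pool)
    same_instance = first is second

size_after_scope = len(pool)
```

In this sketch the two identical patterns share one pool entry, so the pool holds only that entry and the combined pattern while the scope is active.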
The problem associated with the program execution sequence 300 is that, if the resulting ‘or’ pattern includes hundreds of patterns, utilizing the resulting pattern can become very inefficient, and the local memory pool 310 becomes hard to manage. In addition, by EMTF design, when the pattern is matched against an input, it stays matched (i.e., unifiable variables remain bound to elements in the input until the pattern is unbound). Hence, if the pattern needs to be utilized multiple times before it is unbound, for example by multiple threads or in a recursive match-transform sequence, multiple copies of the pattern need to be created. The additional copying is itself inefficient and consequently increases the compiler memory footprint.
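The consequence of a pattern staying matched can be seen in a minimal sketch. The Var and AddOne classes below are assumptions made for this example and do not reproduce EMTF; they only model a unifiable variable that remains bound until it is explicitly unbound.

```python
import copy


class Var:
    """Hypothetical unifiable variable; stays bound until explicitly unbound."""

    def __init__(self, name):
        self.name = name
        self.value = None

    def unify(self, node):
        if self.value is None:
            self.value = node            # first use binds the variable
            return True
        return self.value == node        # later uses must agree with the binding


class AddOne:
    """Pattern for the tuple tree ('add', X, 1), with X a unifiable variable."""

    def __init__(self):
        self.x = Var("X")

    def match(self, tree):
        return (isinstance(tree, tuple) and len(tree) == 3
                and tree[0] == "add" and tree[2] == 1
                and self.x.unify(tree[1]))

    def unbind(self):
        self.x.value = None


pattern = AddOne()
first = pattern.match(("add", "s", 1))    # binds X to "s"
second = pattern.match(("add", "t", 1))   # X is still bound to "s", so this fails

# To match again while keeping the first binding alive, a separate copy is needed.
fresh = copy.deepcopy(pattern)
fresh.unbind()
third = fresh.match(("add", "t", 1))
```

Each nested or concurrent use of the pattern requires another such copy, which is the source of the copying overhead and memory growth noted above.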
Based on the foregoing, it is believed that a need exists for an improved method, apparatus, and computer program product for efficient multiple-pattern based matching of intermediate representation and transformation of intermediate language expression trees, such that pattern matching allows nesting searches and transforms within one another and can be embedded within a source program.