1. Technical Field
The present invention relates to optimization of code checking and more particularly to systems and methods for optimizing code for dynamically-typed computer languages.
2. Description of the Related Art
In dynamically-typed languages, primitive values, e.g., integers, cannot be distinguished statically, i.e., at compile time, from reference values representing pointers to data structures, e.g., objects. Instead, primitive values and reference values need to be distinguished at runtime. In the prior art, types of values are dynamically distinguished in several ways. 1. Tag-bit solution: A tag-bit field is included in each value to indicate whether the value is a primitive value or a reference value. 2. Boxing solution: Primitive values are wrapped (boxed) by reference values, e.g., java.lang.Integer. 3. Tag-word approach: Some compilers, e.g., the Glasgow Haskell Compiler or Icon, use a tag word instead.
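The tag-bit and boxing solutions described above can be sketched as follows. This is only an illustration: the names `HEAP`, `tag_int`, `tag_ref`, `is_primitive`, `untag`, `BoxedInt`, and `add_tagged` are hypothetical and are not taken from any particular runtime, and the choice of the low-order bit as the tag is an assumption made for the example.

```python
# Hypothetical sketch of the tag-bit and boxing representations.
# Machine words are modeled as Python integers; all names are illustrative.

HEAP = []  # stand-in for heap storage; a reference is an index into this list


def tag_int(n):
    """Tag-bit encoding of a primitive integer: payload shifted left, low bit 1."""
    return (n << 1) | 1


def tag_ref(obj):
    """Tag-bit encoding of a reference: heap index shifted left, low bit 0."""
    HEAP.append(obj)
    return (len(HEAP) - 1) << 1


def is_primitive(word):
    """Runtime check: the low bit distinguishes primitives from references."""
    return (word & 1) == 1


def untag(word):
    """Recover the integer payload or the referenced heap object."""
    return (word >> 1) if is_primitive(word) else HEAP[word >> 1]


class BoxedInt:
    """Boxing: the primitive value is wrapped in a heap-allocated object."""

    def __init__(self, value):
        self.value = value


def add_tagged(a, b):
    """Integer addition on tagged words, with the runtime type check it needs."""
    if is_primitive(a) and is_primitive(b):
        return tag_int(untag(a) + untag(b))
    raise TypeError("operands are not both primitive integers")
```

In this sketch every addition pays for a tag test, an untagging shift, and a retagging step, while the boxed form pays for an allocation and an indirection instead.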
Soft typing for LISP aims at removing runtime checks. However, soft typing does not remove the more significant costs arising from 1) redundant representation of values, and 2) heavy overloading of operators. Some other techniques aim at optimizing representations of dynamic values. Untagging and unboxing are techniques that change the representation of objects when their types are known. Escape analysis is a similar technique that permits allocation of an object on a local frame. Many proposals for untagging, unboxing, or escape analysis have limitations when applied to dynamically typed languages. A basic problem with these techniques is that they cannot determine the type of a variable at points where control flow merges and different types of values flow into the same variable.
Specialization and devirtualization aim at reducing the cost of calls by permitting specialized functions. Many dynamically typed languages have heavily overloaded operators. Such overloading should be resolved before calls to such operators are employed or inlined. A problem similar to that of the unboxing and untagging optimizations may arise here as well.
SELF is a dynamically typed language whose compiler uses a splitting optimization, which duplicates some part of the control flow of the program to increase the chance of specialization. In particular, SELF splits the control flow for specific types to achieve devirtualization and inlining. In SELF, each split modifies the code incrementally and locally, so that it is difficult to control the final code after all splitting is applied. SELF's splitting causes code explosion, and some heuristics are needed to avoid this problem.
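The effect of splitting at a control-flow merge can be sketched as follows. This is a hand-written illustration and not output of the SELF compiler; the functions `add_generic`, `add_int`, `concat_str`, `before_split`, and `after_split` are hypothetical names introduced for the example.

```python
# Hypothetical sketch of splitting: code after a merge point is duplicated
# into each predecessor so that each copy sees a single known type.

def add_int(a, b):
    """Specialized integer addition (no type test needed by the caller)."""
    return a + b


def concat_str(a, b):
    """Specialized string concatenation."""
    return a + b


def add_generic(a, b):
    """Overloaded '+': the operand types must be tested at run time."""
    if isinstance(a, int) and isinstance(b, int):
        return add_int(a, b)
    if isinstance(a, str) and isinstance(b, str):
        return concat_str(a, b)
    raise TypeError("unsupported operand types")


def before_split(flag):
    if flag:
        x = 1        # x is an integer on this path
    else:
        x = "one"    # x is a string on this path
    # Merge point: the type of x is unknown here, so the generic,
    # type-testing operator has to be called.
    return add_generic(x, x)


def after_split(flag):
    # The code following the merge is duplicated into both branches, so each
    # copy can call (or inline) the operator specialized for its type.
    if flag:
        x = 1
        return add_int(x, x)
    else:
        x = "one"
        return concat_str(x, x)
```

Each split removes one dynamic test but duplicates the code that follows the merge, which is the source of the code growth noted above when splits are applied repeatedly.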
Loop versioning can duplicate a loop to optimize one version of the loop. However, the effectiveness of loop versioning may be modest, e.g., removal of some array bounds checks inside the loop, each of which involves several machine instructions. Region-based compilation selects a hot path of the code using profiling, and compiles the code only for the hot path. Region-based compilation by itself merely skips the compilation of cold paths; it does not create a hot path by changing the control flow.
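Loop versioning for bounds-check removal can be sketched as follows. The functions `checked_get`, `sum_range`, and `sum_range_versioned` are hypothetical and exist only for this example. Plain indexing in the fast loop stands in for an unchecked access, since Python itself always checks list indices.

```python
# Hypothetical sketch of loop versioning: one guard selects between a fast
# loop without per-iteration bounds checks and the original checked loop.

def checked_get(arr, i):
    """Array access with an explicit bounds check on every call."""
    if i < 0 or i >= len(arr):
        raise IndexError(i)
    return arr[i]


def sum_range(arr, lo, hi):
    """Original loop: a bounds check is executed in every iteration."""
    total = 0
    for i in range(lo, hi):
        total += checked_get(arr, i)
    return total


def sum_range_versioned(arr, lo, hi):
    """Versioned loop: the check is hoisted into a single guard."""
    if 0 <= lo and hi <= len(arr):
        # Fast version: every index in [lo, hi) is known to be in bounds,
        # so the per-iteration check is omitted.
        total = 0
        for i in range(lo, hi):
            total += arr[i]
        return total
    # Slow version: fall back to the original, fully checked loop.
    return sum_range(arr, lo, hi)
```

The saving per iteration is limited to the comparisons inside `checked_get`, which is consistent with the modest benefit described above.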