Field
The present invention relates to computing devices. In particular, but not by way of limitation, the present invention relates to compiling or interpreting scripting code.
Background
More and more programs are utilizing source code constructs that are written in high-level, dynamically-typed programming languages that must be compiled or interpreted before many other activities (e.g., layout calculations and rendering) associated with the constructs can be executed. By way of example, ECMAScript-based scripting languages (e.g., JavaScript® or Flash) are frequently used in connection with the content that hosts them. One of the most ubiquitous dynamically-typed languages is JavaScript, which is run by a JavaScript engine that may be realized by a variety of technologies including interpretation-type engines, profile-guided just-in-time (JIT) compilation (e.g., trace-based or function-based), and traditional function-based JIT compilation where native code is generated for the entire body of all the functions that get executed. Other dynamically-typed programming languages can be run by similar engines.
In virtual machines for dynamically-typed programming languages (e.g., JavaScript), performance is largely determined by characteristics of the global type state. Global type state can be thought of as a description of all program behavior and invariants across either a single run of a program or multiple runs. In a statically-typed programming language, global type state includes classes, class members, types of members, parameters, and variables, as well as any other type or structural information expressed explicitly or implicitly in the program source code. Programs written in static languages are usually faster to execute than those written in dynamic languages because type information is fully specified in source code at compile time, and optimized code is generated based on it. Additionally, because type state does not change at run time in statically-typed programs, run-time type checks to detect and verify the current types of the program variables are not necessary. However, programmers sometimes prefer to use dynamically-typed languages rather than statically-typed languages for several reasons, such as increased flexibility and simplicity. One tradeoff of using dynamically-typed languages is that aspects of the global type state can change at run time, which makes the compilation of optimized code imprecise, and sometimes wasteful.
Automatic vectorization is a special case of automatic parallelization in which a compiler converts a program from a scalar form, which processes a single pair of operands at a time, to a vector form, which processes multiple pairs of operands at once using a single vector operation. The conversion happens in the intermediate representation of the program that the compiler maintains internally after parsing the high-level source code (e.g., C, C++, Java, JavaScript) of the input program. The compiler then generates machine code using vector instructions.
The compiler first analyzes the dependencies in its intermediate representation of the program to determine whether it is safe to transform the program to the vector form. It then generates machine code by selecting from the vector instructions available in the processor.
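By way of illustration only, and not by way of limitation, the difference between the scalar form and the vector form may be sketched as follows. The sketch is written in Python purely as a stand-in notation. The names `VECTOR_WIDTH`, `add_scalar`, `vector_add`, and `add_vectorized` are hypothetical and do not appear in any particular compiler, and `vector_add` merely emulates in software what a single hardware vector instruction would do across its lanes. The two input sequences are assumed to have equal length.

```python
VECTOR_WIDTH = 4  # hypothetical number of lanes in one vector register


def add_scalar(a, b):
    """Scalar form: one pair of operands is processed per iteration."""
    out = []
    for i in range(len(a)):
        out.append(a[i] + b[i])
    return out


def vector_add(x, y):
    """Software stand-in for one vector instruction that adds all lanes at once."""
    return [p + q for p, q in zip(x, y)]


def add_vectorized(a, b):
    """Vector form: VECTOR_WIDTH pairs of operands are processed per iteration."""
    n = len(a)
    main = n - n % VECTOR_WIDTH  # largest multiple of VECTOR_WIDTH not exceeding n
    out = []
    for i in range(0, main, VECTOR_WIDTH):
        out.extend(vector_add(a[i:i + VECTOR_WIDTH], b[i:i + VECTOR_WIDTH]))
    # Leftover elements that do not fill a whole vector are handled in scalar form.
    for i in range(main, n):
        out.append(a[i] + b[i])
    return out
```

In this sketch each iteration of the main loop is independent of every other iteration, which is the kind of property the dependency analysis would need to establish before the transformation could be considered safe.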
One of the requirements for performing vectorization is that the “type” of the variables that are grouped into a vector operand (e.g., the types of the different elements in an array) be the same and be statically determinable (e.g., completely known at compile time). This enables a uniformly packed (or known-pattern) data layout for the vector operand and enables selection of the specific type of vector instruction. A challenge in performing vectorization for dynamically-typed languages (e.g., JavaScript), however, is that the “type” (e.g., “integer,” “floating point,” “string,” “character,” and “object”) of a variable/operand is not statically defined (i.e., at compile time) and can change during execution.
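The type-uniformity requirement, and the way a dynamically-typed program can violate it during execution, may be illustrated by the following minimal sketch. Python is used here only as a convenient dynamically-typed stand-in for a language such as JavaScript. The helper names `uniform_element_type` and `can_pack_as_vector` are hypothetical, and treating only integer and floating-point elements as packable is an assumption made for the example.

```python
def uniform_element_type(values):
    """Return the single type shared by every element, or None if mixed or empty."""
    if not values:
        return None
    first = type(values[0])
    if all(type(v) is first for v in values):
        return first
    return None


def can_pack_as_vector(values):
    """Assumed rule: only uniformly integer or uniformly floating-point data packs."""
    return uniform_element_type(values) in (int, float)


data = [1, 2, 3, 4]
packable_before = can_pack_as_vector(data)  # all elements are integers

data[2] = "three"  # the type of one element changes during execution
packable_after = can_pack_as_vector(data)   # elements are no longer uniform
```

Because the assignment to `data[2]` is legal at any point in the program, a check of this kind cannot be resolved once at compile time in such a language; in this sketch it can only be answered by inspecting the values at run time.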