Certain high-level languages, such as MATLAB, SETL and APL, enjoy immense popularity in domains such as signal and image processing, and are often the languages of choice for fast prototyping, data analysis and visualization. MATLAB is a proprietary programming language developed by The MathWorks, Inc. The simplicity and ease of use of the language, coupled with the interactive nature of the MATLAB system, make it a productive environment for program development and analysis. However, MATLAB programs execute slowly. One solution to the problem is to develop compilers that translate MATLAB programs to C code, and then to compile the generated C code into machine code that runs much faster than the original MATLAB source. This approach is a difficult one, primarily because the MATLAB language lacks declarations of variable type and shape. Therefore, any automated approach must first contend with the problems of automatic type and shape inferencing.
Array shape inferencing refers to the problem of deducing, at compile time, an array's dimensionality and its extent along each dimension. Shape inferencing in languages such as MATLAB is a difficult task, primarily because such languages do not explicitly declare the shape of an array. Because these languages bind storage to names dynamically and allow an array's basic data type and shape to change at run time, they are typically translated by interpreters rather than compilers. Shape inferencing is therefore desirable from the standpoint of program compilation, since inferred shapes enable compile-time array conformability checking, memory preallocation optimizations, and efficient translations to “scalar” target languages.
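As a rough illustration of what inferred shapes make possible, the following Python sketch propagates shapes through a few operators and rejects non-conformable operands before anything is executed. The function names and the restriction to two-dimensional shapes are assumptions made for this example, and the addition rule is the simple one in which operands must agree unless one is a scalar. It is a sketch only, not a description of any particular compiler.

```python
from typing import Tuple

Shape = Tuple[int, int]  # (rows, columns); 2-D only, an assumption of this sketch


def is_scalar(s: Shape) -> bool:
    return s == (1, 1)


def shape_of_add(a: Shape, b: Shape) -> Shape:
    """Shape of a + b: a scalar operand adapts to the other operand."""
    if is_scalar(a):
        return b
    if is_scalar(b):
        return a
    if a != b:
        raise ValueError(f"non-conformable operands for +: {a} and {b}")
    return a


def shape_of_mtimes(a: Shape, b: Shape) -> Shape:
    """Shape of the matrix product a * b, with the scalar special case."""
    if is_scalar(a):
        return b
    if is_scalar(b):
        return a
    if a[1] != b[0]:
        raise ValueError(f"inner dimensions disagree for *: {a} and {b}")
    return (a[0], b[1])


def shape_of_transpose(a: Shape) -> Shape:
    """Shape of a': rows and columns swap."""
    return (a[1], a[0])


# Shape of (A * B)' + c for A: 3x4, B: 4x5, c: scalar
result = shape_of_add(shape_of_transpose(shape_of_mtimes((3, 4), (4, 5))), (1, 1))
print(result)  # (5, 3)
```

Because the result shape is known without running the program, a translator could preallocate a 5-by-3 buffer and emit plain loops over it.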
When the shape of a MATLAB program variable is not statically determinable, researchers have usually approached the problem by generating code that performs the inference at execution time. This code relies on compiler-generated ancillary variables called shadow variables. The methodology is described in Luiz Antonio De Rose's Ph.D. dissertation titled Compiler Techniques for MATLAB Programs, and in the journal paper titled Techniques for the Translation of MATLAB Programs into Fortran 90 by Luiz Antonio De Rose and David A. Padua. Both of these works are incorporated by reference herein. Though such an approach is robust, it offers no opportunity to propagate an expression's shape across statements when that shape is unknown at compile time. That is, once shadow variables are introduced, useful shape information that could otherwise be propagated across expressions is obscured.
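To make the idea of shadow variables concrete, the Python sketch below imitates the kind of code a translator might emit for a statement such as c = a * b when neither operand's shape is known statically. It is only an illustration: the function name, the flat row-major storage, and the `_rows`/`_cols` naming are assumptions of this example and are not taken from the De Rose and Padua works.

```python
def generated_mtimes(a, a_rows, a_cols, b, b_rows, b_cols):
    """Hypothetical run-time code for c = a * b with statically unknown shapes.

    a and b are flat row-major lists; the *_rows and *_cols arguments play the
    role of their shadow variables. Returns c with its own shadow variables.
    """
    if a_rows == 1 and a_cols == 1:        # scalar * matrix
        c_rows, c_cols = b_rows, b_cols
        c = [a[0] * x for x in b]
    elif b_rows == 1 and b_cols == 1:      # matrix * scalar
        c_rows, c_cols = a_rows, a_cols
        c = [x * b[0] for x in a]
    else:                                  # general matrix product
        if a_cols != b_rows:
            raise ValueError("inner matrix dimensions must agree")
        c_rows, c_cols = a_rows, b_cols
        c = [0.0] * (c_rows * c_cols)      # allocation size known only at run time
        for i in range(c_rows):
            for j in range(c_cols):
                c[i * c_cols + j] = sum(
                    a[i * a_cols + k] * b[k * b_cols + j] for k in range(a_cols)
                )
    return c, c_rows, c_cols


# A 2x2 matrix times a 2x1 column vector
c, c_rows, c_cols = generated_mtimes([1, 2, 3, 4], 2, 2, [5, 6], 2, 1)
print(c, c_rows, c_cols)  # [17, 39] 2 1
```

Every branch and every allocation here depends on values that exist only at execution time, which is why nothing about the shape of c can be carried forward to later statements during compilation.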
Previous attempts at automated approaches to inferencing revolved around the type determination problem. These were based on special mathematical structures called lattices, which are described in standard texts on discrete mathematics. Among the first of these attempts was the type inferencing work of Marc A. Kaplan and Jeffrey D. Ullman. In a paper titled A Scheme for the Automatic Inference of Variable Types, which is hereby incorporated by reference, they proposed a general mathematical framework based on the theory of lattices that automatically inferred the types of variables in a model of computation that was an abstraction of programming languages such as APL, SETL and SNOBOL. Though the Kaplan and Ullman procedure can be carried over to MATLAB in a straightforward manner to solve the type inferencing problem there as well, the same cannot be said of shape inferencing. For the Kaplan and Ullman approach to work, the type functions that model the type semantics of the language's operators must be monotonic with respect to the defined lattice. For some of MATLAB's built-in functions, such as matrix multiply, it can be shown that the shape-tuple function that models the operation's shape semantics is not monotonic with respect to any lattice that can be defined on the set of shape-tuples. Thus, existing lattice-based techniques have only limited scope for array shape inferencing in MATLAB.
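The monotonicity requirement can be illustrated with a small Python sketch. It uses a hypothetical chain of types ordered from least to most general, a type function for addition, a brute-force monotonicity check, and a simple fixed-point iteration for a variable updated in a loop. The lattice, the names, and the typing rule for addition are all assumptions chosen for brevity. This is a generic lattice-based sketch, not the Kaplan and Ullman procedure and not MATLAB's actual type rules.

```python
from itertools import product

# Hypothetical chain lattice: bottom <= bool <= int <= real <= complex
TYPES = ["bottom", "bool", "int", "real", "complex"]
RANK = {t: i for i, t in enumerate(TYPES)}


def leq(a, b):
    """Lattice order: a is at most as general as b."""
    return RANK[a] <= RANK[b]


def join(a, b):
    """Least upper bound of two types in the chain."""
    return a if RANK[a] >= RANK[b] else b


def type_of_add(a, b):
    """Assumed type function for a + b: at least int, else the more general operand."""
    if "bottom" in (a, b):
        return "bottom"
    return join(join(a, b), "int")


def is_monotonic(f):
    """Check that a <= c and b <= d imply f(a, b) <= f(c, d) over the whole lattice."""
    return all(
        leq(f(a, b), f(c, d))
        for a, b, c, d in product(TYPES, repeat=4)
        if leq(a, c) and leq(b, d)
    )


def infer_loop_variable(initial, step):
    """Type of x once `x = initial; loop: x = x + step` reaches a fixed point."""
    x = initial
    while True:
        new_x = join(x, type_of_add(x, step))
        if new_x == x:
            return x
        x = new_x


def contrived(a, b):
    """Deliberately non-monotonic function, included only to exercise the check."""
    return "complex" if a == "bool" else "int"


print(is_monotonic(type_of_add))          # True
print(is_monotonic(contrived))            # False
print(infer_loop_variable("int", "real")) # real
```

With a monotonic type function on a finite lattice, the iteration can only move upward and therefore terminates. A function like `contrived` gives no such guarantee, which is the kind of obstacle the text attributes to shape-tuple functions.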