The present invention relates generally to the field of computer systems and, more particularly, to systems and methods for compiling source programs, especially object-oriented ones, into optimized object code.
Before a digital computer may accomplish a desired task, it must receive an appropriate set of instructions. Executed by the computer's microprocessor, these instructions, collectively referred to as a "computer program," direct the operation of the computer. Naturally, the computer must understand the instructions which it receives before it may undertake the specified activity.
Owing to their digital nature, computers essentially understand only "machine code," i.e., the low-level, minute instructions for performing specific tasks--sequences of ones and zeros that are interpreted as specific instructions by the computer's microprocessor. Since machine language or machine code is the only language computers actually understand, all other programming languages represent ways of structuring human language so that humans can get computers to perform specific tasks.
While it is possible for humans to compose meaningful programs in machine code, practically all software development today employs one or more of the available programming languages. The most widely used programming languages are the "high-level" languages, such as C or Pascal. These languages allow data structures and algorithms to be expressed in a style of writing which is easily read and understood by fellow programmers.
A program called a "compiler" translates programs written in these high-level languages into the requisite machine language. In the context of this translation, the program which is written in the high-level language is called the "source code" or source program. The low-level or machine language, on the other hand, comprises "object code." Once created, object code (e.g., an .obj file) is a separate program in its own right--it includes instructions which may be executed by the target microprocessor. In practice, however, the object code is usually first linked (i.e., combined) with other object code or libraries, which include standard routines.
Compilers are fundamental to modern computing. Translating human-oriented programming languages into computer-oriented machine languages, compilers allow computer programmers to ignore the machine-dependent details of machine language. Moreover, high-level languages are "portable," a feature which permits a single program to be implemented on several different machines, including ones of vastly different architecture. In this instance, the source program is "ported" (transferred) from one machine to another with little or no revision; instead, the program is simply re-compiled for each target machine. Thus, compilers allow programs and programming expertise to be machine-independent.
A compiler performs two basic tasks: analysis of the source program and synthesis of a machine-language program which instructs the computer to perform the task described by the source program. Most compilers are syntax driven, i.e., the compilation process is directed by the syntactic structure of the source program, as recognized by a compiler's parser. The parser builds the structure out of tokens, the lowest-level symbols used to define a programming language's syntax. This recognition of syntactic structure is a major part of the analysis task. Semantic routines actually supply the meaning (semantics) of the program, based on the syntactic structures. The semantic routines generate the target code or, optionally, some intermediate representation thereof.
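The analysis and synthesis tasks described above can be illustrated with a minimal sketch in C++. It is not taken from the invention: the toy expression grammar, the names `Tok`, `Token`, `tokenize`, `Parser`, and `compile`, and the stack-machine mnemonics `PUSH`, `ADD`, and `MUL` are all assumptions chosen for illustration. A tokenizer produces tokens, a recursive-descent parser recognizes the syntactic structure, and small semantic routines embedded in the parser emit target code as each structure is recognized.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy language: integers, '+', '*', and parentheses.
enum class Tok { Num, Plus, Star, LParen, RParen, End };
struct Token { Tok kind; int value; };

// Lexical analysis: group characters into tokens.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isdigit(c)) {
            int v = 0;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i])))
                v = v * 10 + (src[i++] - '0');
            out.push_back({Tok::Num, v});
            continue;
        }
        switch (c) {
            case '+': out.push_back({Tok::Plus, 0}); break;
            case '*': out.push_back({Tok::Star, 0}); break;
            case '(': out.push_back({Tok::LParen, 0}); break;
            case ')': out.push_back({Tok::RParen, 0}); break;
            default: throw std::runtime_error("unexpected character");
        }
        ++i;
    }
    out.push_back({Tok::End, 0});  // sentinel: the parser never reads past it
    return out;
}

// Syntax-driven translation for the assumed grammar:
//   expr   := term ('+' term)*
//   term   := factor ('*' factor)*
//   factor := Num | '(' expr ')'
class Parser {
public:
    explicit Parser(std::vector<Token> toks) : toks_(std::move(toks)) {}

    std::vector<std::string> run() {
        expr();
        if (peek().kind != Tok::End) throw std::runtime_error("unexpected trailing token");
        return code_;
    }

private:
    const Token& peek() const { assert(pos_ < toks_.size()); return toks_[pos_]; }
    bool accept(Tok k) { if (peek().kind != k) return false; ++pos_; return true; }

    void expr() {
        term();
        while (accept(Tok::Plus)) { term(); code_.push_back("ADD"); }  // semantic routine
    }
    void term() {
        factor();
        while (accept(Tok::Star)) { factor(); code_.push_back("MUL"); }  // semantic routine
    }
    void factor() {
        if (peek().kind == Tok::Num) {
            code_.push_back("PUSH " + std::to_string(peek().value));  // semantic routine
            ++pos_;
            return;
        }
        if (accept(Tok::LParen)) {
            expr();
            if (!accept(Tok::RParen)) throw std::runtime_error("expected ')'");
            return;
        }
        throw std::runtime_error("expected number or '('");
    }

    std::vector<Token> toks_;
    std::size_t pos_ = 0;
    std::vector<std::string> code_;  // emitted stack-machine instructions
};

// Analysis followed by synthesis for one source string.
std::vector<std::string> compile(const std::string& src) {
    return Parser(tokenize(src)).run();
}
```

Under these assumptions, calling `compile("1 + 2 * 3")` yields the instruction sequence `PUSH 1`, `PUSH 2`, `PUSH 3`, `MUL`, `ADD`: the multiplication is emitted first because the grammar, not the code generator, encodes operator precedence. A production compiler would typically build an intermediate representation rather than emit target code directly from the parser.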
Ideally, when a compiler translates a description of an application and maps it onto the underlying machine-level instruction set of a target processor, the resulting code should be at least as good as can be written by hand. In reality, code created by straightforward compilation rarely achieves this goal; instead, tradeoffs of slower performance and/or increased size of the executing application are usually incurred. While compilers simplify the task of creating meaningful programs, they rarely produce machine code which is both the most compact in size and the fastest in execution.
Object-oriented programming languages (OOPLs), such as C++, entail even further difficulties. In particular, data encapsulation, inheritance, and polymorphism--the main advantages of OOPLs--all increase the difficulty of implementing optimizing techniques. As a result, optimization efforts to date have been largely restricted to straight procedural (e.g., C) compilers.
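One way to see why these features complicate optimization is the following C++ sketch. The classes `Shape`, `Square`, and `Rectangle` and the functions `total_area`, `total_square_area`, and `demo_shapes` are hypothetical names introduced only for illustration. In the polymorphic loop, the function actually called by `s->area()` depends on the run-time type of each object, so a compiler looking only at `total_area` generally cannot tell which body to inline. In the procedural loop, the computation is fully visible at compile time. Whether a particular compiler can nonetheless resolve such a virtual call depends on the compiler and on how much of the program it can see.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical class hierarchy: data is encapsulated behind a virtual interface.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;  // resolved at run time through dynamic dispatch
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) { assert(side >= 0.0); }
    double area() const override { return side_ * side_; }
private:
    double side_;  // encapsulated: reachable only through member functions
};

class Rectangle : public Shape {
public:
    Rectangle(double w, double h) : w_(w), h_(h) { assert(w >= 0.0 && h >= 0.0); }
    double area() const override { return w_ * h_; }
private:
    double w_, h_;
};

// Polymorphic version: the target of each area() call is not known from
// this function alone, which makes inlining and related optimizations harder.
double total_area(const std::vector<std::unique_ptr<Shape>>& shapes) {
    double sum = 0.0;
    for (const auto& s : shapes) sum += s->area();
    return sum;
}

// Procedural counterpart: the arithmetic is visible at compile time, so
// conventional loop and expression optimizations apply directly.
double total_square_area(const std::vector<double>& sides) {
    double sum = 0.0;
    for (double side : sides) sum += side * side;
    return sum;
}

// Small helper that builds sample data for experimenting with total_area.
std::vector<std::unique_ptr<Shape>> demo_shapes() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::make_unique<Square>(2.0));
    shapes.push_back(std::make_unique<Rectangle>(3.0, 0.5));
    return shapes;
}
```

Both functions compute a sum of areas, but only the second exposes the complete computation to a compiler that analyzes one function at a time.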