In general, compilers translate human readable source code into machine readable object code in preparation for execution. A compiler is designed to receive human readable source code in an expected format. When the source code is written in a format not expected by the compiler, then compilation aborts and an exception is generated to identify the unexpected format. There are many causes of compilation errors known in the art. If compilation aborts, the object code is not available to be executed.
Traditionally, source code is compiled during the software development stage when software developers are still present to repair any errors in the program. Once the errors are repaired, the source code is compiled into object code (also known as native code or machine code) for a target platform and tested. After testing, the object code can be utilized by any compatible system. Software developers often develop object code for different systems and platforms.
A number of terms are commonly used in the computer arts to describe the process of transforming a computer program into a machine usable format. The following terms present a broad and general overview of the methods known in the arts to perform this transformation.
Compilers. High level language compilers take human readable source code as input and translate it into machine readable object code. Traditionally, compilation of the entire program was completed before any portion of the program was executed.
Interpreters. Generally, an interpreter is a program that translates a program from one language to another and then, typically a short time thereafter, executes the translated code. In some cases the code translated by the interpreter is not cached (or saved in memory). In those cases, if an already executed section of code needs to be executed again, it must be reinterpreted.
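The cost of reinterpretation can be illustrated with a minimal sketch. The toy statement format ("ADD 2", "MUL 3"), the translate function and the Interpreter class are assumptions made for illustration only and do not correspond to any particular interpreter. Each statement is translated immediately before it is executed and the result is discarded, so running the same section three times causes it to be translated three times.

```python
from typing import Callable, List


def translate(statement: str) -> Callable[[int], int]:
    """Translate one toy statement into a Python callable (the second language)."""
    op, arg = statement.split()
    n = int(arg)
    if op == "ADD":
        return lambda acc: acc + n
    if op == "MUL":
        return lambda acc: acc * n
    raise ValueError(f"unknown operation: {op}")


class Interpreter:
    """Hypothetical non-caching interpreter: translate, execute, discard."""

    def __init__(self) -> None:
        self.translations = 0  # number of times translate() was invoked

    def run(self, program: List[str], acc: int = 0) -> int:
        for statement in program:
            step = translate(statement)  # translated just before use
            self.translations += 1
            acc = step(acc)              # executed; the translation is not saved
        return acc


interp = Interpreter()
section = ["ADD 2", "MUL 3"]
result = 0
for _ in range(3):                       # the same section is executed three times
    result = interp.run(section, result)

print(result, interp.translations)
```

In this sketch the two-statement section is translated six times in total, once per statement per pass, because nothing is retained between passes.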
Virtual Machines. The term virtual machine carries several meanings. Virtual machine is used to describe a run-time computing environment that enables a program to run in a computing environment other than the computing environment for which the program was originally designed. For example, a 32-bit operating system server could present a run-time virtual machine environment to run non-native 16-bit applications. The term virtual machine is also used to describe the Java run-time environment. The Java run-time environment is modeled on an earlier system called UCSD Pascal. UCSD Pascal takes Pascal source code and translates it into an intermediate language code that can be further interpreted (translated) at run-time into object code and executed in the native environment. The intermediate language code is usable by any computer with an interpreter capable of translating the intermediate language code into native object code. Such an interpreter is often called a virtual machine.
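As a rough illustration of an interpreter acting as a virtual machine for intermediate language code, the following sketch executes a small stack-based instruction list. The opcodes PUSH, ADD and MUL and the run_vm function are hypothetical; they are not the actual UCSD Pascal or Java instruction sets.

```python
from typing import List, Tuple

# Hypothetical intermediate code: tuples of (opcode, optional operand).
IntermediateCode = List[Tuple]


def run_vm(code: IntermediateCode) -> int:
    """Execute the intermediate code on a simple operand stack."""
    stack: List[int] = []
    for op, *args in code:
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


# Intermediate code for the expression (2 + 3) * 4.
code = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",)]
value = run_vm(code)
print(value)
```

The same instruction list could be executed on any machine for which a compatible run_vm exists, which is the portability property described above.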
Just-in-time (JIT) Compilers. Just like a traditional compiler, a JIT compiler translates a program from a first language to a second language. However, a JIT compiler often delays the translation until the section or method of the program in the first language is needed for execution in a second language. The decision of exactly when to translate a section of code from a first language into a second language may vary broadly. Preferably, the code will be translated in time for it to be consumed by the program or processor expecting code in the second language.
The term JIT compiler is generally used in the context of a two phase translation process. The first phase of translation starts with an initial language representation and translates it into an intermediate language representation. The second phase of translation starts with an intermediate language representation and translates it into a subsequent language representation. Often, the JIT compiler is associated with the second phase of translation where the intermediate language code is translated into a more machine friendly language representation. If the machine friendly language representation is object code, then the object code will be used directly by the processor. However, if the output of the JIT compiler is an intermediate language code, then it must be further translated (e.g., by a virtual machine or an interpreter) before it can be utilized by the native system. Often, JIT compilers translate intermediate language code into native code once, and then store (or cache) the translated code in case sections of code are run multiple times.
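The two phase process and the caching of translated code can be sketched as follows. The function compile_to_il, the class JitRuntime and the toy statement syntax are assumptions for illustration; Python callables stand in for native code. The first phase produces intermediate code ahead of time, and the second phase translates a method only when it is first called, then reuses the cached result.

```python
from typing import Callable, Dict, List, Tuple

Instr = Tuple[str, int]


def compile_to_il(source: str) -> List[Instr]:
    """Phase 1: translate "ADD 2; MUL 3" into (opcode, operand) pairs."""
    il: List[Instr] = []
    for stmt in source.split(";"):
        op, arg = stmt.split()
        il.append((op.upper(), int(arg)))
    return il


class JitRuntime:
    """Phase 2: translate a method on first call and cache the result."""

    def __init__(self, methods: Dict[str, List[Instr]]) -> None:
        self.methods = methods
        self.cache: Dict[str, Callable[[int], int]] = {}
        self.jit_count = 0  # how many methods have been translated so far

    def _jit(self, name: str) -> Callable[[int], int]:
        self.jit_count += 1
        steps: List[Callable[[int], int]] = []
        for op, n in self.methods[name]:
            if op == "ADD":
                steps.append(lambda acc, n=n: acc + n)
            elif op == "MUL":
                steps.append(lambda acc, n=n: acc * n)
            else:
                raise ValueError(f"unknown opcode: {op}")

        def compiled(acc: int) -> int:
            for step in steps:
                acc = step(acc)
            return acc

        return compiled

    def call(self, name: str, acc: int) -> int:
        if name not in self.cache:           # translate only when first needed
            self.cache[name] = self._jit(name)
        return self.cache[name](acc)         # later calls reuse the cached code


runtime = JitRuntime({
    "f": compile_to_il("ADD 2; MUL 3"),
    "g": compile_to_il("ADD 1"),
})
first = runtime.call("f", 1)
second = runtime.call("f", 2)
print(first, second, runtime.jit_count, sorted(runtime.cache))
```

In this sketch method "g" is never translated because it is never called, and method "f" is translated once even though it runs twice.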
Translators. A translator is generally a program that translates a program from one language into another. The second language representation of the program can be either an intermediate language or machine code. Translation is generally utilized in order to transform a program along the spectrum from a more human readable format into a more machine friendly format. For this specification, compilers, JIT compilers, translators, interpreters, assemblers, virtual machines and any other automated program and/or process that translates from a first language representation to a second language representation will be called a translator.
The process of translation is typically so complex that it is often logically (and potentially physically) broken down into a series of sub-units. A partial list of such logical sub-units would include lexical analysis, syntactical analysis, intermediate code creation, code optimization, code generation, table management, type checking, code verification and error handling. A more complete enumeration of the potential translation sub-units is well known in the arts. For each of the potential sub-units employed to translate a program from a first language (e.g., source code or intermediate code) to a second language (e.g., intermediate code, object code, or interpretable code), something can go wrong with the process. When a sub-unit of translation aborts translation because it is unable to resolve the meaning of one or more instructions, those instructions are deemed unresolvable instructions (or unresolvable code). (See Unresolvable Translation Errors below.) If the translator is unable to resolve the translation in a meaningful or unambiguous way, then translation aborts.
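The division into sub-units, and the way a failure in any one of them aborts the whole translation, can be sketched with three of the listed stages. The TranslationError class, the stage functions and the toy statement format are hypothetical names chosen for this illustration; a real translator would have many more stages.

```python
from typing import List, Tuple


class TranslationError(Exception):
    """Raised by whichever sub-unit cannot resolve its input."""

    def __init__(self, stage: str, detail: str) -> None:
        super().__init__(f"{stage}: {detail}")
        self.stage = stage


def lexical_analysis(source: str) -> List[List[str]]:
    # Split each non-empty line into whitespace-separated tokens.
    return [line.split() for line in source.splitlines() if line.strip()]


def syntactic_analysis(tokens: List[List[str]]) -> List[Tuple[str, int]]:
    parsed = []
    for statement in tokens:
        try:
            op, arg = statement          # exactly two tokens are expected
            parsed.append((op, int(arg)))
        except ValueError:
            raise TranslationError("syntactic analysis",
                                   f"malformed statement {statement}")
    return parsed


KNOWN_OPS = {"ADD": "+", "MUL": "*"}


def code_generation(parsed: List[Tuple[str, int]]) -> List[str]:
    output = []
    for op, n in parsed:
        if op not in KNOWN_OPS:          # the instruction cannot be resolved
            raise TranslationError("code generation",
                                   f"unresolvable instruction {op}")
        output.append(f"acc = acc {KNOWN_OPS[op]} {n}")
    return output


def translate(source: str) -> List[str]:
    # Any stage may raise, which aborts translation of the whole input.
    return code_generation(syntactic_analysis(lexical_analysis(source)))


print(translate("ADD 2\nMUL 3"))
```

Calling translate("ADD 2\nFROB 1") in this sketch raises a TranslationError from the code generation stage, and no output is produced for the valid first statement.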
For many reasons, including interoperability and security, intermediate language code has been gaining popularity. Intermediate language code is a representation of a program that was translated from an initial language code representation of the program. Further, the intermediate language code representation of the program will be translated into a subsequent language code representation of the program some time before the program is executed. Often, source code is translated into an intermediate language code in a first translation. This intermediate language code is then translated into a native object code in a second translation. The first and second translations can occur on the same or different computers. Potentially, a program may be translated into different intermediate language code representations in multiple successive intermediate phases of representation. Any computer that has a translator capable of translating any phase of intermediate language code into a utilizable subsequent language code can use the functionality of the program represented by the intermediate language code.
Developers can author a source code program once, translate it into intermediate language code and distribute it to any platform with an intermediate language code translator. A universal intermediate language code is efficient since it is easier for each computer to support one or two (or a few) intermediate language code translators than to support multiple different (traditional) source code compilers. Further, by the time a program reaches an intermediate language code phase, much of the difficult or time consuming analysis may already have been computed by a prior translating phase, thereby reducing the workload for a subsequent translator.
However, multiple stage translation creates a potential new set of problems. In subsequent translations, if a next translator throws a translation exception, the source code developers may no longer be present to fix any translation errors. So if the intermediate language translator aborts translation, the program represented by the intermediate language code will not be usable. The intermediate language code is useless even if the section of unresolvable instruction(s) that caused the translation to abort would never have been executed in a specific case after translation. It is not unusual for large portions of a computer program to not be executed during any individual execution case.
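The problem can be made concrete with a small sketch, assuming a hypothetical translator that processes every method of an intermediate language program before any of it runs. The program layout, the opcode names and the helper reachable_from are illustrative assumptions. The method "unused" contains an unresolvable instruction but is never called from "main", yet its presence alone makes the whole program unusable.

```python
from typing import Dict, List, Set

KNOWN_OPS = {"ADD", "MUL", "CALL"}


class TranslationError(Exception):
    pass


def translate_program(program: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Translate every method up front; abort on the first unresolvable one."""
    translated = {}
    for name, body in program.items():
        for instr in body:
            op = instr.split()[0]
            if op not in KNOWN_OPS:
                raise TranslationError(
                    f"{name}: unresolvable instruction {instr!r}")
        translated[name] = list(body)    # stand-in for real code generation
    return translated


def reachable_from(program: Dict[str, List[str]], entry: str) -> Set[str]:
    """Collect the methods that could actually run when starting at entry."""
    seen: Set[str] = set()
    pending = [entry]
    while pending:
        name = pending.pop()
        if name in seen:
            continue
        seen.add(name)
        for instr in program[name]:
            op, arg = instr.split()
            if op == "CALL":
                pending.append(arg)
    return seen


program = {
    "main": ["ADD 2", "CALL helper"],
    "helper": ["MUL 3"],
    "unused": ["FROB 1"],                # unresolvable, never called from main
}

try:
    translate_program(program)
    outcome = "translated"
except TranslationError as err:
    outcome = f"aborted ({err})"

print(outcome)
print(sorted(reachable_from(program, "main")))
```

In this sketch translation aborts on "unused" even though only "main" and "helper" are reachable from the entry point, so no execution starting at "main" would ever have needed the offending instruction.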