Generally speaking, a translator is a computer program that receives as its input a program written in one programming language and produces as its output a program in another programming language. Translators that receive a high-level source language (e.g., C++, Java, etc.) as input and generate a low-level language such as assembly language or machine language as output are sometimes referred to more specifically as compilers. The process of translation within a compiler generally consists of multiple phases. FIG. 1 is a flow chart showing one such breakdown of the phases of a compiler. The source program, represented in source code, is received at 110. Then, at 120, the lexical analyzer separates the characters of the source code into logical groups referred to as tokens. The tokens may be keywords in the syntax of the source language such as IF or WHILE, operators such as + and −, identifiers, and punctuation symbols. At 130, the syntax analyzer groups the tokens together into syntactic structures such as an expression or a statement. At 140, an intermediate representation (IR) of the source code, including the exception handling constructs, is generated to facilitate compiler back end operations such as code optimization at 150 and then code generation at 160. There can be multiple intermediate representations within a compiler process. During the code optimization phase 150, various techniques may be directed to improving the intermediate representation generated at 140 so that the ultimate object code runs faster and uses less memory. During the final phase at 160, the code generator produces the target program (object code) 170 to be executed by a processor.
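By way of illustration only, the lexical analysis step at 120 can be sketched as follows. The token classes, the regular expressions, and the function name `tokenize` are assumptions made for this example and are not taken from any particular compiler; the keyword set uses only the IF and WHILE examples given above.

```python
import re

# Hypothetical token classes for a toy source language (illustration only).
KEYWORDS = {"IF", "WHILE"}
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OPERATOR", r"[+\-*/=<>]"),
    ("PUNCT", r"[();{}]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Group the characters of `source` into (kind, text) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not itself a token
        if kind == "ERROR":
            raise ValueError(f"unexpected character {text!r}")
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"  # reserved words are recognized after matching an identifier
        tokens.append((kind, text))
    return tokens
```

For example, `tokenize("WHILE (x < 10) x = x + 1;")` yields a keyword token for WHILE followed by punctuation, identifier, operator, and number tokens, which a syntax analyzer such as the one at 130 could then group into a statement.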
Exception handling is invoked when a flaw in the source program is detected. In existing compiler frameworks, exception handling constructs within the source program are processed separately from the main control flow of the intermediate representation. Traditionally, exception handling constructs are not explicitly represented in the control flow of the intermediate representation. In one well-known technique, regions within the source code where exception handling constructs are detected are delimited from the main control flow and thus are not subject to the same code optimization techniques as the main control flow. In another method, the exception handling constructs are captured in a table outside of the main control flow, and the compiler back end processes them separately. Thus, there is a need for an intermediate representation of exception handling constructs that allows such constructs to be explicitly represented within the main control flow, so that they can take advantage of the same code optimization and code generation techniques (i.e., the compiler back end) as the rest of the source code.
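The difference between keeping handlers outside the main control flow and representing them explicitly can be illustrated with a toy control-flow graph. This is a minimal sketch under stated assumptions: the `Block` class, its `exception_succs` field, and the `reachable` function are hypothetical names invented for this example and do not describe any existing compiler's intermediate representation.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A basic block in a toy control-flow graph (hypothetical IR)."""
    name: str
    normal_succs: list = field(default_factory=list)     # ordinary control-flow edges
    exception_succs: list = field(default_factory=list)  # explicit edges to handler blocks


def reachable(blocks, entry, follow_exception_edges):
    """Return the set of block names reachable from `entry` by depth-first search."""
    seen, stack = set(), [entry]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        block = blocks[name]
        stack.extend(block.normal_succs)
        if follow_exception_edges:
            stack.extend(block.exception_succs)
    return seen


# A guarded region whose body may transfer control to a handler.
blocks = {
    "entry": Block("entry", normal_succs=["try_body"]),
    "try_body": Block("try_body", normal_succs=["exit"], exception_succs=["handler"]),
    "handler": Block("handler", normal_succs=["exit"]),
    "exit": Block("exit"),
}
```

A generic analysis that ignores the exception edges (`follow_exception_edges=False`) never visits `handler`, which mirrors the situation in which handlers are kept in a side table and must be processed separately. With the edges treated as part of the main control flow, the same traversal reaches the handler without any special casing.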
Also, intermediate representations have traditionally been specific to a source language. Thus, compilers have to be aware of the specific exception handling model of the source language associated with each representation. For present purposes, these exception handling models can typically be characterized by four features. The first feature determines whether the exception is synchronous or asynchronous. A synchronous exception is associated with the action of the thread of control that throws and handles it. In this situation, an exception is always associated with an instruction of the thread. In other words, an exception handling action is invoked by an instruction when some condition fails. An asynchronous exception, however, is injected into a thread of control other than the thread that may have thrown and handled it. In Microsoft CLR (the Common Language Runtime (CLR) is Microsoft's commercial implementation of the Common Language Infrastructure (CLI) specification; Microsoft is a trademark of Microsoft Corporation), this may be caused by aborting a thread via a system API. Such exceptions are not associated with a particular instruction. The effect is to raise an exception in the thread at some suitable point called a synchronization point.
Second, an exception may either terminate or resume the exception-causing instruction. In the case of a terminating exception, the instruction is terminated and a filter, handler, or finalization action is initiated. In the case of a resumption model, however, the offending instruction can be automatically resumed after some handling action is performed. The Structured Exception Handling (SEH) constructs in C/C++ fall into this category. This typically requires that the entire region including the exception-causing instruction be guarded as if all memory accesses were volatile accesses, thus disallowing any optimization of those memory accesses.
Third, an exception handling model may be precise or imprecise. In precise exception handling models, the relative ordering of two instructions must preserve the observable behavior of the memory state. This means that instructions cannot be reordered if a handler or another fragment of code would then see different values of variables. Languages and runtimes such as C#, Microsoft CLR, and C++ require a precise mechanism. In such models, the compiler may need to preserve the order of exception instructions relative to each other and to any other instruction whose effect is visible globally. In imprecise models, the relative order of the effects of exception instructions is undefined, and a compiler is free to reorder such instructions. In either model, the order between exception instructions and their handlers is always defined and is based on control dependencies. Some languages, such as Ada, have an imprecise exception model.
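The reordering constraint can be made concrete with a small legality check. The sketch below reads the passage above as requiring that, in a precise model, an exception instruction keep its order relative to other exception instructions and to instructions with globally visible effects. The `Instr` class and the `may_reorder` function are hypothetical, and the check assumes the two instructions have no ordinary data dependence on each other, which a real compiler would also have to verify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """A toy instruction with only the two properties needed here (hypothetical)."""
    name: str
    may_throw: bool = False         # an "exception instruction"
    globally_visible: bool = False  # e.g., a store that a handler could observe


def may_reorder(a, b, precise):
    """Conservatively decide whether adjacent, data-independent a and b may be swapped."""
    if not precise:
        # Imprecise model: the relative order of exception effects is undefined.
        return True
    # Precise model: a throwing instruction stays ordered with respect to other
    # throwing instructions and to globally visible effects.
    if a.may_throw and (b.may_throw or b.globally_visible):
        return False
    if b.may_throw and (a.may_throw or a.globally_visible):
        return False
    return True


load = Instr("load *p", may_throw=True)
store = Instr("store g", globally_visible=True)
add = Instr("add t1, t2")
```

Under these assumptions, `may_reorder(load, store, precise=True)` is `False`, because a handler running after the load faults could otherwise observe the store too early, whereas `may_reorder(load, add, precise=True)` is `True`, since the addition has no effect that a handler could see.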
The fourth feature of an exception handling model is how handler association is performed. In most languages, including C++, C#, and Microsoft CLR, handler association is lexical and performed statically. This means that the start of the handler code can be identified statically and is unique. As explained below, this ability to statically identify handler bodies may be used to generate the intermediate representation of the exception handling instructions. Thus, there is a need for a single uniform framework for intermediately representing exception handling constructs that applies across multiple exception handling models and is capable of accounting for the various attributes of such models described above.
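The four features discussed above can be collected into a single record, which is one way a language-independent framework might describe the model it is compiling for. This is only a sketch: the `ExceptionModel` class and its field names are hypothetical, and each entry records only what the discussion above states, leaving every other feature as `None` (unknown) rather than guessing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExceptionModel:
    """The four characterizing features; None means not stated in the text above."""
    name: str
    allows_asynchronous: Optional[bool] = None      # feature 1: synchronous vs. asynchronous
    allows_resumption: Optional[bool] = None        # feature 2: terminate vs. resume
    precise: Optional[bool] = None                  # feature 3: precise vs. imprecise
    static_lexical_handlers: Optional[bool] = None  # feature 4: handler association


MODELS = [
    ExceptionModel("Microsoft CLR", allows_asynchronous=True, precise=True,
                   static_lexical_handlers=True),
    ExceptionModel("C#", precise=True, static_lexical_handlers=True),
    ExceptionModel("C++", precise=True, static_lexical_handlers=True),
    ExceptionModel("C/C++ SEH", allows_resumption=True),
    ExceptionModel("Ada", precise=False),
]


def needs_volatile_like_guarding(model):
    """Resumption models require guarded regions to treat memory accesses as volatile."""
    return model.allows_resumption is True


def precise_model_names(models):
    """Names of the models known to constrain instruction reordering."""
    return [m.name for m in models if m.precise is True]
```

A back end written against such a record could consult `needs_volatile_like_guarding` or the `precise` flag instead of hard-coding the rules of one source language, which is the kind of uniformity the stated need calls for.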