In the mid-1990s, Borland introduced Delphi, an application development environment for the Windows operating system (OS). More recently, Borland introduced Kylix, bringing the Delphi toolset and environment to the Linux OS. As with Delphi, Object Pascal is at the heart of Kylix. Consequently, bringing Delphi to Linux required porting Object Pascal and all of its language features. One of the most challenging features to port is exception handling.
In a language that supports exception handling, error conditions are raised (or “thrown”) as “exceptions,” which are propagated dynamically, as opposed to being passively returned as error results from functions. The method of propagation varies from system to system. Exceptions are an integral part of the implementation of Delphi and play an important role as a means of error recovery.
Object Pascal features a nonresumptive exception-handling model designed to be efficient enough to serve as the basis of error handling for a variety of applications. The syntax of exception handling is straightforward, and the semantics are kept under control because the language is comparatively well designed and because exceptions are restricted in two ways: 1) the object must be thrown by reference; i.e., the object has to be created on the heap, and a reference (pointer) to that object is what is thrown; and 2) the object must be derived from the Object Pascal built-in type TObject.
In Object Pascal, there are two ways of raising a language exception and three ways of catching exceptions. As illustrated in FIG. 1 (where the value of <expression> must be a class instance derived from TObject), the first method of raising an exception is straightforward. The second method of raising an exception, however, is somewhat more complex and is described below.
The three forms used to catch exceptions have different semantics and usage. The first and simplest is try/except, as illustrated in FIG. 2(a). The try part of a try/except statement specifies the start of a block of one or more statements to be protected. The except part specifies the start of a block of one or more statements to be executed in the event an exception occurs while executing the statements in the try part. Within the except part, users can elect to use the second method of raising an exception, which is simply to use raise without any expression following it. This causes the current exception to be re-raised from the current execution point, thereby allowing final error handling and disposition of the exception to be determined elsewhere.
A second form for catching exceptions uses the more specialized form of try/except, try/except/on, illustrated in FIG. 2(b). The statements in the except block execute only if the exception being raised is of the type TMyException or of a type derived therefrom. With this form, the programmer can define types of exception objects, and the type of the object describes the kind of exception. Consider a type named TbadDateValue, and suppose that some piece of code processing a date value finds that the value is not valid in a particular context. The code would then raise an exception of the type TbadDateValue. An except block that handles only that type of exception would encode the name of the type and would be executed only if an exception of that type passed its way. If the except block is executed, the variable E is bound to the exception object for the duration of the except block. As mentioned, in Object Pascal, when an exception is raised, an object is created on the heap, and that object is what is actually raised. In order to use that object, the programmer needs a name by which to reference it within the program, so the syntax allows for declaring a variable (e.g. “E”) to be bound to the exception object for any particular exception clause in the program.
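The type-directed selection of a handler can be approximated in C++, which is used here only as a stand-in because it also permits throwing a pointer to a heap-allocated object. The following is a minimal sketch, not the Object Pascal implementation: the C++ types TObject, TMyException, and TOtherError and the functions parse_date, handle, and handle_other are hypothetical names introduced for illustration, and the explicit ownership transfer is an assumption of the sketch, since Object Pascal disposes of the exception object itself.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for TObject and for programmer-defined exception types.
struct TObject { virtual ~TObject() = default; };
struct TMyException : TObject {
    std::string message;
    explicit TMyException(std::string m) : message(std::move(m)) {}
};
struct TbadDateValue : TMyException {
    using TMyException::TMyException;  // inherit the message constructor
};
struct TOtherError : TObject {};

// Raises by reference: the object lives on the heap and a pointer is thrown.
void parse_date(int month) {
    if (month < 1 || month > 12)
        throw new TbadDateValue("month out of range");
}

// Analog of "except on E: TMyException do ...": the handler runs for
// TMyException and for any type derived from it.
std::string handle(int month) {
    try {
        parse_date(month);
        return "ok";
    } catch (TMyException* e) {
        assert(e != nullptr);
        std::unique_ptr<TMyException> owner(e);  // sketch-only cleanup of the heap object
        return "handled: " + owner->message;
    }
}

// An exception of an unrelated type passes by the inner handler untouched
// and is caught by an enclosing handler for the common base type.
std::string handle_other() {
    try {
        try {
            throw new TOtherError();
        } catch (TMyException* e) {
            delete e;
            return "inner";
        }
    } catch (TObject* e) {
        delete e;
        return "outer";
    }
}
```

Calling handle(13) selects the TMyException handler because TbadDateValue derives from it, whereas handle_other() skips the inner handler and is resolved by the outer one.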
The third form for handling exceptions is the more specialized try/finally, illustrated in FIG. 2(c). try/finally differs from the other forms of exception-handling statements in that the statements in the finally clause are always executed, even if no exception is thrown while executing the try block. This is useful for writing cleanup code that must always be executed for a function, whether to free resources or to restore data structures to a consistent state after a partially completed operation.
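The "always executed" property of the finally clause can be illustrated with a short C++ sketch. C++ has no finally keyword, so the sketch substitutes a scope guard whose destructor runs the cleanup action; this is a swapped-in technique with similar observable ordering, not the mechanism Object Pascal uses. The names Finally and run are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Scope guard: runs the stored action when the enclosing block is left,
// whether by normal completion or because an exception is propagating.
class Finally {
public:
    explicit Finally(std::function<void()> action) : action_(std::move(action)) {
        assert(action_ && "cleanup action must be callable");
    }
    ~Finally() { action_(); }
    Finally(const Finally&) = delete;
    Finally& operator=(const Finally&) = delete;
private:
    std::function<void()> action_;
};

// Records the order of events for a protected block that may or may not raise.
std::vector<std::string> run(bool fail) {
    std::vector<std::string> log;
    try {
        Finally cleanup([&log] { log.push_back("finally"); });
        log.push_back("try");
        if (fail) throw std::runtime_error("raised inside try");
        log.push_back("end of try");
    } catch (const std::runtime_error&) {
        // Reached only after the cleanup action has already run.
        log.push_back("handled elsewhere");
    }
    return log;
}
```

In both run(false) and run(true) the "finally" entry is recorded; in the failing case it is recorded before the enclosing handler runs, which mirrors the cleanup-then-propagate ordering described above.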
The Delphi implementation of exception handling is relatively simple because Windows has a built-in mechanism for dispatching and unwinding exceptions. Delphi implements exception support via a stack-based chain of exception handlers, threaded through a global, per-thread list that is accessible by both the operating system and the client application.
A stack frame is an area of stack memory allocated for a function invocation. The stack frame of a function contains the activation record for the function. When a function is called, the caller typically pushes some arguments onto the stack. The function being called then commonly allocates additional space on the stack to use as scratch space during its operation. When the function returns, that space, together with the space used for the pushed parameters, is returned to the stack. In an Intel x86 environment, a function typically establishes a stack frame using the frame pointer register (EBP) of the CPU. This operation, however, adds expense and occupies the EBP register, which could otherwise be used as a scratch register by the optimizer.
When unwinding an exception, the goal is to properly free the allocations made by all the functions in the call sequence that is currently live. For instance, given functions A, B and C, if A calls B, and B calls C, and C raises an exception to be caught by A, then the stack frames for both C and B have to be cleaned up; i.e., their space must be de-allocated and the stack pointer returned to where it was when A called B.
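The A, B, C scenario can be observed directly in a small C++ sketch. The Tracker and Frame types are hypothetical bookkeeping devices that stand in for real stack frames; they merely record when each function's frame becomes live and when it is released, under the assumption that a local object's lifetime tracks the lifetime of its enclosing frame.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bookkeeping: counts live frames and records events in order.
struct Tracker {
    int live_frames = 0;
    std::vector<std::string> events;
};

// One Frame object per function invocation; its destructor runs when the
// invocation's stack frame is torn down, including during unwinding.
class Frame {
public:
    Frame(Tracker& t, std::string name) : t_(t), name_(std::move(name)) {
        ++t_.live_frames;
        t_.events.push_back("enter " + name_);
    }
    ~Frame() {
        assert(t_.live_frames > 0);
        --t_.live_frames;
        t_.events.push_back("release " + name_);
    }
    Frame(const Frame&) = delete;
    Frame& operator=(const Frame&) = delete;
private:
    Tracker& t_;
    std::string name_;
};

void C(Tracker& t) { Frame f(t, "C"); throw std::runtime_error("raised in C"); }
void B(Tracker& t) { Frame f(t, "B"); C(t); }

// A catches the exception and reports how many frames are live in its handler.
int A(Tracker& t) {
    Frame f(t, "A");
    try {
        B(t);
    } catch (const std::runtime_error&) {
        return t.live_frames;  // only A's own frame should remain
    }
    return -1;
}
```

When A's handler runs, the frames for C and B have already been released, in that order, and only A's frame remains live.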
There are several advantages to the Windows scheme. First, all functions, including the operating system, use the same method for dispatching exceptions when hardware exceptions such as access violations occur. This enables nearly seamless behavior in the implementation of support for hardware exceptions, as opposed to language exceptions. Moreover, the global list of handlers, threaded through the FS segment, makes it efficient to dispatch exceptions. Stack unwinding is also easy because the records on the stack contain the information needed to reset the stack and frame pointers to the proper offsets for a given function in the event of exceptions. There is no need to inspect activation records of functions that do not have exception-handling constructs. Finally, the run-time installation of handlers laid out on the stack makes it easy for any language to participate in exception handling, including assembler code.
A significant disadvantage of the Windows scheme is its impact on the performance of the non-exception case. Any function that contains exception-handling constructs must install a frame on the stack and thread the entry onto the per-thread list of handlers, even if no exception is raised during the execution of the function. By their nature, exceptions should be uncommon and not the normal case during the average execution of a function body. Therefore, the installation of the exception frame is a burden on the normal execution path of most functions. This impact is most significant in the case of frequently called, small functions.
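The cost described above can be made concrete with a simplified simulation of a stack-based handler chain. This is a sketch under stated assumptions and not the Windows data layout: HandlerRecord, g_chain_head, g_installs, ScopedHandler, and the functions below are hypothetical names, and a thread-local pointer stands in for the per-thread list head.

```cpp
#include <cassert>
#include <string>

// Hypothetical registration record that lives in the protected function's frame.
struct HandlerRecord {
    HandlerRecord* prev;  // next-outer handler in the chain
    const char* name;
};

thread_local HandlerRecord* g_chain_head = nullptr;  // stand-in for the per-thread list head
thread_local int g_installs = 0;                     // counts run-time installations

// Installing and removing a record is run-time work done on every entry to a
// protected function, whether or not an exception is ever raised.
class ScopedHandler {
public:
    explicit ScopedHandler(const char* name) : rec_{g_chain_head, name} {
        g_chain_head = &rec_;
        ++g_installs;
    }
    ~ScopedHandler() {
        assert(g_chain_head == &rec_);  // records are removed in LIFO order
        g_chain_head = rec_.prev;
    }
    ScopedHandler(const ScopedHandler&) = delete;
    ScopedHandler& operator=(const ScopedHandler&) = delete;
private:
    HandlerRecord rec_;
};

// Walks the chain innermost-first, in the order a dispatcher would visit it.
std::string chain_snapshot() {
    std::string out;
    for (HandlerRecord* r = g_chain_head; r != nullptr; r = r->prev) {
        if (!out.empty()) out += " -> ";
        out += r->name;
    }
    return out;
}

std::string inner() { ScopedHandler h("inner"); return chain_snapshot(); }
std::string outer() { ScopedHandler h("outer"); return inner(); }

// A small, frequently called function that never raises still pays for the install.
int small_function(int x) {
    ScopedHandler h("small");
    return x + 1;
}

// Returns how many handler installations a given number of calls caused.
int installs_for_calls(int calls) {
    const int before = g_installs;
    int sum = 0;
    for (int i = 0; i < calls; ++i) sum = small_function(sum);
    assert(sum == calls);
    return g_installs - before;
}
```

In this model, outer() observes the chain "inner -> outer", and installs_for_calls(1000) reports one installation per call even though no exception is raised, which is the overhead that weighs most heavily on small functions.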
An additional disadvantage is the impact that the installation of the exception frame has on the optimizer. The need to install this handler hampers the compiler's optimization of many functions, most notably small functions.
The Linux method for dealing with exceptions is radically different from that for Windows. For instance, there is no OS-defined mechanism for dispatching and unwinding exceptions. Language vendors are on their own with respect to how to deal with exceptions. On the Linux platform, most vendors use a program counter (PC)-mapped exception handling scheme. In a PC-mapped scheme, the compiler and linker work together to emit a static map of address ranges for functions, for blocks of code that are wrapped with exception handlers, and for their handlers. In addition, the compiler generates static data describing each function's stack layout to allow a run-time system to unwind the stack past an invocation of a given function. An external run-time library is provided by the language implementation to handle the dispatching and unwinding of exceptions.
The run-time library unwinds an exception by capturing the address from which the exception is raised, then looking up the address in the PC map to identify the function from which the exception came. The run-time library thereby gains access to the per-function descriptors of the stack layout, and to the per-function ranges of potential handlers in the function for the exception. The stack layout descriptors are used to unwind the stack to the next caller, if need be, while the ranges of potential handlers are used to evaluate whether the function has any protected code whose exception handlers need to be called in the unwind process.
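The lookup-and-unwind loop just described can be sketched as follows. This is a simplified model under stated assumptions, not the format used by any actual run-time library: the addresses, the FunctionEntry and ProtectedRange layouts, and the functions pc_map, find_function, and dispatch are all hypothetical, the stack layout descriptor is reduced to a single frame size, and the call stack is supplied as a list of addresses rather than read from memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical static data a compiler and linker might emit.
struct ProtectedRange {
    std::uint32_t begin, end;  // [begin, end) of the protected code
    std::string handler;       // label of the handler for that range
};
struct FunctionEntry {
    std::uint32_t begin, end;  // [begin, end) of the function's code
    std::string name;
    std::uint32_t frame_size;  // stack layout descriptor, reduced to one number
    std::vector<ProtectedRange> protected_ranges;
};

// The PC map: purely static, sorted by start address, untouched in the non-exception case.
const std::vector<FunctionEntry>& pc_map() {
    static const std::vector<FunctionEntry> map = {
        {0x1000, 0x1040, "A", 32, {{0x1010, 0x1030, "A.except"}}},
        {0x1040, 0x1080, "B", 16, {}},
        {0x1080, 0x10C0, "C", 48, {}},
    };
    return map;
}

// Binary search for the function whose address range contains pc.
const FunctionEntry* find_function(std::uint32_t pc) {
    const auto& map = pc_map();
    auto it = std::upper_bound(
        map.begin(), map.end(), pc,
        [](std::uint32_t addr, const FunctionEntry& e) { return addr < e.begin; });
    if (it == map.begin()) return nullptr;  // pc lies below every mapped function
    --it;
    assert(it->begin <= pc);
    return pc < it->end ? &*it : nullptr;
}

struct UnwindResult {
    std::string handler;          // empty if no handler was found
    std::uint32_t bytes_unwound;  // total size of the frames popped on the way
};

// pcs holds the raise address first, followed by the address in each caller.
UnwindResult dispatch(const std::vector<std::uint32_t>& pcs) {
    std::uint32_t unwound = 0;
    for (std::uint32_t pc : pcs) {
        const FunctionEntry* fn = find_function(pc);
        if (fn == nullptr) break;  // no static data: cannot unwind past this point
        for (const ProtectedRange& r : fn->protected_ranges)
            if (pc >= r.begin && pc < r.end) return {r.handler, unwound};
        unwound += fn->frame_size;  // no handler here: pop this frame, go to the caller
    }
    return {"", unwound};
}
```

For an exception raised in C at 0x1090, called from B at 0x1050, called from the protected range of A at 0x1020, dispatch pops the frames of C and B and selects the handler in A. Note that every frame on the path, including those of B and C, which contain no handlers, must be looked up in the map before it can be unwound.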
The PC-mapped scheme described above has several significant advantages over the Windows scheme. First, there is no impact on the non-exception case because all the data that is emitted for dealing with exceptions is purely static data. Additionally, the absence of run-time stack frames for exceptions tends to make the optimizer's job easier for dealing with exception handling.
The disadvantages of the PC-mapped implementation are numerous, though they are mostly restricted to the language implementations. In some cases, they can be mitigated through standardization.
First, a PC-mapped scheme requires a more sophisticated, and thus more complex, compiler. The compiler front- and back-ends require significant interaction to be able to emit the data required by the run-time library for unwinding stack frames. The linker is also implicated. Care must be taken to ensure that the static data generated by the compiler is position-independent, or prohibitive load-time costs can be incurred due to the relocation of exception-only data at application startup. Also, stack unwinding is much slower, because the static information for functions must be found and interpreted in order to unwind function frames, even for functions that do not contain exception-handling constructs.
Exception handling based on PC maps thus has the drawback that it adds to the runtime image of an application. For purposes of efficiency, it is necessary that such maps be small. Known implementations of PC-mapped exception handling currently available for Intel x86 architectures rely on more portable information for describing how to unwind function frames when propagating an exception up the call stack. This more portable information specifies the locations for registers that have been stored on the stack that have to be restored when a frame is unwound. It also specifies how to unwind the stack in esoteric cases of function frame layout. One choice for the format of this unwind information is a subset of the debug information format known as DWARF. A part of the DWARF specification provides a processor-neutral way of describing the location of various processor registers that have been saved on the stack during a function invocation. While portable, this information is bulky and can thus cause the size of applications to increase greatly. The GNU C++ compiler and the products based on it are typical of the state of the art.