The present invention is directed to object oriented computer programming languages and, in particular, to a compiler which implements virtual inheritance in object oriented programs.
Object oriented computer programming (OOP) techniques for facilitating the development of complex computer programs are well-known and widely used. As understood by those skilled in the art, these techniques involve the definition, creation, use and destruction of “objects.” These objects are software entities including both data elements and functions which manipulate the data elements. The data and related functions are treated by the software as an entity that can be created, used and deleted as if it were a single item. Together, the data and functions enable objects to model any real world entity in terms of its characteristics, which can be represented by the data elements, and its behavior, which can be represented by its data manipulation functions. In this way, objects can model concrete things, as well as abstract concepts, such as numbers or geometrical designs.
In an OOP language, objects are defined by creating “classes,” which are not objects themselves, but act as templates that instruct the compiler how to construct actual objects which are “instances” of the classes. For example, a class may specify the number and type of data variables and the steps involved in the functions which manipulate the data. A corresponding object is actually created by a special function called a “constructor”. The constructor uses the corresponding class definition and additional information, such as arguments specified during object creation, to create an object. Similarly, objects are destroyed by a special function called a “destructor” when the objects are no longer of use.
The principal benefits of OOP techniques arise out of three basic characteristics: encapsulation; polymorphism; and inheritance. Data encapsulation refers to the binding of data and related functions. More specifically, an object can be designed to “hide” (or “encapsulate”) all or a portion of its internal data structure and corresponding internal functions. For instance, during program design, a program developer can define objects in which all or some of the data variables and all or some of the related functions are considered “private” or for use by only the object itself. Other data or functions can be declared “public” or available for use externally of the object. External access to private functions or data can be controlled by defining public functions for an object which can be invoked externally of the object. The public functions form a controlled and consistent interface between the private data and the outside world. Any attempt to write program code which directly accesses the private functions or data causes the compiler to generate an error message during compilation and stop the compilation process.
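By way of a minimal illustrative sketch, encapsulation as described above might look as follows in C++. The class name Counter and its members are hypothetical and are used here only for illustration.

```cpp
// Hypothetical class used only to illustrate private data behind a public interface.
class Counter {
public:
    // Public functions: the controlled interface available outside the object.
    void increment() { ++count_; }
    void reset() { count_ = 0; }
    int value() const { return count_; }

private:
    // Private data: usable only by the member functions of Counter itself.
    int count_ = 0;
};

// Uncommenting the function below makes compilation fail with an access error,
// because count_ is private to Counter:
// void tamper(Counter& c) { c.count_ = 42; }
```

Code outside the class can change the count only through increment() and reset(), which is the kind of controlled access the paragraph above describes.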
Polymorphism is a characteristic which allows multiple functions that have the same overall purpose, but that work with different data, to produce consistent results. Inheritance allows program developers to easily reuse preexisting functions and to reduce the need for creating redundant functions from scratch. The principles of inheritance allow a software developer to declare classes (and the objects which are later created from them) as related. Specifically, classes may be designated as derived classes of other base classes. A derived class inherits and has access to functions of its base classes just as if these functions appeared in the derived class. Alternatively, a derived class can override or modify an inherited function merely by defining a new function with the same name. Overriding or modifying does not alter the function in the base class, but merely modifies the use of the function in the derived class. The creation of a new derived class which has some of the functionality (with selective modification) of another class allows software developers to easily customize existing code to meet their particular needs.
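The following sketch, offered only as an illustration, shows a derived class reusing one inherited function unchanged and overriding another. The names Shape, Circle and describe_via_base are assumptions introduced for this example.

```cpp
#include <string>

// Hypothetical base class.
class Shape {
public:
    virtual ~Shape() = default;
    // Inherited by derived classes as if it were declared in them.
    std::string name() const { return "shape"; }
    // May be overridden by a derived class.
    virtual std::string describe() const { return "generic shape"; }
};

// Hypothetical derived class: inherits name(), overrides describe().
class Circle : public Shape {
public:
    std::string describe() const override { return "circle"; }
};

// The same call produces class-specific behavior depending on the object passed in.
std::string describe_via_base(const Shape& s) { return s.describe(); }
```

Overriding describe() in Circle leaves the function in Shape untouched, so a plain Shape object still reports "generic shape".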
One widely used and well known OOP language is C++. The C++ language is classified as a hybrid OOP language, as opposed to a pure or orthodox OOP language. Because the C++ language was designed as an improvement to and as an extension of C, it is full of the traditional features of ANSI C. C++ source code is usually compiled before being executed. Therefore, the C++ programming process entails a development cycle of editing, compiling, linking, and running. Although the iteration through the cycle is a slow process, the produced code is very fast and efficient. The C++ language provides an excellent balance between power of expression, run time speed, and memory requirements. C++ compilers are commercially available from several vendors.
Inheritance may provide the most power to the class concept in OOP. Inheritance allows classes to be continually built and extended with essentially no limit. C++ is different from some OOP languages because it allows multiple inheritance.
To illustrate the concept of virtual inheritance, reference will be made to the class inheritance trees in FIGS. 1A and 1B. In FIG. 1A, class D directly descends from both base classes B and C and indirectly descends from class A. In this example, class D might appear to a compiler to have two distinct A classes appearing as base classes. Having multiple copies of the same base class in an inheritance tree in the compiled program is confusing and wastes storage space. To solve this problem, a base class may be declared to be virtual so that the compiler is directed to share a single copy of a given base class object in the derived class objects. A class inheritance tree using class A as a virtual base class is illustrated in FIG. 1B. Virtual inheritance, i.e. inheritance from a virtual base class, is a primary strength for improving space and run time efficiency of the C++ object model. FIG. 1C shows the resulting complete class D 10 corresponding to the inheritance tree of FIG. 1A, in which the base class A is not virtual. FIG. 1D shows the resulting complete class D 15 corresponding to the inheritance tree of FIG. 1B, in which the base class A is virtual. As shown in FIG. 1D, virtual base classes are only shared within a complete object, in this case the complete object D 15. Also shown in FIG. 1D is the virtual function table 16 for object D 15, indicating a virtual function 17, for example, contained within the virtual base class A.
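The difference between the two inheritance trees can be sketched in C++ as follows. The class names B1, C1, D1, B2, C2 and D2 and the helper functions are hypothetical; they merely mirror the shape of FIGS. 1A and 1B and are not taken from the figures.

```cpp
struct A { int value = 0; };

// Non-virtual inheritance (the shape of FIG. 1A): D1 contains two A subobjects.
struct B1 : A {};
struct C1 : A {};
struct D1 : B1, C1 {};

// Virtual inheritance (the shape of FIG. 1B): D2 contains one shared A subobject.
struct B2 : virtual A {};
struct C2 : virtual A {};
struct D2 : B2, C2 {};

// Reaches A through each direct base and compares the resulting addresses.
bool non_virtual_has_two_copies() {
    D1 d;
    A* via_b = static_cast<B1*>(&d);
    A* via_c = static_cast<C1*>(&d);
    return via_b != via_c;  // two distinct A subobjects
}

bool virtual_has_one_copy() {
    D2 d;
    A* via_b = static_cast<B2*>(&d);
    A* via_c = static_cast<C2*>(&d);
    return via_b == via_c;  // both paths reach the same shared A
}
```

In the non-virtual case an unqualified access such as d.value is ambiguous and is rejected by the compiler, whereas in the virtual case it refers to the single shared A.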
To use virtual inheritance in a C++ program, the programmer must specify one or more of a class's functions to be virtual. Typically, the complete set of virtual functions available is fixed at compile time and a programmer therefore cannot add or replace any function of the complete set at run time. Accordingly, fast dispatch of virtual function invocations is realized at the cost of run time flexibility. Virtual function calls are generally resolved by indexing into a table (conventionally known as a virtual function table) constructed by the compiler, which holds the addresses of the virtual functions associated with the base class. A fundamental problem of virtual inheritance is to dispatch the virtual functions at run time, within the constraints of the C++ object model conventions, with the correct object pointer for the object that is being processed.
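Table-based dispatch of the kind just described can be modeled by hand, as in the sketch below. This is a simplified model and not the layout produced by any particular compiler; the names Object, Method, call_slot and the table contents are assumptions made for illustration.

```cpp
// Hand-written model of a virtual function table; real compilers generate
// the equivalent structures automatically.
struct Object;
using Method = int (*)(Object* self);

struct Object {
    const Method* vtable;  // plays the role of the virtual function table pointer
    int data;
};

int base_f(Object* self) { return self->data; }
int base_g(Object* self) { return self->data + 1; }
int derived_f(Object* self) { return self->data * 10; }  // replaces slot 0

// Tables are fixed once built: slot 0 is "f", slot 1 is "g".
const Method base_vtable[] = { base_f, base_g };
const Method derived_vtable[] = { derived_f, base_g };

// A "virtual call": index into the table, then call through the stored address,
// passing the object pointer as the implicit argument.
int call_slot(Object* obj, int slot) { return obj->vtable[slot](obj); }
```

Because the call is a single indexed load followed by an indirect call, dispatch is fast, but the set of slots cannot grow once the tables have been built.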
More specifically, this problem relates to properly obtaining a pointer which points to a derived class when given a pointer to a virtual base class. In the C++ language such a pointer is referred to as a “this” pointer. The “this” pointer must point to a location in the base class object that contains the function. As a result, adjusting functions are used to obtain a new “this” pointer pointing to the derived class from a “this” pointer pointing to a virtual base class. However, it is difficult to correctly obtain the new “this” pointer because the virtual function may be shared by many interrelated classes having different class structures derived from the virtual base class.
A simple illustration of these terms is provided in FIG. 2A. Class A is the virtual base class for derived classes B, C, D, and E. Thus, class D is a derived class of class B such that class A is a virtual base class to class B and class B is a virtual base class to class D. Also, class E is a derived class of class B, and also of class C. Because class A has been declared a virtual base class by the programmer, virtual function table pointers 210 are formed in class A which point to a virtual function table 220 associated therewith. The virtual function table 220 contains addresses corresponding to the functions 230 and 240 associated with class A. When the memory structure for the data structure of the virtual base class A is determined at compile time, memory space is set aside in class A for the virtual function table pointers 210, which will be initialized to point to the virtual function table 220, which in turn addresses the functions 230 and 240. The virtual function table 220 is used at run time to invoke the functions 230 and 240 associated with class A. As a result of virtual inheritance, the functions 230 and 240 may be shared by many different classes (in the present example these functions are shared by classes B, C, D, and E). However, in general, the virtual function table 220 and adjusting functions 250 and 260 may be different for each object of classes A, B, C, D or E.
At run time, when an object of class A has one of its functions called, and when that function is overridden within a derived class in the object, then a “this” pointer which is passed to that overriding function must be obtained from information available via the “this” pointer of the base class. For example, in FIG. 2A, if the function 230 is overridden in class B, then a call starting in class A must find a “this” pointer for class B (i.e., a “this” pointer 270) from information available via the “this” pointer 200.
In FIG. 2A, the adjusting functions 250 and 260 are shown which provide the adjustment of the “this” pointer 200. These adjusting functions 250 and 260 are small “assembly stubs” that obtain the correct “this” pointer 270 for the call to the function 230 or 240 by offsetting from the available “this” pointer 200, based on the actual layout of the complete object in memory. The adjusting functions 250 and 260 allow for the entries in the virtual function table 220 to remain simple pointers. The address contained within each entry of the virtual function table 220 directly addresses a function 230 or 240 when no adjustment is necessary; but the address addresses an associated adjusting function 250 or 260 when an adjustment of the “this” pointer 200 is necessary.
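The role of an adjusting function can be approximated in portable C++ as shown below. This is only a sketch under stated assumptions: actual adjusting functions are compiler-generated stubs, and the types BasePart and Complete, the table complete_vtable, and the fixed member layout are hypothetical stand-ins for a real object layout.

```cpp
#include <cstddef>

struct BasePart;
using Slot = int (*)(BasePart* self);

// Stands in for the virtual base class subobject, which holds the table pointer.
struct BasePart {
    const Slot* vtable;
    int a;
};

// Stands in for one complete object layout: derived data first, base part after it.
struct Complete {
    int b;
    BasePart base;  // sits at a nonzero offset within Complete
};

// The overriding function expects a pointer to the complete (derived) object.
int derived_override(Complete* self) { return self->b; }

// Adjusting function: offsets the incoming base pointer back to the start of
// the enclosing object for this specific layout, then forwards the call.
int adjusting_stub(BasePart* base_this) {
    unsigned char* raw =
        reinterpret_cast<unsigned char*>(base_this) - offsetof(Complete, base);
    return derived_override(reinterpret_cast<Complete*>(raw));
}

// A function of the base part itself needs no adjustment.
int base_function(BasePart* self) { return self->a; }

// Each entry stays a simple pointer: slot 0 goes through the stub, slot 1 is direct.
const Slot complete_vtable[] = { adjusting_stub, base_function };

int call_slot(BasePart* p, int slot) { return p->vtable[slot](p); }
```

The offset is baked into adjusting_stub for exactly one layout, which is why a different complete object with a different offset would need a different stub.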
Such an implementation of adjusting functions solves the aforementioned offset problem if the adjusting functions are constructed at compile time but creates a compatibility problem. In FIG. 2A for example, there are two different offsets between base class objects B and A that are dependent upon the configuration of the complete class D or E in memory. Different offsets may be necessary for the same class, as shown for class B in this example. Therefore, the correct adjusting functions for intermediate classes (classes having at least one virtual base class and being derived by at least one other class, such as class B in the present example) cannot be uniquely determined for cases where entry into a function is effected at the intermediate class.
FIG. 2B shows example memory layouts for instantiations of the complete objects E and D as shown in FIG. 2A. A first memory layout 400 for object E1 is shown having an offset 408 between class A and class B. A second memory layout 402 for object E2 is shown having an offset 410 between class A and class B, and a third memory layout 404 for object E3 is shown having an offset 412 between class A and class B. A memory layout 406 for object D is shown having an offset 414 between class A and class B. Thus, FIG. 2B shows several possible offsets between class A and class B.
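The dependence of the offset on the complete object can be observed directly in C++, as in the sketch below. The class definitions follow the hierarchy of FIG. 2A but their members are hypothetical, and the helper offset_b_to_a is an assumption introduced here. The numeric offsets are implementation-defined, so the sketch compares them rather than stating particular values.

```cpp
#include <cstddef>

// Hypothetical classes following the hierarchy of FIG. 2A.
struct A { virtual ~A() = default; int a = 0; };
struct B : virtual A { int b = 0; };
struct C : virtual A { int c = 0; };
struct D : B { int d = 0; };
struct E : B, C { int e = 0; };

// Measures the distance, in bytes, from a B subobject to the A subobject it
// shares within whatever complete object the B belongs to.
std::ptrdiff_t offset_b_to_a(B& b) {
    A* pa = &b;  // conversion to a virtual base consults the run-time layout
    return reinterpret_cast<char*>(pa) - reinterpret_cast<char*>(&b);
}
```

On commonly used compilers, calling offset_b_to_a on a stand-alone B object and on the B part of an E object yields different results, because the C part and the data of E are placed between B and the shared A.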
In one presently used solution to this problem, the adjusting functions for all of the possible class instantiations are built at compile time. As the classes are analyzed at compile time, all of the possible class offsets are determined and then stored in a table. This table is accessed at run time to obtain the offset information during construction of objects. However, an extra parameter (the extra parameter being a table which points to another table having the locations of the adjusting functions) must be included in the object model used by this solution. As a result, this solution is incompatible with existing object models because this extra parameter will not be recognized by the compilers which follow the design conventions as suggested in the C++ annotated reference manual. Also, because the adjusting functions are stored in object files on the system's disk, they must be brought off the disk and into memory, which is a relatively slow operation. Accordingly, this solution has the additional drawback of slowing down the speed of the compiled program at run time.
In another presently used solution, the adjusting functions are built at compile time by assuming that the class to be constructed is not a base class. Because it is not known how intermediate classes will be constructed at run time, an additional offset is provided for objects created from intermediate classes. Unfortunately, this offset does not work for all circumstances (e.g. cases having multiple interrelated base classes).
This problem of correctly calling virtual functions is recognized throughout the industry as an important problem to solve in facilitating the use of virtual inheritance in OOP languages as is readily seen by the large number of proposed attempts, which to date fail to completely solve this problem. In fact, a solution which provides the proper conversion for all class configurations including cases having multiple virtual base classes and virtual functions has yet to be implemented within the conventional object models. Therefore, a solution to this problem is desired that will always obtain a correct “this” pointer and is compatible with the existing object models.
It is therefore an object of the present invention to generate the correct “this” pointer to a derived object class when a virtual function is invoked on a base class object.
It is another object of the present invention to generate a “this” pointer in a manner which is compatible with the existing object model used by a compiler.
In accordance with the invention, the virtual function tables and adjusting functions are generated for some base classes at run time, when the offsets from the base classes to their derived classes are known. In particular, an object data structure is provided by a language translator, such as a compiler, which determines the memory structure at compile time for a plurality of object classes including at least one base class and at least one class derived therefrom. At compile time, space for pointers (b-pointers) is set aside in each base class object that will have a base table (b-table) associated therewith. The b-pointers point at run time to their associated b-table, which must contain memory offsets between the base class objects within the derived class object. At runtime, constructors construct the class objects, starting from the most derived class objects and proceeding through to the inner base class object.
However, instead of generating the virtual function tables and associated pointers, as well as the adjusting functions, at compile time, the compiler generates the code that will do the generation at run time. Then at run time, a virtual function table is generated for the base class. Since the correct offsets are known from the contents of the tables at this time, all of the adjusting functions, the virtual function b-tables, and the virtual pointers may be generated correctly. Thus, the system completes the construction of an object.
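The idea of deferring generation until the offsets are known can be modeled loosely as follows. This sketch is not the data layout or code generation scheme of the invention: the structure RuntimeTables, the function build_tables, and the use of a closure to stand in for a generated adjusting function are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical model of the tables associated with one constructed object.
struct RuntimeTables {
    // Stands in for a b-table: offsets between base class objects.
    std::vector<std::ptrdiff_t> b_table;
    // Stand in for adjusting functions that are created at run time.
    std::vector<std::function<char*(char*)>> adjusters;
};

// Called while an object is being constructed, once the actual offset from the
// base part to the derived part is known for that object's layout.
RuntimeTables build_tables(std::ptrdiff_t base_to_derived_offset) {
    RuntimeTables t;
    t.b_table.push_back(base_to_derived_offset);
    // The adjuster captures the offset, so it is correct for this layout only.
    t.adjusters.push_back([base_to_derived_offset](char* base_this) {
        return base_this + base_to_derived_offset;
    });
    return t;
}
```

Two objects whose layouts differ simply receive tables built from different offsets, so no single compile-time choice of adjusting function has to serve every layout.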
The adjusting functions, virtual function table, and virtual pointers for the most derived class may, of course, be generated at compile time as before. However, if they are generated at run time, the compiler is able to operate at a faster speed than in the conventional techniques which build these functions, tables, and pointers at compile time. In other words, the time necessary to generate these functions, tables and pointers at run time is less than the time necessary to retrieve them from the system's disk if generated at compile time.
In addition, the generating process for the virtual function tables and the adjusting functions may share identical virtual function tables and adjusting functions to further reduce run time overhead.