When a programmer develops a computer program, the source code of the program typically accesses many functions and variables within the program. These accesses are expressed in the source code as mere references to the names of the functions or variables. However, some time before the functions and variables can be accessed, the name must be bound to the memory location where either the entry point of the function or the data for the variable resides. This binding can be performed in two ways: statically or dynamically. The phrase "static binding" refers to binding the name to the memory location during compilation or linking. In contrast, the phrase "dynamic binding" refers to binding the name to the memory location at runtime.
The technique of dynamic binding has become quite popular in object-oriented programming languages. When developing a program in an object-oriented language, a programmer typically creates a number of objects whose interactions perform the functionality of the program. An "object" contains both data and behavior. The data is stored as data members of the object, and the behavior is performed by function members or methods of the object. Dynamic binding has become popular in object-oriented languages because dynamic binding provides great flexibility to the programmer. For example, dynamic binding facilitates polymorphism, where a function name may denote several different function members depending on the runtime context in which it is used. It is common in object-oriented programming languages for one function name to be used by more than one function. In fact, this feature is basic to most object-oriented programming languages.
One function name may refer to more than one function because each object in a program is based upon a class definition (i.e., an object is an instance of a class). Classes are typically linked together to form a class hierarchy, where one class, a derived class, may inherit the data members and function members of another class, a base class. In such situations, the derived class may either choose to use the implementation of the function members provided by the base class, or it may choose to override the function members. When overriding a function member, the derived class defines its own implementation for that function member using the same function name. After the derived class overrides a function member of a base class, when objects of type "derived class" call the function member, the implementation of the function member provided by the derived class is invoked. Conversely, when objects of type "base class" call the function member, the implementation of the function member provided by the base class is invoked.
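The overriding behavior described above can be illustrated with a minimal sketch. The class names `Base` and `Derived` and the function members `foo` and `bar` are hypothetical and chosen only for illustration; Python is used here merely because it binds function member names at runtime.

```python
class Base:
    def foo(self):
        return "Base.foo"

    def bar(self):
        return "Base.bar"


class Derived(Base):
    # Overrides foo with its own implementation; bar is inherited unchanged.
    def foo(self):
        return "Derived.foo"


def call_foo(obj):
    # The name "foo" is bound to an implementation at runtime,
    # according to the class of obj.
    return obj.foo()


results = [call_foo(Base()), call_foo(Derived()), Derived().bar()]
print(results)
```

The same call expression, `obj.foo()`, reaches two different implementations, while the inherited `bar` still resolves to the base class implementation.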
Although it provides flexibility when programming, dynamic binding can detract greatly from runtime performance if not performed efficiently, because the overhead associated with dynamic binding is incurred each time a function member is invoked. Therefore, if the dynamic binding scheme employed is inefficient, performance of the overall program may degrade substantially.
One conventional scheme for performing dynamic binding, known as in-line caching, is fairly efficient. In the in-line caching scheme, each function member has two entry points: a verified entry point and an unverified entry point. The verified entry point provides access to the actual code of the function member, the code developed by the programmer. The unverified entry point, on the other hand, provides access to system-provided verification code used to verify that the caller actually intended to invoke this function member as opposed to a different, similarly named function member having a different implementation. In the in-line caching scheme, a function call (e.g. "foo") by a caller object comprises two instructions as shown below in Intel I486 pseudo-code:
CODE TABLE 1
______________________________________
move eax, class
call unverified_entry_point class.foo
______________________________________
In the above code table, "class" is an identifier of the class of the last object to have invoked the foo function member via this code. That is, objects of different types may have used this code to invoke the function member, and the "class" identifies the class of the most recent object to have done so. The move instruction moves the "class" identifier into the eax register of the computer, and the call instruction accesses the unverified entry point of the foo function member for the identified class. This call instruction is made in the hope that the class of the caller object is the same as the class contained in the eax register to save processing time, which will be further described below.
To explain the in-line caching technique more completely, FIG. 1 contains a flowchart of the steps performed by a conventional in-line caching technique. Specifically, this flowchart depicts the steps performed by a caller object when executing a function call to invoke a function member, a server function. When executing the function call, the caller object executes instructions like the ones contained in Code Table #1. The first step performed by the caller object is for the caller object to execute a move instruction to move the class of the last object that used the function call to invoke this server function into the eax register (step 102). This instruction has been described above.
Next, the caller object calls the unverified entry point of the server function (step 104). This instruction has also been described above. When this instruction is executed, the unverified entry point of the server function is accessed and the verification code located at the unverified entry point is executed as reflected by steps 106, 110, 112, 114, and 116. This verification code determines if the server function is the correct function member to be invoked by the caller object. If not, it determines the correct function member, modifies the code of the function call (contained in Code Table #1) so that the correct function member is invoked for subsequent invocations of the function call, and then invokes the correct function member.
The first step performed by the verification code of the server function is to determine if the appropriate class is in the eax register (step 106). The class in the eax register is the class of the last object to have used the function call to invoke this function member. As such, it can be ensured that the server function is the appropriate function to be invoked for all objects of the class contained in the eax register. In determining if the class in the eax register is the appropriate class, the verification code of the server function compares this class with the class of the caller object. The class of the caller object is passed into the server function as a hidden parameter, and in this step, the server function uses this parameter for the comparison. If the appropriate class is contained in the eax register, then the server function has determined that it is the correct function to be invoked and the instructions contained within it are then executed (step 108). These instructions are located at the verified entry point of the server function. The instructions executed in this step are the actual code developed by the programmer for the server function.
Otherwise, if the appropriate class is not in the eax register, the verification code of the server function accesses the hidden parameter, indicating the class of the caller object, and utilizes a look-up function to locate the appropriate function for objects of this class (step 110). The look-up function is a system-provided function. After locating the appropriate function, the verification code changes the code of the function call reflected by step 102 to refer to the class of the caller object, so that this class will be moved into the eax register the next time the function call is invoked. The verification code also changes the code of the function call reflected by step 104 so that it will invoke the unverified entry point of the appropriate server function the next time it is invoked (step 112). The verification code of the server function then stores the appropriate class in the eax register (step 114) and invokes the verified entry point of the appropriate server function (step 116). The verified entry point, at which the main code of the function member is located, may be invoked because it has been determined that this function member is the appropriate one for the caller object. After executing the appropriate function member, processing returns.
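The flow of steps 102 through 116 can be modeled in software as a rough sketch. This is a simulation only, not the machine-level mechanism: the patchable call instructions are represented by a hypothetical `CallSite` object, the eax register by a local variable, and the names `FunctionMember`, `MEMBERS`, `PARENT`, and `lookup` are assumptions introduced for illustration.

```python
# Toy class hierarchy (hypothetical): class name -> parent class name.
PARENT = {"Base": None, "Derived": "Base"}


class FunctionMember:
    def __init__(self, name, body):
        self.name = name
        self.body = body  # stands in for the code written by the programmer

    def verified_entry(self):
        return self.body()

    def unverified_entry(self, site, eax, caller_class):
        if eax == caller_class:                    # step 106: class matches?
            return self.verified_entry()           # step 108
        correct = lookup(caller_class, self.name)  # step 110: system look-up
        site.cached_class = caller_class           # step 112: patch both operands
        site.target = correct
        site.patches += 1
        eax = caller_class                         # step 114 (modeled only)
        return correct.verified_entry()            # step 116


MEMBERS = {
    ("Base", "foo"): FunctionMember("foo", lambda: "Base.foo"),
    ("Derived", "foo"): FunctionMember("foo", lambda: "Derived.foo"),
}


def lookup(cls, name):
    # Walk up the hierarchy until a class defining the name is found.
    while cls is not None:
        if (cls, name) in MEMBERS:
            return MEMBERS[(cls, name)]
        cls = PARENT[cls]
    raise AttributeError(name)


class CallSite:
    """Models the two instructions of Code Table 1 as mutable fields."""

    def __init__(self, cached_class, target):
        self.cached_class = cached_class  # operand of the move instruction
        self.target = target              # operand of the call instruction
        self.patches = 0                  # number of times the code was modified

    def invoke(self, caller_class):
        eax = self.cached_class                                       # step 102
        return self.target.unverified_entry(self, eax, caller_class)  # step 104


site = CallSite("Base", MEMBERS[("Base", "foo")])
first = site.invoke("Base")      # cache hit, no patching
second = site.invoke("Derived")  # cache miss, call site is patched
third = site.invoke("Derived")   # cache hit on the patched call site
```

In this model the patch is a cheap field assignment. On real hardware the corresponding step rewrites instructions in memory, which is what makes a miss costly.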
Although changing the code to refer to the appropriate function member is a necessary part of the in-line caching scheme, it takes a significant amount of processing time, because the instruction cache of the CPU has to be flushed. In some CPU architectures, like the Intel Pentium architecture, the CPU maintains an instruction cache containing instructions prefetched from main memory to reduce the number of main memory accesses. When an instruction changes, the cache is no longer valid, so it has to be flushed and main memory must be accessed to fill the cache again. Both flushing and filling the instruction cache take a great deal of processing time.
Although in-line caching is a fairly efficient way of performing dynamic binding, it performs a significant amount of unnecessary processing. For example, the verification code of a function member needs to be invoked only when the function name of that function member is ambiguous, referring to more than one function member within the class hierarchy. When there is only one function member of a given name in the class hierarchy, all references to this function member name can only refer to that one function member--no ambiguity exists. In this situation, invoking the verification code causes wasteful processing. In response to this observation, one conventional system has implemented a hybrid approach to dynamic binding that uses both static and dynamic binding. Using this approach, static binding is utilized for function members that are unambiguous, and at runtime when a statically bound function member becomes ambiguous because it has been overridden, this system switches to dynamic binding. As a result, this system reduces the unnecessary invocation of the verification code when a function is unambiguous.
This hybrid system is implemented in two parts. The first part of the system is implemented in the compiler, and the second part of the system is performed by the program at runtime. When compiling a function call, the compiler, as shown in FIG. 2A, first determines if this is an unambiguous function call (step 200). The compiler makes this determination by examining the class hierarchy to determine if there are any function members with the same name. If the function call is unambiguous, the compiler compiles the code using static binding (step 202). In this situation, the compiler compiles the source code of the function call into a call into the verified entry point of the function member. If, on the other hand, the function member is ambiguous, the compiler compiles the function call so as to use the in-line caching scheme described above (step 204).
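The compile-time decision of steps 200 through 204 might be sketched as follows. The representation of the class hierarchy as a mapping from class names to the function member names each class defines, and the function name `compile_calls`, are assumptions made for illustration; the sketch only records which binding strategy would be chosen and generates no code.

```python
from collections import Counter


def compile_calls(hierarchy, call_names):
    """Choose a binding strategy for each called function member name.

    hierarchy: class name -> set of function member names the class defines.
    Returns a mapping of name -> "static" or "inline_cache".
    """
    # Count how many classes in the hierarchy define each name.
    definitions = Counter(
        name for members in hierarchy.values() for name in members
    )
    plan = {}
    for name in call_names:
        if definitions[name] == 0:
            raise NameError(f"no class defines {name!r}")
        if definitions[name] == 1:        # step 200: unambiguous
            plan[name] = "static"         # step 202: call the verified entry point
        else:
            plan[name] = "inline_cache"   # step 204: use in-line caching
    return plan


hierarchy = {"Base": {"foo", "bar"}, "Derived": {"foo"}}
plan = compile_calls(hierarchy, ["foo", "bar"])
print(plan)
```

Here `foo` is defined by two classes and therefore receives in-line caching, while `bar` has a single definition and is bound statically.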
After the program has been compiled, the program may be executed, and during execution, as shown in FIG. 2B, the system determines when a system-provided function has been invoked to load a class (step 210). The phrase "loading a class" refers to an object being instantiated based upon a class definition. Such a creation of an object may override a function member in the class hierarchy, thus making the function member ambiguous. If the system determines that a class is being loaded, the system determines if a statically bound function member becomes ambiguous (step 212). The system makes this determination by checking whether a function member of the loaded class overrides an existing function member in the class hierarchy that was compiled in step 202 of FIG. 2A to use static binding. If this condition is true, the system recompiles the code for the function call to switch to in-line caching (step 214). In this step, the function call code has to be recompiled to add the instructions shown above in Code Table #1.
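The runtime half of the hybrid scheme, steps 210 through 214, can be sketched in the same spirit. The `Runtime` class, its `load_class` method, and the `recompilations` counter are hypothetical names; recompilation is modeled only as a change of a recorded binding strategy, and an override is approximated as a function member name that already exists somewhere in the hierarchy.

```python
class Runtime:
    def __init__(self, hierarchy, binding):
        # hierarchy: class name -> set of function member names it defines.
        self.hierarchy = {cls: set(names) for cls, names in hierarchy.items()}
        # binding: function member name -> "static" or "inline_cache".
        self.binding = dict(binding)
        self.recompilations = 0

    def load_class(self, cls, members):
        # Step 210: a class load has been detected.
        existing = set().union(*self.hierarchy.values())
        for name in sorted(members):
            overrides = name in existing
            # Step 212: does a statically bound function member become ambiguous?
            if overrides and self.binding.get(name) == "static":
                # Step 214: switch the call to in-line caching.
                self.binding[name] = "inline_cache"
                self.recompilations += 1
        self.hierarchy[cls] = set(members)


rt = Runtime({"Base": {"foo", "bar"}}, {"foo": "static", "bar": "static"})
rt.load_class("Derived", {"foo"})  # overrides foo: one recompilation
rt.load_class("Other", {"foo"})    # foo already uses in-line caching: none
print(rt.binding, rt.recompilations)
```

Only the first override of a statically bound name triggers the switch; later overrides find the call already using in-line caching.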
Although this hybrid system saves the needless invocation of verification code, it introduces a heavy burden on the system: the recompilation of code to switch between static binding and dynamic binding. Having to recompile the code causes the instruction cache to be flushed, main memory to be accessed, and the code to be parsed and generated. It is therefore desirable to improve hybrid dynamic-binding systems.