1. Technical Field
This invention relates to the implementation of object oriented languages that are statically compiled into platform-specific object code. More specifically, it involves detecting when changes to a class require clients of that class to be recompiled due to compiled-in assumptions about the changed class that the compiler has generated in the object code for the client classes.
2. Prior Art
Object oriented languages include the C++ and Sun Microsystems, Inc.'s Java programming languages. Java has been generally available on most widely used general computing systems since about 1996. It was originally intended for use as an interpreted language which could be used to program World Wide Web browsers. Lately, it has been gaining acceptance as a programming language for server applications. In this domain, performance of the application is critical. This motivates the need for optimizing static compilers that take Java source or bytecodes and generate highly optimized, platform-specific object code. The IBM® Visual Age® for Java High Performance Compiler (HPC) is such a compiler.
Typically, with respect to time, compile time is followed by execution time (which is also referred to as run time). Execution, or run, time includes load, initialize, execute and terminate times or processes. A programmer typically writes source code, producing a plurality of .c or .java files. During compile time, a compiler compiles these into object code files, which are then linked into executable files in an application or library. Typically, many object files are linked into a much smaller number of executable files, each referred to as a dynamically loaded library (DLL).
The problem addressed by the invention occurs when a programmer goes through the typical processes of editing, compiling, debugging, changing, and recompiling. Because of the large number of source files involved, it is desired to recompile only those source files which have changed, and so to regenerate only the object files corresponding to the changed source files. However, it is also the case that during compilation of one source file, the compiler makes assumptions about other source files to which the one source file refers. If, by virtue of the recompilation of a changed source file, the assumptions compiled into the object code of unchanged source files are no longer valid, a release-to-release binary incompatibility results.
Release-to-release binary compatibility (RRBC) in object oriented languages refers to the ability to modify a class without having to recompile all of the other classes in the application that refer to this class.
Compilers for statically compiled object oriented languages such as C++ embed assumptions about the layout of compiler-generated data structures for the referent classes in the compiled code of each class. These assumptions usually refer to the layout of fields in instances of the referent classes and the method tables of the referent classes. In general, the assumptions involve the particular indices of fields and methods within these structures. Since the assumptions are all made by the compiler, the user usually has no idea which classes need recompilation when a particular class has been modified, or in the case of object libraries, what those classes may even be.
Release-to-release binary compatibility has been implemented for statically compiled object oriented languages by developing object run time systems such as System Object Model (SOM). These run time systems specify the object models for which the compiler must generate code. RRBC is usually provided by adding an extra level of indirection into method dispatch and field access code generation. Typically, a compiler-generated static temporary is created (that is, has storage allocated) to hold the appropriate offset or table index, which is initialized by the object run time system to the appropriate value. Since method dispatch and field access tend to be frequent operations in object oriented programs, such an implementation imposes a considerable performance penalty on the application.
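The trade-off described above can be illustrated with a minimal Java sketch. It is not SOM code: the class and member names are hypothetical, an `int[]` stands in for an object instance, and the "run time" is a single initialization method. It contrasts a compiled-in slot index with a static temporary that is filled in at load time.

```java
import java.util.Arrays;

// Illustrative sketch only: all names are hypothetical, and an int[]
// stands in for an instance of the referent class.
final class IndirectFieldAccessSketch {
    // Direct access: the client was compiled when "count" sat at slot 1,
    // so that index is baked into the client's object code.
    static final int COMPILED_IN_COUNT_SLOT = 1;

    // Indirect access: a static temporary that the run time fills in at load time.
    static int countSlotTemp = -1;

    // Stand-in for the object run time initializing the temporary from the
    // layout of the referent class version that is actually loaded.
    static void initializeTemporaries(String[] actualFieldLayout) {
        countSlotTemp = Arrays.asList(actualFieldLayout).indexOf("count");
    }

    // Fast, but silently wrong if the referent class layout has changed.
    static int readCountDirect(int[] instance) {
        return instance[COMPILED_IN_COUNT_SLOT];
    }

    // Survives layout changes, at the price of loading the temporary on every access.
    static int readCountIndirect(int[] instance) {
        return instance[countSlotTemp];
    }
}
```

If a new version of the referent class inserts a field ahead of `count`, the indirect read still finds the right slot after re-initialization, while the direct read returns whatever now occupies slot 1. The indirect form pays for that robustness on every field access.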
U.S. Pat. No. 5,339,438 (Conner et al.) for Version Independence for Object Oriented Programs describes a method for implementing RRBC in IBM's System Object Model (SOM). SOM involves using static temporaries initialized at application load time to hold the offsets or sizes that may change from version to version of a referent class. By adding an extra level of indirection to instance field references and instance method invocation, SOM implements RRBC. Introducing an extra level of indirection to implement full RRBC comes at a significant performance cost, and there is a need in the art for a high performance method for detecting RRBC violations without user intervention.
Another contemporary approach provides a technique for detecting whether a particular release of an operating system can correctly execute a program that is built with a higher (later) release of the same operating system (this is called downward compatibility). Each program contains a compatibility level indicator. The value of the indicator is determined by the compiler that generates the object code of the program by determining which instructions are used by the program. Each instruction is supported starting with a particular release and in all subsequent releases of the operating system. The highest such release number among all instructions used by the program is the compatibility level indicator for the program. When the operating system loads the program for execution, it will only execute the program if the compatibility level indicator is less than or equal to its own release level. This method assumes a linear progression of compatibility. That is, a compatibility level of N implies that the program can be executed on all operating system release levels greater than or equal to N, and on no operating system level less than N. There is a need in the art for a solution to the problem of changing field and method tables in which this property (linear progression of compatibility) does not necessarily hold. Furthermore, there is a need for a method which allows for the separation of aspects of compatibility (e.g., if the field tables of a class change but the method tables do not, then only those client classes which require access to the field tables of the class will fail a run time signature check, that is, a check at run time initialization of a signature generated at compile time).
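The compatibility level scheme just described reduces to a maximum and a comparison, as the following hedged Java sketch shows. The instruction names, the release table, and the method names are assumptions made for illustration; the referenced approach does not specify them.

```java
import java.util.List;
import java.util.Map;

// Illustrative sketch only: the instruction names and the release table
// passed in by the caller are hypothetical.
final class CompatibilityLevelSketch {
    // Compiler side: the indicator is the highest "first supported in" release
    // among all instructions the program uses. Every instruction used is
    // assumed to have an entry in firstRelease.
    static int indicatorFor(List<String> instructionsUsed, Map<String, Integer> firstRelease) {
        int level = 0;
        for (String instruction : instructionsUsed) {
            level = Math.max(level, firstRelease.get(instruction));
        }
        return level;
    }

    // Loader side: run the program only if the operating system release is
    // at least the program's compatibility level indicator.
    static boolean canExecute(int indicator, int osRelease) {
        return indicator <= osRelease;
    }
}
```

A single integer suffices here only because compatibility is assumed to progress linearly. Field and method table layouts can change independently from version to version, so one ordered level cannot capture which clients are affected.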
U.S. Pat. No. 5,768,588 (Endicott, et al.) for Efficient Method Router That Supports Multiple Simultaneous Object Versions describes the implementation of the New Object Model (NOM), which is the underlying object data structures and run time support that can be used to implement object oriented languages such as Smalltalk and C++, in particular, in an interactive environment. Objects in NOM contain a pointer to an interface table. The interface table contains a number of tuples, one for each class in the inheritance hierarchy for the class of the object. Each tuple contains a class signature to identify the class at that level in the hierarchy, and a pointer to a method table for methods of that class. A method invocation contains an object identifier, a level number, a call signature and a method table offset. The object identifier is dereferenced to obtain the interface table for the object, and the level is used as the index into this table. The call signature is checked against the class signature in the tuple found at this entry in the interface table. If they do not match, the program is aborted. If they do match, the method pointer is obtained by indexing the method table pointed to by this tuple with the method table offset in the call. Since NOM implements full RRBC (like SOM), the call signature is not used to detect RRBC violations. Instead it is used to check that the class hierarchy of the callee did not change from the time that the call was compiled (i.e., the class that is assumed to be at a particular level in the inheritance tree of the callee is in fact there at execution time). The NOM solution to implementation of full RRBC comes at a high cost in space and time. This NOM solution requires this check to be done dynamically at each method call, thus slowing down every method invocation. 
Extending this to RRBC checking would imply that the check is done at each method invocation and field access, both of which are very common operations in object oriented programs. The NOM solution also requires a signature to be generated for each call site in the program, which when extended to RRBC checking would require a signature for each call site and field access. There is a need for a solution that has neither of these shortcomings. That is, there is a need in the art for a binary compatibility checking method that is done once per referenced class-assumption pair, at class loading time, thus incurring a fixed overhead which can be amortized over the entire execution time of the application. There is also a need for a method in which common checks are performed only once for the class, not once per call site or field access.
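The NOM method invocation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the NOM implementation: the type and method names are hypothetical, class signatures are modeled as `long` values, methods are modeled as `Supplier<String>`, and aborting the program is modeled by throwing an exception.

```java
import java.util.List;
import java.util.function.Supplier;

// Illustrative sketch only: names and data representations are hypothetical.
final class NomDispatchSketch {
    // One tuple per class in the inheritance hierarchy of the object's class.
    static final class Tuple {
        final long classSignature;
        final List<Supplier<String>> methodTable;

        Tuple(long classSignature, List<Supplier<String>> methodTable) {
            this.classSignature = classSignature;
            this.methodTable = methodTable;
        }
    }

    // A call carries the hierarchy level, the call signature, and the method
    // table offset. The interface table is what the object identifier
    // dereferences to.
    static String invoke(List<Tuple> interfaceTable, int level, long callSignature, int methodOffset) {
        Tuple tuple = interfaceTable.get(level);
        // This comparison runs on every call, which is the per-invocation
        // cost the text objects to.
        if (tuple.classSignature != callSignature) {
            throw new IllegalStateException("class hierarchy changed at level " + level);
        }
        return tuple.methodTable.get(methodOffset).get();
    }
}
```

Because the signature comparison sits inside `invoke`, its cost grows with the number of calls executed, and each call site must carry its own signature. A load-time check pays once per referenced class and assumption instead.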
Thus, it is desirable and advantageous to have an improved system and method for release-to-release binary compatibility (RRBC) checking which can enable improved execution time performance. It is also desirable and advantageous to have RRBC checking that has only initialization-time (i.e., load time) cost, and does not slow down field access or method invocation. Further, it is desirable and advantageous to have a system for RRBC checking that can provide better space utilization, optionally achieved through factoring into one check assumptions about an aspect of a particular class (i.e., the same checks are not performed repeatedly). It is also desirable and advantageous to have a system and method for RRBC checking which does not assume a linear progression of compatibility between versions and does not require source code to determine compatibility. Moreover, it is desirable and advantageous to have a system and method for RRBC checking in which all signatures are embedded into compiler-generated binary structures and which does not require the user to provide version information; that is, the checking is done without user input. It is also desirable and advantageous to provide a way during run time to check all object files that have been bound (i.e., linked) together to determine if the assumptions made in each object file are still valid, and to establish dependencies at the assumption level, as distinguished from the class level.
The invention provides a system and method for detecting binary compatibility in compiled object code. In accordance with the system of the invention, a referring class metadata store includes a class structure table for storing during compilation of the referring class at least one signature indicia assumed by the referring class with respect to table contents in a referent class. In accordance with the method of the invention, during compilation of the referring class, characterizing indicia for a referent class is encoded into class metadata for the referring class. During initialization at run time, the characterizing indicia in the metadata of the referring class is checked for correspondence with referent class metadata.
There is provided a method for detecting binary compatibility in compiled object code, comprising the steps of generating a signature for each of one or more structures, comparing the signatures for corresponding structures at compile time and responsive to said signatures not comparing equal, signaling incompatibility. There is also provided a method for detecting binary incompatibility in object code, comprising the steps of, during compilation of a referring class, encoding characterizing indicia for a referent class into class metadata for said referring class, during run time processing, checking said characterizing indicia for correspondence, and responsive to lack of correspondence, emitting an error message. The above method may also further comprise the step of encoding as said characterizing indicia at least one signature selected from the set comprising a field block table signature, an instance method table signature, an instance data signature, and a method block signature. The method may also comprise the step of calculating said field block table signature as a function of the indexes of one or more static field entries in a field block table, the step of calculating said method block table signature as the index of a static method in the method block table of said referent class, the step of calculating said instance data signature as the offset of an instance field in an instance of said referent class, or the step of calculating said instance method table signature as the offset of an instance method in the instance method table of said referent class.
The above methods may also further comprise the steps of processing a relocation by checking an assumed signature in the relocation table of said referring class with an actual signature in the referent class, responsive to said signatures matching, continuing execution, and responsive to said signatures not matching, emitting an error message and aborting the application. Said characterizing indicia may also include at least one assumption made about the order that named entities appear in a table in said referent class. And, the step of generating said characterizing indicia as a function of a character stream may include ordered references by name to bytecode entities providing a unique representation of said table. Further, said character stream may be encoded using a digital signature. And, the above methods may further comprise the step of encoding said characterizing indicia when compiling getstatic and putstatic bytecodes for said referent class in a different dynamic loaded library from said referring class.
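As a rough illustration of the two ideas above, the following Java sketch derives a signature from the ordered names of the entries in a table and then checks an assumed signature against the actual one when a relocation is processed. It rests on several assumptions: SHA-256 from the standard `MessageDigest` API stands in for the encoding of the character stream, throwing a `LinkageError` stands in for emitting an error message and aborting the application, and all class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Illustrative sketch only: names are hypothetical, and SHA-256 is an
// assumed stand-in for the signature encoding.
final class TableSignatureSketch {
    // Compile time: hash the ordered names of the entries in a table. The
    // order matters, because clients compile in indexes and offsets.
    static String signatureOf(List<String> orderedEntryNames) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String name : orderedEntryNames) {
                digest.update(name.getBytes(StandardCharsets.UTF_8));
                digest.update((byte) 0); // separator, so ["ab","c"] differs from ["a","bc"]
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Load time: compare the signature the referring class assumed with the
    // signature the loaded referent class actually has.
    static void processRelocation(String referentName, String assumed, String actual) {
        if (!assumed.equals(actual)) {
            throw new LinkageError("binary incompatibility detected with class " + referentName);
        }
        // Signatures match: execution continues normally.
    }
}
```

Reordering two entries changes the signature even though the set of names is unchanged, which is the property needed when the compiled-in assumption is an index or offset. The comparison runs once per relocation at initialization, not on each field access or method invocation.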
Also, an above method may further comprise the step of encoding said method block table signature when compiling an invokestatic bytecode in the case where said referent class is in a different dynamic loaded library from said referring class, and encoding said method block table signature when compiling an invokeinterface bytecode. Also, an above method may further comprise the step of encoding said instance data signature when compiling getfield and putfield bytecodes for said referent class, encoding said instance data signature when generating an instance field layout for said referring class, and encoding said instance data signature for an implicit referent class which is a superclass of said referring class. And, an above method may further comprise the step of encoding said instance method table signature when compiling the invokevirtual and invokespecial bytecodes, and encoding said instance method table signature when generating the instance method table of said referring class where the implicit referent class is a superclass of said referring class.
There is also provided a program storage device readable by a machine, tangibly embodying a program of instructions executable by a machine to perform any of the above method steps for detecting binary compatibility in compiled object code.
There is also provided a computer system for detecting binary compatibility in compiled object code, comprising a referring class metadata store; a referent class metadata store; said referring class metadata store including a class structure table for storing during compilation of a referring class at least one signature indicia assumed by said referring class with respect to table contents in a referent class; and a comparator for comparing said signature indicia with a class structure table signature in said referent class metadata store. The above computer system may also be provided wherein said signature indicia being selected from the set comprising a field block table signature, an instance method table signature, an instance data signature, and a method block signature.
There is also provided an article of manufacture comprising a computer useable medium having computer readable program code means embodied therein for detecting binary compatibility in compiled object code, the computer readable program means in said article of manufacture comprising computer readable program code means for causing a computer to effect generating a signature for each of one or more structures, computer readable program code means for causing a computer to effect comparing the signatures for corresponding structures at compile time, and computer readable program code means for causing a computer to effect, responsive to said signatures not comparing equal, signaling incompatibility.
Also, there is provided an article of manufacture comprising a computer useable medium having computer readable program code means embodied therein for detecting binary compatibility in compiled object code, the computer readable means in said article of manufacture comprising computer readable program code means for causing a computer during compilation of a referring class, to encode characterizing indicia for a referent class into class metadata for said referring class, computer readable program code means for causing a computer during run time processing, to check said characterizing indicia for correspondence, and computer readable program code means for causing a computer responsive to lack of correspondence, to emit an error message. The above article of manufacture may further comprise computer readable program code means for causing a computer to encode as said characterizing indicia at least one signature selected from the set comprising a field block table signature, an instance method table signature, an instance data signature, and a method block signature. And, the above articles of manufacture may further comprise computer readable program code means for causing a computer to process a relocation by checking an assumed signature in the relocation table of said referring class with an actual signature in the referent class, responsive to said signatures matching, to continue execution, and responsive to said signatures not matching, to emit an error message and aborting the application.
There is also provided a method for detecting binary compatibility in compiled object code classes, comprising the steps of generating signatures corresponding to one or more first assumptions of a plurality of possible assumptions for each of one or more corresponding classes, comparing said signatures corresponding to said first assumptions for said corresponding classes at compile time, responsive to said signatures corresponding to said first assumptions comparing equal, determining class compatibility irrespective of changes in assumptions other than said first assumptions, whereby binary compatibility of object code classes is determined at the assumption as distinguished from the class level.
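Determining compatibility at the assumption level rather than the class level can be sketched as below. The sketch assumes that each referent class exposes one signature per aspect and that each referring class records only the aspects it actually relied on. The enum, the maps, and the method name are hypothetical, and the signature values in any use of it would be placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names and data representations are hypothetical.
final class AssumptionLevelCheckSketch {
    // The four aspects of a referent class about which a client may make assumptions.
    enum Aspect { FIELD_BLOCK_TABLE, METHOD_BLOCK_TABLE, INSTANCE_DATA, INSTANCE_METHOD_TABLE }

    // Returns the aspects for which the client's assumed signature no longer
    // matches the referent class. Aspects the client never relied on are not
    // consulted, so changes to them cannot cause a failure.
    static List<Aspect> mismatches(Map<Aspect, String> assumedByClient, Map<Aspect, String> actualInReferent) {
        List<Aspect> failed = new ArrayList<>();
        for (Map.Entry<Aspect, String> assumption : assumedByClient.entrySet()) {
            String actual = actualInReferent.get(assumption.getKey());
            if (!assumption.getValue().equals(actual)) {
                failed.add(assumption.getKey());
            }
        }
        return failed;
    }
}
```

A client that only invokes instance methods records a single `INSTANCE_METHOD_TABLE` assumption. If a later version of the referent class changes its instance data layout but not its method table, that client still passes, while a client that reads instance fields does not.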
Also provided is an article of manufacture comprising a computer useable medium having computer readable program code means embodied therein for detecting binary compatibility in compiled object code, the computer readable program means in said article of manufacture comprising computer readable program code means for causing a computer to generate signatures corresponding to one or more first assumptions of a plurality of possible assumptions for each of one or more corresponding classes, computer readable program code means for causing a computer to compare said signatures corresponding to said first assumptions for said corresponding classes at compile time, computer readable program code means for causing a computer to determine class compatibility irrespective of changes in assumptions other than said first assumptions responsive to said signatures corresponding to said first assumptions comparing equal, whereby binary compatibility of object code classes is determined at the assumption as distinguished from the class level.
Further, there is provided a computer system for detecting binary compatibility in compiled object code, comprising means for generating a signature for each of one or more structures, means for comparing the signatures for corresponding structures at compile time, and means for signaling incompatibility responsive to said signatures not comparing equal.
And, there is provided a computer system for detecting binary compatibility in object code, comprising means for encoding characterizing indicia for a referent class into class metadata for a referring class during compilation of said referring class, means for checking said characterizing indicia for correspondence during run time processing, and means for emitting an error message responsive to lack of correspondence.