1. Field of the Invention
The present invention is directed to computer systems. More particularly, it is directed to determining the identity of executable objects at run time in computer systems.
2. Description of the Related Art
Many modern programming languages, such as Java™, provide a simple way to establish the identity of different programming language constructs at compile time: in the source code of a Java™ program, for example, two classes are identical if they have the same fully qualified name, and differ if they have different names. However, the problem of establishing identity may be somewhat more complicated at run time. Several computing platforms, such as various versions of the Java™ platform, enable processes to load application and system code at run time from a variety of sources, such as a local file system, a remote web server, or an in-memory buffer. More than one class with the same name, potentially with significantly different behaviors, may be present in the different sources: for example, two versions of a given Java™ class may be available, one locally and one at a remote source.
In order to distinguish between different classes with the same name at run time, class loaders may sometimes be used as name space indicators for the classes. In Java™ run time environments, for example, a class loader is responsible for mapping a class name (e.g., a string) to a loaded class object. A Java™ class loader is itself an instance of a subclass of the abstract java.lang.ClassLoader class, which provides a method such as loadClass that allows callers to request loading of named classes. The identity of the loaded class at run time in such environments is based on the tuple [class name, defining class loader]: that is, for two classes to be identical, both must have the same class name and both must have the same defining class loader. The “defining” class loader of a Java™ class is the particular class loader that passes the definition of a currently unloaded class to the Java™ Virtual Machine (JVM) for processing, receives an initialized class object from the JVM and returns the class object to the requester. It is noted that the terms “JVM”, “virtual machine process”, “virtual machine” and “process” may be used synonymously herein to indicate execution environments at which applications comprising dynamically loadable classes are executed.
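The [class name, defining class loader] tuple described above can be illustrated with a minimal sketch. The names ClassIdentityDemo, DefiningLoader, minimalClassBytes and the class name "Widget" are hypothetical and chosen only for illustration. The sketch assumes that a hand-assembled, member-less class file is sufficient for the demonstration: it defines the same bytes under the same name through two separate loaders and compares the resulting class objects.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ClassIdentityDemo {

    /** Assembles the bytes of a minimal public class with no fields or methods. */
    static byte[] minimalClassBytes(String internalName) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);                            // magic number
            out.writeShort(0);                                   // minor version
            out.writeShort(52);                                  // major version (Java 8 format)
            out.writeShort(5);                                   // constant pool count (entries + 1)
            out.writeByte(7); out.writeShort(2);                 // #1 Class -> #2
            out.writeByte(1); out.writeUTF(internalName);        // #2 Utf8: this class name
            out.writeByte(7); out.writeShort(4);                 // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8: superclass name
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                   // this_class
            out.writeShort(3);                                   // super_class
            out.writeShort(0);                                   // interfaces count
            out.writeShort(0);                                   // fields count
            out.writeShort(0);                                   // methods count
            out.writeShort(0);                                   // attributes count
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical loader that acts as the defining loader for whatever it is handed. */
    static final class DefiningLoader extends ClassLoader {
        DefiningLoader() {
            super(ClassIdentityDemo.class.getClassLoader());
        }

        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Defines identical bytes under the same name in two distinct loaders. */
    static Class<?>[] defineTwice(String name) {
        byte[] bytes = minimalClassBytes(name);
        Class<?> first = new DefiningLoader().define(name, bytes);
        Class<?> second = new DefiningLoader().define(name, bytes);
        return new Class<?>[] { first, second };
    }

    public static void main(String[] args) {
        Class<?>[] pair = defineTwice("Widget");
        System.out.println("same name:   " + pair[0].getName().equals(pair[1].getName()));
        System.out.println("same class:  " + (pair[0] == pair[1]));
        System.out.println("same loader: " + (pair[0].getClassLoader() == pair[1].getClassLoader()));
    }
}
```

In this sketch the two class objects share a name but have different defining class loaders, so the run time environment treats them as different classes even though their definitions are byte-for-byte identical.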
Questions of class identity are particularly relevant in distributed programs, for example programs that use Java™ Remote Method Invocation (RMI) or the Jini™ networking technology (subsequently referred to herein as “Jini”). Such distributed programs may rely on passing objects between processes, where the executable code for a transmitted object may not be preinstalled at the destination. Individual processes at the different nodes of a distributed programming environment may have access to different versions of the class for the same object, which may potentially lead to errors and failures that are hard to diagnose and resolve. Consider an example scenario in which a first process “P1” sends an object “obj-1”, which is an instance of a class “C”, to a second process “P2”, intending that a version “v1” of class “C” be executed for object “obj-1” at process “P2”. The run-time environment in use in the scenario may not, however, require that “P2” execute any specific version of class “C”. If process “P2” has access to a locally available version “v2” of class “C”, it may execute the locally available version, which may perform different computations than were expected by process “P1”, potentially resulting in errors in the distributed computation being performed by “P1” and “P2”. Since neither “P1” nor “P2” does anything in this scenario that violates any rules, the error introduced by the naming ambiguity of different class versions corresponding to “obj-1” may be hard to detect, and therefore hard to fix. Disambiguating between classes with the same name may thus be even more important in distributed programming environments.
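The “P1”/“P2” scenario above can be mimicked in a single process with a simplified sketch. It does not use RMI or Jini; the names VersionSkewDemo, C_V1, C_V2 and receiveAndRun are hypothetical, and the two versions of class “C” are modeled as plain functions under the assumption that the receiver resolves a class by name alone.

```java
import java.util.Map;
import java.util.function.IntUnaryOperator;

public class VersionSkewDemo {

    // Hypothetical stand-ins for versions "v1" and "v2" of class "C".
    static final IntUnaryOperator C_V1 = x -> x * 2;
    static final IntUnaryOperator C_V2 = x -> x * 2 + 1;

    /** Models "P2": the class is looked up by name only, among locally available classes. */
    static int receiveAndRun(Map<String, IntUnaryOperator> localClasses,
                             String className, int state) {
        return localClasses.get(className).applyAsInt(state);
    }

    /** The result "P1" expects, based on version "v1". */
    static int expectedByP1(int state) {
        return C_V1.applyAsInt(state);
    }

    /** The result "P2" actually computes, because only "v2" is available locally. */
    static int computedByP2(int state) {
        return receiveAndRun(Map.of("C", C_V2), "C", state);
    }

    public static void main(String[] args) {
        System.out.println("P1 expects:  " + expectedByP1(10));
        System.out.println("P2 computes: " + computedByP2(10));
    }
}
```

Neither side raises an error in this sketch; the two results simply differ, which is why such mismatches may be hard to detect.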
A number of different approaches have been used for establishing class identities at run time. In the standard implementation of Java™ RMI, for example, the class loader used to download and define a class is determined by the location or “codebase” from which the class is downloaded. A process that sends a class to another process is responsible for also sending a “codebase annotation” (e.g., one or more Uniform Resource Locators (URLs) from which a class is to be downloaded) for the class to the receiving process. If different codebases implement different versions of the same class, this approach will disambiguate between the versions, since each version will have a different defining class loader. However, if different codebases implement identical versions of a given class, this approach will still treat the versions as distinct, which may contribute to one or more of a number of problems. Such problems include, for example, potential loss of codebase annotation when objects are relayed from one process to another in distributed applications, unexpected type conflicts when codebase changes occur (e.g., when a hostname or port corresponding to a codebase changes), and unnecessary memory usage caused by loading multiple identical versions of the same class at a single process.
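The codebase-keyed selection of class loaders described above can be sketched as follows. This is not the RMI implementation; CodebaseLoaderTable and loaderFor are hypothetical names, the example URLs are placeholders, and the sketch assumes a codebase annotation consisting of a single URL. Constructing the loaders does not contact the network.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CodebaseLoaderTable {

    // One class loader per distinct codebase annotation string.
    private final Map<String, ClassLoader> loaders = new ConcurrentHashMap<>();

    /** Returns the loader associated with a codebase, creating it on first use. */
    public ClassLoader loaderFor(String codebase) {
        return loaders.computeIfAbsent(codebase, cb -> {
            try {
                return new URLClassLoader(new URL[] { URI.create(cb).toURL() });
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("bad codebase: " + cb, e);
            }
        });
    }

    public static void main(String[] args) {
        CodebaseLoaderTable table = new CodebaseLoaderTable();
        // Placeholder codebases, assumed to serve identical class files.
        ClassLoader a = table.loaderFor("http://host-a.example/classes/");
        ClassLoader b = table.loaderFor("http://host-b.example/classes/");
        System.out.println("same loader for different codebases: " + (a == b));
    }
}
```

Because the table is keyed by location, two codebases that serve identical class files still map to different loaders, so a class loaded from each would be treated as two distinct classes.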
Several enhancements to the standard codebase approach have been proposed. In a technique called “preferred classes”, downloaded applications explicitly specify that a subset of their classes should not be shared with the local platform, thus avoiding some of the confusion possible in the standard codebase approach. However, this technique requires that a decision be made in advance of application deployment as to which classes should be shared and which classes should be kept separate, independent of whether locally available classes are compatible with the downloaded application. Such a technique leaves open the possibility that unexpected versions of locally available classes are used, and that applications may fail to share locally available classes that are compatible with downloaded code. Other approaches, such as a technique called “content-addressable codebases”, may be sensitive to how classes are packaged within codebases. If two downloaded objects have some classes in common, but use codebases that contain different additional classes, then the content-addressable codebase approach fails to treat the common classes as identical. Traditional techniques for class loading (and therefore, for class identity disambiguation) are often problematic at least partly because they rely in some form on the location of class definitions or the specific contents at each location.
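The packaging sensitivity noted above for content-addressable codebases can be illustrated with a small sketch. It assumes, purely for illustration, that a codebase is identified by a SHA-256 digest over all of its entries; the actual technique may compute identity differently. CodebaseDigestDemo, codebaseDigest, classDigest and the sample entries are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Map;
import java.util.TreeMap;

public class CodebaseDigestDemo {

    private static MessageDigest sha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Digest over every entry of a codebase, visited in sorted name order. */
    static String codebaseDigest(Map<String, byte[]> codebase) {
        MessageDigest md = sha256();
        for (Map.Entry<String, byte[]> e : new TreeMap<>(codebase).entrySet()) {
            md.update(e.getKey().getBytes(StandardCharsets.UTF_8));
            md.update(e.getValue());
        }
        return Base64.getEncoder().encodeToString(md.digest());
    }

    /** Digest over the bytes of a single class definition. */
    static String classDigest(byte[] classBytes) {
        return Base64.getEncoder().encodeToString(sha256().digest(classBytes));
    }

    // Placeholder bytes standing in for real class files.
    static final byte[] COMMON = "common-class-bytes".getBytes(StandardCharsets.UTF_8);

    static Map<String, byte[]> codebaseA() {
        return Map.of("Common", COMMON, "ExtraA", new byte[] { 1 });
    }

    static Map<String, byte[]> codebaseB() {
        return Map.of("Common", COMMON, "ExtraB", new byte[] { 2 });
    }

    public static void main(String[] args) {
        System.out.println("codebase digests equal: "
                + codebaseDigest(codebaseA()).equals(codebaseDigest(codebaseB())));
        System.out.println("common class digests equal: "
                + classDigest(codebaseA().get("Common")).equals(classDigest(codebaseB().get("Common"))));
    }
}
```

Under this assumption the shared class has the same digest in both codebases, yet the codebase-level digests differ because of the additional classes, so identity keyed on the codebase digest would not treat the common class as identical.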