1. Field of the Invention
The present invention relates to computer programming, and deals more particularly with techniques for enabling multiple valid versions of serialized objects (such as Java™ objects) to be maintained concurrently.
2. Description of the Related Art
For data transfer in distributed computing environments, as well as for storing data persistently, it becomes necessary to convert data structures between object format and serialized format. For example, such structured objects may be used when writing programs in the Java™ programming language. (“Java”™ is a trademark of Sun Microsystems, Inc.) Other object-oriented programming languages use different names for their objects. The term “serialization” is used in the art to describe the process of taking an object and transforming it to a “flattened” data structure so that, for example, the contents of the object can be persistently stored or can be passed over a network connection in a serial or stream format. “Deserialization” then refers to the reverse process, whereby a flattened data structure is converted into object format.
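By way of illustration only, the serialization and deserialization processes described above can be sketched with the standard java.io object streams. The class name RoundTripSketch, the example class Point, and the helper methods serialize and deserialize are hypothetical names chosen for this sketch; they do not appear in the figures.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class RoundTripSketch {
    // Hypothetical example class; the name and fields are illustrative only.
    static class Point implements Serializable {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Serialization: flatten an object into a byte array (a serial stream).
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Deserialization: rebuild an object from the flattened bytes.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Point copy = (Point) deserialize(serialize(new Point(3, 4)));
        System.out.println(copy.x + ", " + copy.y);
    }
}
```

In this sketch the byte array stands in for either persistent storage or a network connection; the same stream classes may be wrapped around a file or socket stream.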
In the Java™ programming language, a built-in versioning technique for serialized objects is provided. Each serializable class may declare a variable “serialVersionUID”, which is a 64-bit long integer that will store a stream-unique identifier. (See the discussion of FIG. 3, below, for an example.) The value of this variable is computed by hashing the class's signature—i.e., its name, interface class names, methods, and fields. (The details of this hashing algorithm are not pertinent to the present invention, and will not be described in detail herein.) This versioning technique enables code that is reading a previously-serialized object (e.g., to deserialize the stream back into object form) to determine whether the class definition that this object conformed to when it was serialized is the same class definition used by the code that is currently reading the serial stream. Stated in another way, if the serialVersionUID value is identical between a set of serialized objects, this is an indication that the objects share a common format for serialization and deserialization. If the serialVersionUID values do not match, then the deserialization is not allowed (thereby avoiding creation of a corrupted object).
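As a minimal sketch of how the identifier associated with a class may be inspected, the following uses java.io.ObjectStreamClass, which is part of the standard library. The class name DefaultSuidSketch, the example class Sample, and the helper suidOf are hypothetical names introduced for this illustration. Because Sample declares no serialVersionUID, the value returned for it is the one computed from the class definition.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class DefaultSuidSketch {
    // Hypothetical serializable class with no serialVersionUID declaration,
    // so its identifier is the value computed from the class definition.
    static class Sample implements Serializable {
        int count;
        boolean enabled;
    }

    // Returns the SUID that the serialization runtime associates with a class,
    // or null when the class is not serializable.
    static Long suidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            return null;
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Sample SUID: " + suidOf(Sample.class));
        System.out.println("Object SUID: " + suidOf(Object.class));
    }
}
```

When a stream is read, the identifier recorded in the stream is compared against the identifier of the local class, and a mismatch causes deserialization to fail with java.io.InvalidClassException.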
For example, suppose the class definition for a serialized object is as shown in FIG. 1. Objects created according to this class “ABC” therefore have two fields, “A” and “B”, and the values of these two fields (an integer and a Boolean value, respectively) will be written to an output stream during serialization. Now suppose that the developer changes the class definition 100 for class ABC, adding a third field “C”, to create a new class definition 200 as shown in FIG. 2. (The term “developer” as used herein represents the person or entity that makes a change to a class definition.) If a serialized stream has been created using class definition 100, and code using class definition 200 attempts to deserialize that stream, there will be no value for the newly-added string field “C”. Because the serialVersionUID (“SUID”) is computed over the class definition, each of these versions of class ABC will have a different value for the SUID, thereby allowing the versioning technique to automatically detect that the class definitions are different.
Changes in class definitions are a typical occurrence when a new version of a software product is being created. The changes may correct errors that have been discovered in an earlier version, or new features may be added to the software product which necessitate revising the class definitions. Using SUID values to maintain correct versions of serialized objects is an effective way to maintain compatibility (or to detect incompatibility) between one version of the software product and another version. Among other things, this built-in versioning technique prevents problems if a developer adds to a class an interface that does not exist in previously-serialized objects that were created according to the previous class definition, and may prevent problems when an interface from a newer version is deserialized on an older version of a consuming application, where that older version does not support that interface.
While the built-in versioning technique provides a number of advantages, there are situations in which problems arise. In particular, it may happen that developers need to make changes to the class definition of a serializable object that do not affect the class's interfaces and that do not render previously-serialized objects incompatible. For example, the change might be limited to addition of a copyright statement, or to addition or deletion of fields that are not serialized, in which case the changed class definition will not cause problems for previously-serialized objects. By definition, however, the SUID for the changed class definition will automatically change when the class definition is compiled. Therefore, objects created according to the previous class definition will necessarily be considered incompatible with the new class definition by the versioning support—even though, for a particular change, the objects may in fact still be compatible—thereby preventing those objects from being deserialized.
It is possible with the existing built-in versioning technique for a developer to override the computed SUID value, forcing it to the same value that was computed on an older version of the class. In this manner, the developer could force two different versions of a class to be considered as identical, even though they are not, so that their objects will be treated as compatible. According to the existing versioning technique, each version of a class definition, except the original version, is required to declare the stream-unique identifier, SUID. In the absence of a declaration, the SUID defaults to the hash value computed over the current class definition. Therefore, one class can be defined as backwards-compatible with another by declaring the SUID of the older class definition as the SUID for the new class definition. An example is shown at 300 in FIG. 3, where the class definition 200 from FIG. 2 has been augmented to include a sample SUID declaration at 310. Suppose that this SUID value is the value computed over the class definition 100 from FIG. 1. (A method is provided for obtaining the SUID of any serializable class.) Since the value of the SUID is coded into the class definition 300, this definition will appear to the versioning support as being identical to (and therefore compatible with) class definition 100. To maintain this type of backwards-compatibility in future versions, the developer can simply code the SUID of the previous versions into each new version, and the versions will then appear (to the versioning support) to be identical.
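The override described above can be sketched as follows. This is an illustration only and is not a reproduction of FIG. 3: the class name DeclaredSuidSketch, the constant OLD_VERSION_SUID, and its numeric value are placeholders assumed for this sketch, with the constant standing in for whatever value was actually computed over the older class definition. The field names and types follow the description of class ABC given earlier.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class DeclaredSuidSketch {
    // Placeholder value (an assumption for this sketch) standing in for the
    // SUID that was computed over the older class definition.
    static final long OLD_VERSION_SUID = 1234567890123456789L;

    // Newer definition of ABC, with the added field C, which declares the
    // older SUID so that the versioning support treats the two as identical.
    static class ABC implements Serializable {
        private static final long serialVersionUID = OLD_VERSION_SUID;
        int A;
        boolean B;
        String C;
    }

    // Reads back the identifier that the serialization runtime will use.
    static long declaredSuid() {
        return ObjectStreamClass.lookup(ABC.class).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(declaredSuid() == OLD_VERSION_SUID);
    }
}
```

Once the declaration is present, the runtime uses the declared value in place of the computed hash, so later changes to the fields of ABC no longer alter the identifier.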
Although this overriding would solve one problem, it would create several others, including:
The benefits of compatibility protection provided by the SUID would be completely negated.
Hard-coding the SUID is an all-or-nothing approach, which does not allow code to optionally account for multiple versions of objects. Once the SUID value is set, it is locked in from that point forward, and prohibits phasing out older versions. (For example, if a version “2” of a class definition is to be compatible with a version “1”, it must include the hard-coded SUID value from version 1. For a version “3” to be compatible with version 2, it must also hard-code this same SUID value. This does not allow supporting backwards-compatibility from version 3 to version 2, but not to version 1.)
All responsibility is placed on the developer for remembering each change made to the class, and determining whether any of those changes are of the type that should be treated as incompatible with the older version (i.e., in which case the SUID value for the new version should be updated).
If the SUID value is overridden, all backwards-compatibility detection among the “actually-different” class definitions is lost.
Accordingly, what is needed are techniques that avoid these drawbacks of manually overriding the SUID value to set it to the SUID of a previous version, yet allow different class definitions to be treated as compatible.