Distributed and parallel systems form a very important segment of modern computing environments. Experience with such systems has exposed several requirements of system and component designs that have historically been recognized only after a system has been deployed. A critical requirement (especially for systems with any longevity) is the need for the system and system components to be able to evolve over time.
By definition, a distributed system is one which contains components which need to communicate with one another. In most practical systems, however, many of these components will not be created “from scratch”. Components tend to have long lifetimes, be shared across systems, and be written by different developers, at different times, in different programming languages, with different tools. In addition, systems are not static—any large scale system will have components that must be updated, and new components and capabilities will be added to the system at different stages in its lifetime. The choice of platform, the level of available technology, and the current fashion in the programming community all conspire to create what is typically an integration and evolution nightmare.
The most common solution to this problem is to attempt to avoid it by declaring that all components in the system will be designed to a single distributed programming model and will use its underlying communication protocol. This approach tends not to work well for several reasons. First, by the time the decision has been made to use one model or protocol (which may be quite early in the life cycle of a system), there may already be existing components that are worth reusing but that do not support the selected model or protocol. Second, the choice of model and protocol may severely restrict other choices (e.g., the language in which a component is to be written or the platform on which it is to be implemented), because support for the model may be available only for certain languages and platforms.
Finally, such choices tend to be made in the belief that the ultimate model and protocol have finally been found, or at least that the current choice is sufficiently flexible to incorporate any future changes. That belief has, historically, been discovered to be unfounded, a situation which is not likely to change. Invariably, a small number of years down the road (and often well within the life of an existing system), a new “latest and greatest” model is invented, and the owner of the system is faced with the choice of adhering to the old model (which may leave the system unable to communicate with other systems and restrict the capabilities of new components) or upgrading the entire system to the new model. Upgrading is always an expensive option, and may in fact be intractable (for instance, it is not unheard of for systems to contain an investment of hundreds of man-years in “legacy” source code) or even impossible (as, for example, when the source code for a component is simply not available).
An alternative solution accepts the fact that a component or set of components may not speak the common protocol, and provides proxy services (also known as “protocol wrappers” or “gateways”) between the communication protocols. Under this scheme, the communication is first sent to the proxy service, which translates it into the non-standard protocol and forwards it on to the component. This technique typically gives rise to performance issues (due to message forwarding), resource issues (due to multiple in-memory message representations), reliability issues (due to the introduction of new messages and failure conditions), as well as security, location, configuration, and consistency problems (due to the disjoint mechanisms used by different communication protocols).
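The proxy scheme described above can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the `LegacyComponent` and `Gateway` classes, the JSON "standard" protocol, and the `key=value;` legacy wire format are assumptions, not part of any system named in this document. The sketch shows the extra forwarding hop and the second in-memory message representation that the proxy introduces.

```python
import json


class LegacyComponent:
    """Hypothetical component that only understands a 'key=value;...' text protocol."""

    def handle(self, raw: str) -> str:
        fields = dict(pair.split("=", 1) for pair in raw.split(";") if pair)
        if fields.get("op") == "add":
            return "result=" + str(int(fields["a"]) + int(fields["b"]))
        return "error=unknown-op"


class Gateway:
    """Hypothetical proxy service: accepts the assumed 'standard' JSON protocol,
    translates each request, and forwards it to the legacy component."""

    def __init__(self, target):
        self.target = target
        self.forwarded = 0  # counts the extra hop added to every request

    def handle(self, message: str) -> str:
        request = json.loads(message)  # first in-memory representation
        # Second in-memory representation of the same message, in the legacy format.
        raw = ";".join(f"{key}={value}" for key, value in request.items())
        self.forwarded += 1
        reply = self.target.handle(raw)
        # Translate the legacy reply back into the standard protocol.
        key, _, value = reply.partition("=")
        return json.dumps({key: value})


gateway = Gateway(LegacyComponent())
reply = gateway.handle(json.dumps({"op": "add", "a": 2, "b": 3}))
print(reply)
```

Even in this toy form, the gateway must know both wire formats, and a failure can now occur in the proxy itself as well as in either endpoint.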
It is tempting to think that this problem is merely a temporary condition caused by the recent explosion in the number of protocols (and that things will stabilize soon) or that the problem is just an artifact of poor design in legacy components (and won't be so bad next time). However, the problem of protocol evolution is intrinsic to building practical distributed systems. There will always be “better” protocols, domain-specific motivations to use them, and “legacy” components and protocols that must be supported. Indeed, nearly any real distributed system will have at least three models: those of “legacy” components, the current standard, and the emerging “latest and greatest”. The contents of these categories shift with time; today's applications and standard protocols will be tomorrow's legacy. Systems and components evolve along multiple dimensions:
Evolution of Component Interface
A component's interface may evolve to support new features. The danger is that this evolution will require all clients of the component to be updated. For reasons cited in the previous section, there must be a mechanism whereby old clients can continue to use the old interface, yet new clients can take advantage of the new features.
Evolution of Component Implementation
A component's implementation may evolve independently of the rest of the system. This may include the relocation of a component to a new hardware platform or the reimplementation of a component in a new programming language. There must be a mechanism which insulates other components from these changes in the implementation yet maintains the semantic guarantees promised by the interface.
Evolution of Inter-Communication Protocol
It is generally intractable to choose a single communication protocol for all components in the system, as new protocols are attractive due to their performance, availability, security, and suitability to the application's needs. Each communication protocol has its own model of component location, component binding, and often a model of data/parameter representation. It must be possible to change or add communication protocols without rendering existing components inaccessible.
Evolution of Inter-Component Communication Model/API
The programming models used to perform inter-component communication continue to evolve. Existing models change over time to support new data types which can be communicated and new communication semantics. At the same time, new programming models are frequently developed which are attractive due to their applicability to a particular application, their familiarity to programmers on a particular platform, or merely current fashion or corporate favor. It must be possible to implement components to a new model or a new version of an existing model without limiting the choice of protocols to be used underneath and without sacrificing interoperability with existing components written to other models or other versions of the same model (even when those components will reside in the same address space).
Distributed Object Systems such as CORBA and COM, like the Remote Procedure Call models which preceded them, address the issue of protocol evolution to a degree by separating the programming model from the details of the underlying protocol which is used to implement the communication. These systems do so by introducing a declarative Interface Definition Language (IDL) and a compiler which generates code that transforms (or allows the transformation of) a protocol-neutral Application Programming Interface (API) to the particular protocol supported by the model. As the protocol changes (or new protocols become available), the compiler can be updated to generate new protocol adapters to track the protocol evolution.
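The separation between a protocol-neutral API and generated protocol adapters can be sketched as follows. This is only an illustration under stated assumptions: the dictionary `CALCULATOR_IDL` stands in for a real IDL file, `generate_stub` stands in for an IDL compiler, and `JsonProtocol` and `TextProtocol` are toy wire formats invented here. None of these names belong to CORBA, COM, or any RPC toolkit.

```python
import json

# Hypothetical declarative interface description standing in for an IDL file.
CALCULATOR_IDL = {"name": "Calculator", "methods": {"add": ["a", "b"]}}


class JsonProtocol:
    """Toy wire format #1 (assumed for illustration)."""

    def encode_call(self, name, args):
        return json.dumps({"method": name, "args": args}).encode()

    def decode_call(self, data):
        obj = json.loads(data)
        return obj["method"], obj["args"]

    def encode_result(self, value):
        return json.dumps(value).encode()

    def decode_result(self, data):
        return json.loads(data)


class TextProtocol:
    """Toy wire format #2: 'add a=2 b=3'. Handles integer values only."""

    def encode_call(self, name, args):
        return " ".join([name] + [f"{k}={v}" for k, v in args.items()]).encode()

    def decode_call(self, data):
        parts = data.decode().split()
        return parts[0], {k: int(v) for k, v in (p.split("=") for p in parts[1:])}

    def encode_result(self, value):
        return str(value).encode()

    def decode_result(self, data):
        return int(data.decode())


def generate_stub(idl, protocol, transport):
    """Plays the role of the IDL compiler: builds a client stub whose methods
    marshal each call with `protocol` and send it through `transport`."""

    def make_method(name, params):
        def method(self, *values):
            if len(values) != len(params):
                raise TypeError(f"{name} expects {len(params)} arguments")
            request = protocol.encode_call(name, dict(zip(params, values)))
            return protocol.decode_result(transport(request))
        return method

    methods = {n: make_method(n, p) for n, p in idl["methods"].items()}
    return type(idl["name"] + "Stub", (object,), methods)()


def make_server(protocol, implementation):
    """Server-side adapter: unmarshal the call, invoke it, marshal the result."""

    def transport(request):
        name, args = protocol.decode_call(request)
        return protocol.encode_result(getattr(implementation, name)(**args))
    return transport


class CalculatorImpl:
    def add(self, a, b):
        return a + b


# The client code is identical for both protocols; only the adapter differs.
results = []
for protocol in (JsonProtocol(), TextProtocol()):
    stub = generate_stub(CALCULATOR_IDL, protocol, make_server(protocol, CalculatorImpl()))
    results.append(stub.add(2, 3))
print(results)
```

Note that each generated stub is still bound to exactly one protocol, which is the limitation the following paragraphs discuss.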
A side benefit of IDL is that it forces each component's interface to be documented and decouples a component's interface from its implementation. This allows an implementation to be updated without affecting the programming API of clients and simplifies the parallel development of multiple components.
In CORBA and COM, interfaces are reflective—a client can ask an implementation object whether it supports a particular interface. Using this dynamic mechanism, a client can be insulated from interface (as well as implementation) changes as clients familiar with a new interface (or a new version of an interface) ask about it, while old clients restrict themselves to using the old interface.
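The interface-query mechanism described above can be sketched in a few lines. The sketch is loosely modeled on the idea only; `Component`, `query_interface`, and the interface names `ISpellCheck` and `ISpellCheck2` are hypothetical and are not the actual COM or CORBA APIs.

```python
class Component:
    """Hypothetical component that can be asked which interfaces it supports."""

    def __init__(self):
        self._interfaces = {}

    def expose(self, interface_name, implementation):
        self._interfaces[interface_name] = implementation

    def query_interface(self, interface_name):
        # Returns None when the interface is not supported.
        return self._interfaces.get(interface_name)


class Dictionary:
    """Implementation object backing both the old and the new interface."""

    WORDS = {"protocol", "proxy", "gateway"}

    def check(self, word):  # old interface: ISpellCheck
        return word.lower() in self.WORDS

    def suggest(self, prefix):  # new interface: ISpellCheck2
        return sorted(w for w in self.WORDS if w.startswith(prefix.lower()))


def old_client(component, word):
    """Knows only the old interface and never asks for anything else."""
    return component.query_interface("ISpellCheck").check(word)


def new_client(component, word):
    """Asks for the new interface first and falls back to the old one."""
    suggester = component.query_interface("ISpellCheck2")
    if suggester is not None:
        return suggester.suggest(word)
    checker = component.query_interface("ISpellCheck")
    return [word] if checker.check(word) else []


old_component = Component()
old_component.expose("ISpellCheck", Dictionary())

new_component = Component()
new_component.expose("ISpellCheck", Dictionary())
new_component.expose("ISpellCheck2", Dictionary())

print(old_client(new_component, "Proxy"))
print(new_client(new_component, "pro"))
print(new_client(old_component, "proxy"))
```

Old clients keep working against the upgraded component, and new clients degrade gracefully against a component that has not yet been upgraded.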
While such systems abstract the choice of communication protocol, none addresses the situation in which a system needs to be composed of components that cannot all share a single protocol or a single version of a protocol. CORBA and COM have each defined a protocol that all components “will eventually adopt”. For reasons cited above, that solution is merely the addition of yet another (incompatible) protocol to the mix—a protocol which will evolve, and in fact is already evolving.
For all of these reasons, having a single protocol in a long-lived, large-scale system is unrealistic. There will be evolution of protocols (IIOP 1.0, 2.0, 3.0) and simultaneous and incompatible protocols (MS-RPC, DCOM, SOAP/.NET) in these systems. One issue is the different encoding rules between the protocols (this is addressed in U.S. Pat. Nos. 6,282,581 and 6,408,342, issued to Moore et al.). A second issue involves handling the differences in discovery, registration, and rendezvous mechanisms.
One approach to handling these differences is disclosed in U.S. Pat. No. 6,349,343 (the '343 patent) issued to Foody et al. The '343 patent discloses that a bridge is created between DSOM, ORBIX, and COM by the introduction of “proxy objects”. The proxy objects are created at the application level and are knowledgeable about conversions between the various protocols. One drawback to the use of the “proxy objects” described in the '343 patent is that they are created as specific interface application level proxies. In addition, administrative tools are needed for registering the “proxy objects”, thus requiring relatively complicated configurations to enable their implementations.
Another approach is the VJ++/COM approach disclosed in chapters 14 and 15 of “Inside Visual J++” by Karanjit Siyan, published by New Riders. Siyan discloses the use of a virtual machine that has knowledge of how to dispatch calls from Visual J++ to COM. Siyan also discloses that the virtual machine has additional knowledge of how Visual J++ objects can be registered as COM objects through command line activation. The virtual machine of Siyan requires the use of specialized opcodes for the Java virtual machine. In addition, only a single fixed object system (i.e., COM) is supported in the VJ++/COM approach disclosed in Siyan, which substantially limits the applicability of that approach.