1. Field of the Invention
Embodiments of the invention relate to distributed systems. More specifically, embodiments of the invention relate to techniques for managing availability of a component having a closed address space.
2. Description of the Related Art
Distributed systems include software components running on different computers and interacting with each other via a network, frequently as part of a larger distributed application. Examples of distributed applications include scalable information retrieval and/or document processing applications that exploit machine-level parallelism. In some cases, a distributed application may include a component that is unable to participate in an availability protocol of the distributed application. Such a component may be said to have a “closed address space” (i.e., an address space that is closed to the distributed application). In other words, a component having a closed address space is a component that, while deployed as a standalone process in the distributed system, may not be extended or modified to provide and/or manage availability of the component using the availability protocol of the distributed application.
A typical example of a component having a closed address space is a third-party component deployed as part of a larger distributed application. For instance, the component may provide a service for converting documents or extracting information from documents. However, the component may be closed to modification and/or extension by a developer of the larger distributed application. For example, a vendor of the component may not provide any source code, software development kit, or integration hooks for the component that would allow the developer to modify or extend the component to provide and manage availability of (a runtime instance of) the component within the framework of the availability protocol of the larger distributed application. Put another way, the developer of the distributed application may be unable to modify or extend the component to participate in the availability protocol of the distributed application.
Often, the availability of the component executing in the closed address space affects the availability of the larger distributed system. For example, an unresponsive component may cause the larger distributed system to behave less responsively and/or less reliably. Further, because the component executes in the closed address space, the distributed system may be unable to determine whether the component is hung. Consequently, the distributed system may be unable to determine whether a performance issue is caused by a dead or hung component, a hardware failure, or some other cause. Because the distributed system may not take into account or otherwise manage the availability of the component, reliability of the distributed system may suffer.
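By way of illustration only, the following minimal Python sketch suggests why an outside observer may be unable to tell that a closed component is hung. It is not taken from any embodiment. The child process is a hypothetical stand-in for a third-party component that has stopped doing useful work, and the names os_reports_alive and run_demo are assumptions introduced for this example. The only signal available without cooperation from the component is whether the operating system still lists the process as running, and that signal remains positive for a hung process.

```python
import subprocess
import sys
import time


def os_reports_alive(proc):
    """Return True if the OS still lists the child process as running."""
    # Popen.poll() returns None while the process has not exited.
    return proc.poll() is None


def run_demo():
    """Start a stand-in 'hung' component and report what the OS can see."""
    # Hypothetical stand-in for a closed component: it starts and then
    # sleeps, doing no useful work and answering no requests.
    hung = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    try:
        time.sleep(0.2)  # give the child a moment to start
        # No protocol-level health signal is available from the component,
        # so process existence is the only thing that can be checked.
        return os_reports_alive(hung)
    finally:
        # Clean up the stand-in process so the sketch leaves nothing behind.
        hung.kill()
        hung.wait()


if __name__ == "__main__":
    print("OS reports component alive:", run_demo())
```

In this sketch the process-level check reports the stand-in component as alive even though it performs no work, which mirrors the situation described above in which a hung component cannot be distinguished from a healthy one.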