Development of distributed information processing systems involves several challenges. A developer is concerned not only with the logic for processing data to achieve desired results, but also with the coordination of the distributed components within the system. An example of a coordination problem is the synchronization of components that are executing in parallel.
An example of such a distributed system is a processing system composed of multiple nodes, in which each node receives, indexes and stores data about ongoing streams of documents, such as within an enterprise. Each node in the system processes input data streams using a set of independent, asynchronous processing components connected in a manner represented by a directed graph. The set of processing components defines the processing to be performed on the input streams and provides output data, such as indexing information, for storage along with the input data. Such a system is described, for example, in U.S. Patent Publications 2010/0005147 and 2012/0096475, which are hereby incorporated by reference.
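By way of illustration only, the arrangement of independent, asynchronous processing components connected as a directed graph can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation described in the referenced publications. The names `Component`, `process_stream`, `STOP`, and the tokenizing, indexing and storing functions are hypothetical and chosen only for this example, and the graph is assumed to be a simple fan-out in which each component has at most one upstream component.

```python
import asyncio

STOP = object()  # hypothetical end-of-stream marker passed along the graph


class Component:
    """Hypothetical asynchronous processing component (one graph node)."""

    def __init__(self, name, func):
        self.name = name
        self.func = func
        self.inbox = asyncio.Queue()   # items waiting to be processed
        self.downstream = []           # outgoing edges of the directed graph

    def connect(self, other):
        """Add a directed edge from this component to another."""
        self.downstream.append(other)

    async def run(self):
        """Process items independently of other components until STOP."""
        while True:
            item = await self.inbox.get()
            if item is STOP:
                # Propagate end-of-stream to every downstream component.
                for target in self.downstream:
                    await target.inbox.put(STOP)
                return
            result = self.func(item)
            for target in self.downstream:
                await target.inbox.put(result)


async def process_stream(documents):
    """Run (doc_id, text) pairs through a small illustrative graph."""
    index = {}   # term -> set of document ids (indexing information)
    stored = {}  # document id -> original text (stored input data)

    def tokenize(doc):
        doc_id, text = doc
        return doc_id, text.lower().split()

    def add_to_index(tokens):
        doc_id, terms = tokens
        for term in terms:
            index.setdefault(term, set()).add(doc_id)

    def store(doc):
        doc_id, text = doc
        stored[doc_id] = text

    # Graph: source -> tokenizer -> indexer, and source -> storer.
    source = Component("source", lambda doc: doc)
    tokenizer = Component("tokenize", tokenize)
    indexer = Component("index", add_to_index)
    storer = Component("store", store)
    source.connect(tokenizer)
    source.connect(storer)
    tokenizer.connect(indexer)

    tasks = [asyncio.create_task(c.run())
             for c in (source, tokenizer, indexer, storer)]
    for doc in documents:
        await source.inbox.put(doc)
    await source.inbox.put(STOP)
    await asyncio.gather(*tasks)
    return index, stored
```

In this sketch, each component reads from its own queue and forwards results along its outgoing edges, so no component waits on another except through the data it receives. A call such as `asyncio.run(process_stream([(1, "Alpha beta"), (2, "beta gamma")]))` returns both the indexing information and the stored input documents.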
Development also involves making frequent changes to a system. In particular, a developer makes changes, runs tests, identifies errors or processing improvements to address, and then makes more changes. In this iterative process, valuable time can be lost if updating the system with the changes takes any substantial amount of time. For example, if a system needs to be shut down, modified, and then rebooted and restarted with a new configuration, substantial time can be lost and ongoing services, such as servicing queries, can be interrupted. If errors in the modifications prevent a stable configuration from emerging, the system might need to be reset to its original state, as it was before any changes were made. It would be desirable instead to be able to update such a system while it is running.