A distributed computing system is a group of processing units—frequently called nodes—which work together to present a unified system to a user. These systems can range from relatively small and simple—such as multi-component single systems—to world-wide and complex, such as some grid computing systems. These systems are usually deployed to improve the speed and/or availability of computing services over that provided by a single processing unit alone. Alternatively, distributed computing systems can be used to achieve desired levels of speed and availability within cost constraints.
Decision making functions within a distributed computing system vary, but they can generally be categorized as one of two types: centralized or decentralized. Centralized decision making functions have a designated center point by which and through which decisions for the entire system are made. However, centralized decision making procedures have the drawback that it is difficult for a distributed system to deal with the loss of the node which implements the decision making function.
One response is to decentralize the decision making functions, allowing more than one node to coordinate activity. Simple implementations of this idea provide for redundant coordinating nodes. Various routines have been developed to allow a group of nodes to cooperate for the purpose of selecting a new decision making node. Other, fully independent decision systems build decision models into each node, allowing each node to come to its own best decision about what to do.
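As a rough illustration of how a group of nodes might select a new decision making node, the following minimal sketch applies a "highest identifier wins" rule, similar in spirit to bully-style election routines. It is not a method described in this document. The function name `elect_coordinator` and the integer node identifiers are hypothetical, and the sketch assumes every surviving node sees the same set of live nodes, which a real system would have to establish through messaging and failure detection.

```python
from typing import Iterable, Optional


def elect_coordinator(live_node_ids: Iterable[int]) -> Optional[int]:
    """Return the highest ID among the nodes still reachable.

    Assumption: all nodes agree on the membership list, so each node
    running this function independently arrives at the same answer.
    Returns None when no nodes remain.
    """
    return max(live_node_ids, default=None)


# Hypothetical four-node system; node 4 starts as coordinator.
nodes = {1, 2, 3, 4}
coordinator = elect_coordinator(nodes)

# Simulate loss of the coordinating node, then re-run the election.
nodes.discard(coordinator)
new_coordinator = elect_coordinator(nodes)
```

Because each node evaluates the same deterministic rule over the same membership set, no central arbiter is needed. The difficulty in practice lies in keeping that membership view consistent across nodes.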
In practice, decentralized decision making functions have significant drawbacks. The first problem is that systems which use decentralized decision making are typically harder to build, harder to debug, and harder to employ. The redundancy in hardware and software required by this approach can reduce system performance and raise system costs. Further, decentralized systems are susceptible to inconsistent decisions between nodes due to differences in information. For example, inconsistent decisions are a common problem in routers. Since each node presumably possesses a valid routing table, the routing tables must be consistent to achieve the desired result. However, changing circumstances can lead to local routing table modifications; these modifications can lead to inconsistent decisions—“routing loops”—which forward packets in an endless circle. Routing loops have historically plagued routing, and their avoidance is a major design goal of routing protocols. Similar issues arise in other decentralized decision making systems.
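The routing loop described above can be sketched with a small simulation. The per-node routing tables, node names, and the `trace_route` helper are all hypothetical and chosen only for illustration. The sketch assumes each node forwards a packet purely according to its own table.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: node -> {destination: next hop}
RoutingTables = Dict[str, Dict[str, str]]


def trace_route(tables: RoutingTables, source: str,
                destination: str) -> Tuple[List[str], str]:
    """Follow next-hop entries from source toward destination.

    Returns the path taken and one of "delivered", "loop", or
    "unreachable". A loop is reported as soon as a node repeats.
    """
    path = [source]
    visited = {source}
    current = source
    while current != destination:
        next_hop = tables.get(current, {}).get(destination)
        if next_hop is None:
            return path, "unreachable"
        path.append(next_hop)
        if next_hop in visited:
            return path, "loop"
        visited.add(next_hop)
        current = next_hop
    return path, "delivered"


# Consistent tables: A -> B -> C -> D.
consistent = {"A": {"D": "B"}, "B": {"D": "C"}, "C": {"D": "D"}}

# C makes a local change (routes D via A) that A and B never learn about.
inconsistent = {"A": {"D": "B"}, "B": {"D": "C"}, "C": {"D": "A"}}

ok_path, ok_status = trace_route(consistent, "A", "D")
bad_path, bad_status = trace_route(inconsistent, "A", "D")
```

Each table in the second case is locally plausible, yet together they send the packet from A to B to C and back to A. This is the kind of inconsistency between nodes that routing protocols are designed to avoid.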