Distributed systems, or distributed applications, function as a whole yet run as components on different computers. Components may have local state and may communicate and cooperate over a network to form a global state. In particular, applications on different hosts may share an object such that any of them may update it, while the object is still treated as a single object. In some cases, the distributed components have shared global data (e.g., the state of the shared object) that each uses by way of a local copy. Programming such distributed systems is difficult for several reasons. When network latency is high, communication between components may be delayed, making it difficult to keep any shared state consistent. Furthermore, logic to coordinate concurrent activity among components, when desired, is inherently difficult to program and, when executed, may affect global and local performance. Various types of expected failures (nodes, links, disks, etc.) create further complications. The desire to maintain consistency between components on various machines may need to be balanced against performance requirements.
There are programming models for collaborative distributed applications, such as those in which a group of users or clients collaborate to perform a shared task. With a centralized synchronization model, shared data may be maintained on a central server, but this can reduce responsiveness due to latency and contention at the server. On the other hand, caching or replicating shared data locally at each machine may require non-trivial techniques to keep the replicated data consistent. The inherent tradeoff between responsiveness and consistency can be seen at the two extremes of these approaches. At one extreme, a copy serializability model commits every action on shared data at the same time, and once committed, the changes are made visible to all machines at the same time. While reliable, copy serializability is inherently slow. At the other extreme, a replicated execution model provides each machine with its own local copy (and only this copy) of the shared data, and the local copies are updated independently. While performance is high, there is little or no consistency between the states of the various machines.
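The two extremes can be illustrated with a minimal Python sketch. It is a toy model only: the names `CentralServer`, `CentralClient`, `LocalReplica`, `run_centralized`, and `run_replicated` are hypothetical and introduced here for illustration, everything runs in one process, and network latency and contention are not modeled. The centralized case stands in for the strongly consistent extreme (every update goes through one authoritative copy), while the replicated case shows independent local copies drifting apart.

```python
class CentralServer:
    """Holds the single authoritative copy of the shared data (hypothetical)."""

    def __init__(self):
        self.items = []

    def append(self, value):
        # All updates are serialized here; in a real system each call
        # would cost a network round trip and could contend with others.
        self.items.append(value)


class CentralClient:
    """A machine that keeps no local copy and always asks the server."""

    def __init__(self, server):
        self.server = server

    def append(self, value):
        self.server.append(value)

    def read(self):
        return list(self.server.items)


class LocalReplica:
    """A machine with its own local copy, updated independently."""

    def __init__(self):
        self.items = []

    def append(self, value):
        # Fast: no communication at all, but other replicas never see it.
        self.items.append(value)

    def read(self):
        return list(self.items)


def run_centralized():
    server = CentralServer()
    a, b = CentralClient(server), CentralClient(server)
    a.append("a1")
    b.append("b1")
    return a.read(), b.read()


def run_replicated():
    a, b = LocalReplica(), LocalReplica()
    a.append("a1")
    b.append("b1")
    return a.read(), b.read()
```

In this sketch, `run_centralized()` returns the same view for both clients, whereas `run_replicated()` returns two different views, one per machine. The cost that the sketch leaves out is that every centralized update would wait on the server, which is where the loss of responsiveness comes from.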
Discussed below are techniques related to programming models that balance consistency and performance in distributed systems.