Examples of modern distributed computing systems include net servers and larger enterprise systems. These and other distributed computing systems often involve databases that contain bundles of data, which can be thought of and referred to as “documents,” and which may be queried and manipulated, for example by using a structured query language. In one example of a distributed computing system, a computational “node” may be defined as including the virtual machines and hosts in a single cluster. A node of a distributed computing system therefore often has its resources co-located, such as at a single datacenter.
The number of documents in a typical distributed computing system is often very large. For efficiency, scalability, fault tolerance/redundancy, policy, and/or other purposes, documents of a distributed computing system are often replicated and stored across multiple computational nodes of the distributed computing system in a logical resource pool that can be accessed by any node. Thus, while a document may not be stored locally at a given node, it may be stored at another node that is part of the logical resource pool. Document replication often occurs as a result of scaling, attempts to improve efficiency, or attempts to improve fault tolerance/redundancy.
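As an illustration only, the following minimal Python sketch models a logical resource pool in which a read at one node falls back to a replica held by another node. The `Node` and `ResourcePool` classes, their methods, and the node names are hypothetical and are not drawn from any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical computational node with its own local document store."""
    name: str
    local_docs: dict = field(default_factory=dict)


class ResourcePool:
    """A hypothetical logical pool spanning every node in the system."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def replicate(self, doc_id, body, node_names):
        """Store a copy of the document on each named node."""
        for node in self.nodes:
            if node.name in node_names:
                node.local_docs[doc_id] = body

    def read(self, doc_id, at_node):
        """Return (body, name of serving node), preferring the local copy."""
        if doc_id in at_node.local_docs:
            return at_node.local_docs[doc_id], at_node.name
        # Not stored locally: search the other nodes in the pool.
        for node in self.nodes:
            if doc_id in node.local_docs:
                return node.local_docs[doc_id], node.name
        raise KeyError(doc_id)


# Assumed example topology: three nodes, one document replicated on two of them.
oregon, texas, ohio = Node("oregon"), Node("texas"), Node("ohio")
pool = ResourcePool([oregon, texas, ohio])
pool.replicate("doc-1", "hello", {"oregon", "ohio"})
```

In this sketch, `pool.read("doc-1", texas)` succeeds even though the node named `texas` holds no local copy, because the pool locates a replica on another node.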
For example, if electricity costs are lower at a datacenter which runs on hydroelectric power, many documents may be replicated on a node of this datacenter in order to take advantage of the efficiency of operating in a low cost power environment. If a first node is located at a datacenter in an earthquake prone region, many or all documents stored at that first node may be replicated at other nodes which are located in regions not prone to earthquakes; this provides both redundancy and fault tolerance. Additionally, when a distributed computing system is scaled up in hardware size, the overall number of documents may be allocated among all the nodes in a manner which balances the load experienced by the distributed computing system. In some instances, documents may be replicated at multiple nodes of a distributed computing system simply in order to maintain some redundancy. In some instances, a data replication policy may dictate that all documents are required to be replicated at a second location that is a certain distance (e.g., 500 miles) away from where a document is initially stored. It should also be noted that replicating data within a node or on multiple nodes can help improve a user experience by reducing latency and better facilitating access by multiple users.
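By way of a hedged example, a distance-based replication policy of the kind described above might be checked as in the following Python sketch. The function names, the site names and coordinates, and the use of the haversine great-circle formula are all assumptions made for illustration; an actual policy could measure distance differently.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


def great_circle_miles(a, b):
    """Approximate distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def eligible_replica_sites(origin, sites, min_miles=500.0):
    """Names of sites at least min_miles from where the document was first stored."""
    return [name for name, coords in sites.items()
            if great_circle_miles(origin, coords) >= min_miles]


# Hypothetical datacenter locations (approximate city coordinates).
SITES = {
    "san_francisco": (37.77, -122.42),
    "los_angeles": (34.05, -118.24),
    "new_york": (40.71, -74.01),
}
```

With these assumed coordinates, a document first stored at `san_francisco` could satisfy a 500 mile policy by being replicated at `new_york`, but not at `los_angeles`, which lies roughly 350 miles away.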