A distributed system typically includes a number of interconnected nodes. The nodes often include a processor and memory (e.g., Random Access Memory (RAM)). In addition, the nodes also include the necessary hardware and software to communicate with other nodes in the distributed system. The interconnected nodes may also communicate with each other using an overlay network.
Nodes belonging to the overlay network route messages between each other using the underlying networking infrastructure (e.g., Internet Protocol (IP) and Transmission Control Protocol (TCP), etc.). While the underlying network infrastructure has the information and capability to directly route messages between specific computers, overlay networks typically maintain only partial routing information and rely on successive forwarding through intermediate nodes in order to deliver a message to its final intended destination. One common use for overlay networks is to build distributed hash tables (DHT). In one implementation, each node in the overlay network is associated with a Globally Unique Identifier (GUID) and stores a part of the DHT.
When a node (i.e., the requesting node) requires a piece of data stored on a node (i.e., a target node) in the overlay network, the requesting node determines the GUID associated with target node, which contains the requested data. The requesting node then queries its routing table entries (i.e., the DHT entries) to find the node (i.e., an intermediate node) with the GUID closest to the target node's GUID. The request is then forwarded to the intermediate node. The intermediate node follows the same process, comparing the target node's GUID with the intermediate node's routing table entries. The aforementioned process is repeated until the target node is reached. Typically, the overlay network maintains enough information in the DHT to determine the appropriate intermediate node.
To store data in the aforementioned overlay network, the data is loaded onto a particular node (i.e., a target node) in the overlay network and is associated with a GUID. The node that stores the data subsequently publishes the presence of the GUID on the node. Another node (i.e., the root node) in the network stores the necessary information in its DHT to indicate that the data associated with the GUID is stored in the target node. It is important to note that any given node in the overlay network may operate as both a target node (i.e., stores data) and as a root node (i.e., maintains a DHT). Typically, a given root node is only responsible for a certain range of GUIDs.