The storing of information in a network has traditionally followed the client-server model, i.e. the information is stored centrally in servers which are accessible by a number of clients. Typical examples are web servers that are accessible over the Internet from clients (home computers, mobile devices, etc.) located all over the world. The client-server model has increasingly been challenged by the peer-to-peer (P2P) model. In contrast to the client-server model, the peer-to-peer model makes no distinction between clients and servers in the network. A node (also called a peer) can be both a client and a server at the same time, and can both access information stored in other nodes and provide information to other nodes. A network comprising these nodes/peers is called a peer-to-peer (P2P) network. P2P networks are usually overlay networks on top of an existing IP network such as the Internet. A well-known example of a P2P network is the set of nodes (such as personal computers) connected to each other using the P2P protocol BitTorrent.
One advantage of P2P networks is that information (here also called objects) can be distributed. The information is not located in a single point of failure such as the server in a client-server network. P2P networks are also more scalable than client-server networks. On the other hand, a search for an object in a client-server network is relatively easy, whereas a search for an object in a P2P network is more complex. The problem is to find out in which node the requested object is located. One technique is flooding, that is, sending a search message to all nodes in the network. This is a simple technique, but it is limited to very small networks. For larger networks the traffic load generated by the search messages becomes very large. To overcome this, the BitTorrent network comprises a centralized server called a BitTorrent tracker. This tracker keeps information about where (in which nodes) the objects are located. However, if only one tracker is used, it again becomes a single point of failure. This means that the tracker needs to be very reliable and have a large processing capacity to avoid becoming overloaded when the network grows.
In order to design a ‘flat’ structured overlay network (without a centralized tracker), other techniques for locating information have to be used. One technique that has been suggested is to use key-based routing, also known as Distributed Hash Tables (DHT). A regular, non-distributed hash table is a data structure that associates keys with values. A key can for instance be a person's name and the corresponding value that person's contact address (e.g., email address or Session Initiation Protocol (SIP) Uniform Resource Identifier (URI)). The primary operation a hash table supports is lookup: given a key, the hash table finds the corresponding value. Distributed Hash Tables (DHTs) provide a lookup service similar to a hash table. However, unlike regular hash tables, DHTs are decentralized distributed systems. In a DHT, the responsibility for maintaining mappings from keys to values is distributed among the nodes participating in the system. This is achieved by partitioning the key space among the participating nodes. The nodes are connected together by the overlay network, which allows the nodes to find the owner of any given key in the key space. The partitioning of the key space can for instance use a ring topology. A DHT ring topology is for example disclosed in the paper “Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications” by Ion Stoica et al., published in 2001 in relation to the SIGCOMM '01 conference. Each node is organized on this ring and is in charge of a set of keys.
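By way of illustration only, the partitioning of a key space over a ring can be sketched as follows. This is a simplified sketch, not the Chord protocol itself: it assumes a 32-bit identifier space, SHA-1 as the hash function, and a node that knows every other node on the ring (so no routing is modelled). The names `ring_id`, `Ring` and `owner` are hypothetical and chosen for this example.

```python
import hashlib
from bisect import bisect_left

M = 32  # number of identifier bits (assumption made for this sketch)


def ring_id(name: str) -> int:
    """Map a node name or a key onto the identifier ring [0, 2**M)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


class Ring:
    """Toy ring in which every node is visible; routing is not modelled."""

    def __init__(self, node_names):
        # Place the nodes on the ring, ordered by their identifiers.
        self.nodes = sorted((ring_id(n), n) for n in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]

    def owner(self, key: str) -> str:
        """Return the node in charge of `key`: the first node whose
        identifier is equal to or follows the key's identifier,
        wrapping around at the end of the ring."""
        k = ring_id(key)
        idx = bisect_left(self.ids, k) % len(self.ids)
        return self.nodes[idx][1]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
responsible = ring.owner("alice")  # node that stores the value for key "alice"
```

In this sketch each node is in charge of the keys that fall between its predecessor's identifier and its own, which is one common way of assigning a set of keys to each node on a ring.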
The nodes in a P2P network are however not limited to storing contact information for certain persons etc. The nodes can also be service nodes (providing services to other nodes) and nodes that request services from other nodes. An example of a P2P overlay network with services is the Peer-to-Peer Session Initiation Protocol (P2PSIP) overlay network. This overlay network can for example offer a TURN (Traversal Using Relays around Network Address Translation) relay service, a voice mail service, a gateway location service, and a transcoding service.
If a DHT is to be used for locating service nodes, one approach is to use the name of the service as the key, or rather a hash of the service name, such as hash(‘voice mail service’).
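As a minimal sketch of this approach, the key can be derived from the service name and used to register and look up contact addresses. The sketch assumes SHA-1 as the hash function and uses a plain dictionary as a stand-in for the DHT; the names `service_key`, `register` and `lookup`, and the SIP URIs, are hypothetical.

```python
import hashlib

# Stand-in for the DHT: maps a key to the contact addresses stored under it.
dht = {}


def service_key(service_name: str) -> str:
    """Derive the DHT key from the service name (SHA-1 is an assumption)."""
    return hashlib.sha1(service_name.encode("utf-8")).hexdigest()


def register(service_name: str, contact: str) -> None:
    """Store a service provider's contact address under the service key."""
    dht.setdefault(service_key(service_name), []).append(contact)


def lookup(service_name: str) -> list:
    """Return the contact addresses of all providers of the named service."""
    return list(dht.get(service_key(service_name), []))


register("voice mail service", "sip:vm1@example.com")
register("voice mail service", "sip:vm2@example.com")
providers = lookup("voice mail service")  # both providers, stored under one key
```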
A problem with this solution is that it does not scale very well. The number of nodes providing a specific service is often very small compared to the total number of nodes in the overlay network. This means that a lot of service lookups (or service location requests) need to be distributed in the overlay network. In addition, all providers of a given service have to use the same key. Because the contact addresses of these service providers are all stored under that key in the same node, that node has to store a lot of contact addresses and will become overloaded.
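The concentration of contact addresses in a single node can be illustrated with a small simulation. The figures used (100 nodes, 20 providers of one service), the 32-bit identifier space, SHA-1, and the names `ring_id` and `owner` are assumptions made for this sketch only.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

BITS = 32  # identifier bits (assumption made for this sketch)


def ring_id(name: str) -> int:
    """Map a node name or a key onto the identifier ring [0, 2**BITS)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** BITS)


def owner(sorted_ids, key_id):
    """Identifier of the node in charge of key_id (its successor on the ring)."""
    idx = bisect_left(sorted_ids, key_id) % len(sorted_ids)
    return sorted_ids[idx]


# 100 nodes in the overlay, of which 20 provide the same service.
node_ids = sorted({ring_id(f"node-{i}") for i in range(100)})
key_id = ring_id("voice mail service")

stored = Counter()  # node identifier -> number of contact addresses held
for provider in range(20):
    # Every provider registers under the same key, hence at the same node.
    stored[owner(node_ids, key_id)] += 1

busiest_node, record_count = stored.most_common(1)[0]
idle_nodes = len(node_ids) - len(stored)  # nodes holding no record for the service
```

Because the key is identical for every provider, all 20 registrations, and likewise every lookup for the service, are directed to one node, while the remaining nodes hold none of the contact addresses for that service.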