The invention is based on a priority application EP 08 290 624.9 which is hereby incorporated by reference.
The present invention relates to a method of redundant data storage in a peer-to-peer overlay network, and a network node and a computer program product to execute said method.
Peer-to-peer (=P2P) overlay networks are used for a wide range of applications such as VoIP (e.g., Skype) or file-sharing (e.g., eMule) (VoIP=Voice over Internet Protocol). Features like high scalability, self-organisation and fault tolerance are achieved by a distributed architecture and data storage on collectively provided and used resources. The P2P network consists of nodes acting as peers, i.e., both as clients and as servers of the network. In the following description, the terms “node” and “peer” are used interchangeably. Each node of the network maintains one or more logical links to other nodes; these links are established according to an overlay algorithm and are used for message transmission.
State-of-the-art P2P networks build a logical topology structure based on overlay-specific algorithms that are agnostic of the underlying network infrastructure. Each node maintains one or more links to other nodes that are used for message routing in a broadcast manner (unstructured overlays, e.g. Gnutella) or in an ID-based manner (structured overlays using a Distributed Hash Table (=DHT), e.g. Chord) (ID=identification/identifier). Some systems use hybrid architectures with unstructured groups of peers, and these groups are then structured in a larger topology (e.g. Skype). Advanced P2P networks implement a distributed database (DHT) that requires replication mechanisms to ensure that the stored data persists even if a node leaves the network ungracefully.
P2P networks are totally decentralised. Peers participating in the overlay may leave the network ungracefully and at random times. Redundant storage by data replication on several nodes ensures that the data is still available in the network, even after the peer that was responsible for the data has quit the overlay. The redundancy mechanisms are based on overlay specifics, e.g. neighbourhood relationships between peers that are close together in the peer identifier space. High availability of the data is achieved by publishing the data on one node that is responsible for the data entry and on one or more nodes that keep a backup entry. Higher availability is achieved by storing multiple replicas on multiple neighbour nodes. In a peer-to-peer network based on the Chord algorithm, each peer node replicates the set of resources for which it is responsible on the peer nodes that neighbour it in terms of peer ID.
DHTs usually store key/value pairs of data, where the keys are mapped to the ID space of the nodes. In Chord, each node is responsible for storing those keys that are equal to or smaller than its own ID and greater than its predecessor's ID.
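The responsibility rule stated above can be illustrated by a minimal sketch. The function name `responsible_node`, the 8-bit ID space of 256 values and the wrap-around of keys above the highest node ID to the lowest node are assumptions made for illustration only; they are not taken from the cited prior art.

```python
from bisect import bisect_left


def responsible_node(key, node_ids, id_space=256):
    """Return the ID of the node responsible for `key`.

    A node is responsible for every key that is greater than its
    predecessor's ID and smaller than or equal to its own ID.
    `id_space` (assumed here to be 256) is the size of the circular
    ID space; keys above the highest node ID wrap to the lowest node.
    """
    ring = sorted(node_ids)
    key %= id_space
    idx = bisect_left(ring, key)   # index of the first node ID >= key
    return ring[idx % len(ring)]   # wrap around past the last node


# Node IDs chosen for illustration only.
nodes = [0, 20, 50, 87, 112, 140, 179, 200, 211, 223, 240]
print(responsible_node(203, nodes))
```

With these node IDs, the key 203 falls between the IDs 200 and 211 and is therefore assigned to the node with the ID 211.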
FIG. 1a shows a prior art structured P2P overlay network N comprising a plurality of nodes 0, 20, 50, 87, 112, 140, 179, 200, 211, 223, 240. In particular, FIG. 1a illustrates a Chord ring N. The reference signs 0, 20, 50, 87, 112, 140, 179, 200, 211, 223, 240 of the nodes are meant to represent also the node IDs of the nodes 0, 20, 50, 87, 112, 140, 179, 200, 211, 223, 240. In Chord the nodes maintain direct connections to their predecessor and successor nodes, which results in a ring topology. The node with the node ID 211 is responsible for all keys in the interval {201, 202, . . . , 211}. The key/value pair with the key ID 203 is thus stored on the node 211, and the replicas of the data of node 211 are stored on the successor node 223, as indicated in FIG. 1a. If the node 211 leaves the network N, as shown in FIG. 1b, the node 223 becomes responsible also for the ID space of node 211, including the key 203. Thus the data of node 211 is maintained in the network N.
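The behaviour illustrated in FIGS. 1a and 1b can be sketched as follows. This is a simplified model under stated assumptions, not an implementation of Chord: the class `ChordRing` and its methods are hypothetical names, exactly one replica is kept on the direct successor, keys are assumed to lie within the ID space, the ring is assumed to contain at least three nodes, and re-replication after a leave is performed immediately.

```python
from bisect import bisect_left


class ChordRing:
    """Toy model of a Chord ring with one replica on the successor."""

    def __init__(self, node_ids):
        self.nodes = sorted(node_ids)
        self.primary = {n: {} for n in self.nodes}  # entries a node is responsible for
        self.replica = {n: {} for n in self.nodes}  # backups of the predecessor's entries

    def owner(self, key):
        # First node whose ID is >= key, wrapping around the ring.
        idx = bisect_left(self.nodes, key)
        return self.nodes[idx % len(self.nodes)]

    def successor(self, node_id):
        idx = self.nodes.index(node_id)
        return self.nodes[(idx + 1) % len(self.nodes)]

    def predecessor(self, node_id):
        idx = self.nodes.index(node_id)
        return self.nodes[idx - 1]

    def put(self, key, value):
        node = self.owner(key)
        self.primary[node][key] = value
        self.replica[self.successor(node)][key] = value

    def get(self, key):
        return self.primary[self.owner(key)].get(key)

    def leave(self, node_id):
        """Ungraceful leave: the node's own storage is lost."""
        succ = self.successor(node_id)
        # The successor promotes the backups it holds for the lost node.
        self.primary[succ].update(self.replica[succ])
        self.replica[succ].clear()
        self.nodes.remove(node_id)
        del self.primary[node_id]
        del self.replica[node_id]
        # Restore redundancy: the successor now backs up its new
        # predecessor, and its own successor backs up the merged data.
        self.replica[succ].update(self.primary[self.predecessor(succ)])
        self.replica[self.successor(succ)].update(self.primary[succ])


ring = ChordRing([0, 20, 50, 87, 112, 140, 179, 200, 211, 223, 240])
ring.put(203, "value")
ring.leave(211)
print(ring.owner(203), ring.get(203))
```

In this sketch the entry with the key 203 is first held by node 211 with a backup on node 223; after node 211 has left, node 223 answers for the key 203 from the promoted backup.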
WO 2007/138044 A1 relates to a P2P communication device, e.g., a PDA, a desktop computer, or a laptop, comprising a memory in which a peer-to-peer identification indication of the P2P communication device is stored, said indication comprising a distinct, non-modifiable part and a modifiable part (PDA=Personal Digital Assistant). A user of the P2P communication device is entitled to freely choose the modifiable part but is not able to change the distinct, non-modifiable part which is definitely allocated to the user. For instance, a user is allowed to select the value of a byte to be added to the end of a pre-determined 9-byte UUID in order to form a complete 10-byte UUID (=Unique User Identifier). This is to ensure that two or more P2P communication devices associated with a single user are located close to each other in a P2P network. An advantage of this neighbouring location is that keep-alive messages and information on changes of a neighbour list can be exchanged quickly between the P2P communication devices without burdening the IP network underlying the P2P network (IP=Internet Protocol). A disadvantage of this neighbouring location is that it is very likely that the P2P communication devices are connected to the P2P network via the same network entity, e.g., a router. If this router fails, all the P2P communication devices are disconnected at the same time.
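The identifier scheme described above can be illustrated by a minimal sketch. The function name `make_peer_id`, the example prefix and the interpretation of the 10-byte identifier as a big-endian integer are assumptions made for illustration; the cited document may define these details differently.

```python
def make_peer_id(fixed_prefix: bytes, user_byte: int) -> bytes:
    """Append one user-selected byte to a pre-determined 9-byte prefix."""
    if len(fixed_prefix) != 9:
        raise ValueError("the non-modifiable part must be 9 bytes long")
    if not 0 <= user_byte <= 255:
        raise ValueError("the modifiable part must fit into one byte")
    return fixed_prefix + bytes([user_byte])


# Hypothetical prefix allocated to one user.
prefix = bytes([1, 2, 3, 4, 5, 6, 7, 8, 9])
phone = make_peer_id(prefix, 1)
laptop = make_peer_id(prefix, 200)

# Read as big-endian integers (an assumption), the two identifiers
# differ only in the lowest byte, i.e. by less than 256.
distance = abs(int.from_bytes(phone, "big") - int.from_bytes(laptop, "big"))
print(len(phone), distance)
```

Because only the last byte varies, all devices of one user fall within a span of 256 adjacent identifiers under this interpretation, which is what places them close to each other in the overlay.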