As is known to the person skilled in the art, in a distributed system such as a peer-to-peer network, contents are disseminated between peers so that each of these peers has these contents at its disposal.
Dissemination can be performed by means of a multicast-like protocol, which uses trees, or by means of a fully decentralized and random protocol.
A (random) dissemination protocol requires each peer on which it runs to be supplied with random samples of other peers. These random samples can be determined, for instance, by a gossip-based peer sampling protocol. Such a protocol allows a neighborhood to be built without a central server and is therefore suitable for frequent updates of the neighborhood.
In gossip-based peer sampling, peers periodically choose random neighbors, exchange their respective knowledge about other peers, and merge their own knowledge with that received from the other peer in order to discard old knowledge (for instance, potentially dead peers) while minimizing the loss of information. Indeed, to avoid losing information, the information held by two peers must still be sufficiently different after they have exchanged neighbor information. Such gossip-based peer sampling protocols produce random graphs as overlays (i.e. logical networks). More information about gossip-based peer sampling protocols can be found in the article by Spyros Voulgaris et al., “Gossip-Based Peer Sampling”, ACM Transactions on Computer Systems, 2007, Volume 25.
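The exchange-and-merge step described above can be illustrated with a minimal sketch. It is not the protocol of the cited article: the names merge_views and gossip_round, the age-based view representation and the "keep the youngest entries" policy are assumptions chosen for brevity. In particular, this simple policy makes no attempt to keep the two views different after an exchange, which the paragraph above notes is needed to limit the loss of information.

```python
import random


def merge_views(own_id, own_view, received, view_size):
    """Merge a received view into our own, keeping the youngest entries.

    A view maps a peer id to the age of what is known about that peer
    (hypothetical representation, chosen for this sketch).
    """
    merged = dict(own_view)
    for peer, age in received.items():
        if peer == own_id:
            continue  # a peer never lists itself as a neighbor
        if peer not in merged or age < merged[peer]:
            merged[peer] = age  # keep the freshest information
    # Dropping the oldest entries is how potentially dead peers fade out.
    youngest = sorted(merged.items(), key=lambda item: (item[1], str(item[0])))
    return dict(youngest[:view_size])


def gossip_round(views, view_size, rng):
    """Let every peer perform one exchange with a random neighbor."""
    for peer in list(views):
        # Everything this peer knows gets one round older.
        views[peer] = {p: age + 1 for p, age in views[peer].items()}
        if not views[peer]:
            continue
        partner = rng.choice(sorted(views[peer]))
        # Each side sends its view plus a fresh entry (age 0) for itself.
        sent = {**views[peer], peer: 0}
        reply = {**views[partner], partner: 0}
        views[partner] = merge_views(partner, views[partner], sent, view_size)
        views[peer] = merge_views(peer, views[peer], reply, view_size)


# Usage: eight peers that initially only know their successor on a ring.
views = {i: {(i + 1) % 8: 0} for i in range(8)}
rng = random.Random(1)
for _ in range(10):
    gossip_round(views, view_size=3, rng=rng)
```

After a few rounds each peer holds a bounded view of other peers, built without any central server.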
Random dissemination protocols can be classified into pull protocols and push protocols.
When a pull protocol is implemented, peers must request the missing parts of a content from their neighbors. This type of dissemination protocol therefore requires bi-directional channels between peers and incurs a significant control overhead. One of the main pull protocols used for disseminating contents is the BitTorrent protocol. In BitTorrent, a content is cut into blocks or chunks, and all peers cooperate in a reciprocal manner so that each peer gets all the chunks of this content. This technique is often referred to as “content swarming”.
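As an illustration of where the control overhead of a pull protocol comes from, the following sketch runs synchronous pull rounds on a toy network and counts the request messages. It is a simplified model and not the BitTorrent protocol: the function pull_round, the peer names and the "ask neighbors in order" policy are assumptions made for this example.

```python
def pull_round(chunks, neighbors, total_chunks):
    """Run one synchronous pull round and return the number of requests sent.

    chunks maps a peer to the set of chunk indices it holds;
    neighbors maps a peer to the list of peers it can query.
    """
    # Peers answer from what they held at the start of the round.
    snapshot = {peer: set(held) for peer, held in chunks.items()}
    requests = 0
    for peer, held in snapshot.items():
        for chunk in sorted(set(range(total_chunks)) - held):
            for neighbor in neighbors[peer]:
                requests += 1  # one control message per request
                if chunk in snapshot[neighbor]:
                    chunks[peer].add(chunk)  # the neighbor replies with the data
                    break
    return requests


# Usage: a three-peer line, where only "seed" holds the content.
chunks = {"seed": {0, 1, 2}, "p1": set(), "p2": set()}
neighbors = {"seed": ["p1"], "p1": ["seed", "p2"], "p2": ["p1"]}
first = pull_round(chunks, neighbors, total_chunks=3)
second = pull_round(chunks, neighbors, total_chunks=3)
```

In this toy run no chunk is ever received twice, but the farthest peer needs two rounds and several unanswered requests before it is complete, which reflects the slow startup and control overhead of pull protocols.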
Pull protocols, like BitTorrent, do not suffer from data overhead, but they induce a significant control overhead. Moreover, the resulting disseminations have slow startups and high delays. Therefore, they are not suitable for low-delay dissemination.
When a push protocol is implemented, a peer which receives a block or chunk of a content randomly selects F neighbors and forwards the received chunk to them. Each chunk has a lifetime, which is the number of hops the chunk may still undergo and which is decreased each time the chunk is forwarded from one peer to another. When the lifetime reaches zero, the chunk is no longer forwarded. Push protocols ensure a fast startup. They do not need a feedback channel and have a low control overhead.
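The forwarding rule described above can be sketched as a small simulation for a single chunk. The function push_chunk and its parameters are hypothetical names, and the sketch assumes that a peer forwards a chunk only the first time it receives it, which the description above does not specify.

```python
import random
from collections import deque


def push_chunk(neighbors, source, fanout, lifetime, rng):
    """Disseminate one chunk and return how many copies each peer received.

    neighbors maps each peer to the list of peers it may forward to.
    """
    deliveries = {peer: 0 for peer in neighbors}
    seen = {source}
    queue = deque([(source, lifetime)])
    while queue:
        peer, remaining = queue.popleft()
        if remaining <= 0:
            continue  # lifetime exhausted: the chunk is no longer forwarded
        candidates = neighbors[peer]
        targets = rng.sample(candidates, min(fanout, len(candidates)))
        for target in targets:
            deliveries[target] += 1
            if target not in seen:  # assumption: forward on first receipt only
                seen.add(target)
                queue.append((target, remaining - 1))
    return deliveries


# Usage: a full mesh of four peers, with the fanout covering all neighbors.
peers = ["a", "b", "c", "d"]
mesh = {p: [q for q in peers if q != p] for p in peers}
one_hop = push_chunk(mesh, "a", fanout=3, lifetime=1, rng=random.Random(0))
two_hops = push_chunk(mesh, "a", fanout=3, lifetime=2, rng=random.Random(0))
```

With a lifetime of one hop every peer receives exactly one copy; with a lifetime of two hops every peer, including the source, ends up with three copies, so the additional hop only adds redundant data in this small mesh.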
Push protocols are mainly controlled by two parameters: the “fanout”, which is the number of random peers to which a block or chunk is forwarded, and the lifetime. The delay is highly influenced by the fanout, and the degree of completion (i.e. how many peers got a complete content) is influenced by the product of the fanout and the lifetime. The higher this product is, the higher the data overhead will be. Here, “data overhead” means data received by a peer more than once.
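To make the notion of data overhead concrete, the sketch below computes it from per-peer delivery counts for a single chunk. The function dissemination_stats and the field names are assumptions introduced for this example, and coverage of one chunk is used as a simplified stand-in for the degree of completion of a whole content.

```python
def dissemination_stats(deliveries, source):
    """Summarize the dissemination of one chunk.

    deliveries maps each peer to the number of copies it received.
    """
    others = [peer for peer in deliveries if peer != source]
    reached = sum(1 for peer in others if deliveries[peer] > 0)
    # Every copy beyond the first one is data overhead.
    duplicates = sum(max(deliveries[peer] - 1, 0) for peer in others)
    # The source already holds the chunk, so any copy it receives is overhead.
    duplicates += deliveries[source]
    return {
        "coverage": reached / len(others) if others else 1.0,
        "useful_copies": reached,
        "duplicate_copies": duplicates,
    }


# Usage: four peers that each received three copies of a chunk sent by "a".
stats = dissemination_stats({"a": 3, "b": 3, "c": 3, "d": 3}, source="a")
```

In this example, out of twelve transmitted copies only three are useful and nine are data overhead, although every peer has been reached.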
Push protocols work quite well when full content completion is not needed. Indeed, the last peers may be hard to reach (even if an encoding with forward erasure correcting (FEC) codes is used), so the lifetime and the fanout must be increased to reach them, which increases the data overhead.
More information about push protocols can be found in the article by Patrick T. Eugster et al., “From Epidemics to Distributed Computing”, IEEE Computer, Volume 37, 2004.
A third type of dissemination protocol can also be implemented: the push-pull protocols. In this third type of protocol, push phases are followed by pull phases. A peer that missed some parts of a content during a push phase requests these missing parts during the following pull phase(s). Since information about available content blocks is attached to the content data, the control overhead is low.
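The interplay between the two phases can be sketched as follows. The Peer class, its method names and the choice of attaching the full set of held chunk indices to every pushed chunk are assumptions made for illustration; the description above does not prescribe a message format.

```python
class Peer:
    def __init__(self, name, chunks=None):
        self.name = name
        self.chunks = dict(chunks or {})  # chunk index -> data
        self.holders = {}  # chunk index -> names of peers known to hold it

    def push_to(self, other, index):
        """Push one chunk, attaching the indices of all chunks we hold."""
        other.on_push(self.name, index, self.chunks[index], set(self.chunks))

    def on_push(self, sender, index, data, available):
        self.chunks.setdefault(index, data)
        # The attached availability information costs no extra message.
        for known in available:
            self.holders.setdefault(known, set()).add(sender)

    def pull_plan(self):
        """Map each missing chunk to one peer known to hold it."""
        return {
            index: min(names)
            for index, names in sorted(self.holders.items())
            if index not in self.chunks
        }

    def pull_from(self, peers):
        """Pull phase: fetch every missing chunk from its planned holder."""
        for index, name in self.pull_plan().items():
            self.chunks[index] = peers[name].chunks[index]


# Usage: only chunk 0 reaches "late" during the push phase.
seed = Peer("seed", {0: b"aa", 1: b"bb", 2: b"cc"})
late = Peer("late")
peers = {"seed": seed, "late": late}
seed.push_to(late, 0)
plan = late.pull_plan()
late.pull_from(peers)
```

Because the pushed chunk already tells the receiver which chunks the sender holds, the receiver can direct its requests without first probing its neighbors.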
These push-pull protocols have low delay and low control overhead: the low delay is ensured by the push phase and completion is ensured by the pull phase(s). Increasing the fanout or the lifetime is therefore not needed. Moreover, disseminating a full content to every peer does not increase the data overhead. The main drawback of these push-pull protocols is that they are more complex, and therefore harder to implement, and that their parameters are more difficult to adjust (especially the one defining when the protocol is in push mode or in pull mode).
More information about push-pull protocols can be found in the article by Sujay Sanghavi et al., “Gossiping with multiple messages”, IEEE International Conference on Computer Communications, 2007 (INFOCOM 2007).