Peer to peer (P2P) networks are distributed data networks without any centralized hierarchy or organization. Peer to peer data networks provide a robust and flexible means of communicating information between large numbers of computers or other information devices, referred to in general as nodes. In a P2P network, each node is defined as a peer of every other node within the network. Each node within the P2P network may be configured to execute software having substantially equivalent functionality. Therefore, each node may act as both a provider and a user of data and services across the P2P network.
A P2P network relies primarily on the computing power and bandwidth of the nodes in the network rather than concentrating these resources in a relatively small number of servers. P2P networks are typically used for connecting nodes via largely ad hoc connections. Such networks are useful for many purposes. For example, sharing content files containing audio, video, data or anything else in digital format is very common, and real-time data, such as telephony traffic, may also be transmitted using P2P technology.
An overlay network is a logical or virtual network organization that is imposed on nodes connected by one or more types of underlying physical network connections. In an overlay network, nodes are connected by virtual or logical links, each of which can correspond to one or more paths in an underlying physical network. Overlay networks are typically implemented in hardware and/or software operating in the application layer or other top-level layer of an OSI network stack or other type of networking protocol.
One class of peer to peer overlay networks is referred to as distributed hash table networks. Distributed hash table overlay networks use a hash function to generate and assign one or more key values to a unique node. The set of all possible key values is referred to as a hash space. Nodes are organized in the hash space according to their assigned key values. The hash function is selected so that nodes are approximately evenly distributed throughout the hash space. Distributed hash table overlay networks are typically highly scalable, often supporting millions of nodes; robust, allowing nodes to join or leave frequently; and efficient, routing a message to a single destination node quickly.
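By way of illustration only, the assignment of key values to nodes may be sketched as follows. The sketch assumes SHA-1 as the hash function and assumes that a key value is handled by the first node at or after that key value in the hash space (the convention used by Chord); the node names, the content name, and the helper functions `key_for`, `build_ring`, and `responsible_node` are hypothetical and are not taken from any particular implementation.

```python
import hashlib
from bisect import bisect_left

HASH_BITS = 160               # SHA-1 digest size; an assumption of this sketch
HASH_SPACE = 2 ** HASH_BITS   # number of possible key values

def key_for(name: str) -> int:
    """Map an arbitrary identifier to a key value in the hash space."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

def build_ring(node_names):
    """Return (key value, node name) pairs ordered by position in the hash space."""
    return sorted((key_for(name), name) for name in node_names)

def responsible_node(ring, key):
    """Return the first node at or after `key`, wrapping past the end of the space."""
    keys = [k for k, _ in ring]
    return ring[bisect_left(keys, key) % len(ring)][1]

# Hypothetical node and content names, used only to exercise the sketch.
ring = build_ring(f"node-{i}" for i in range(8))
owner = responsible_node(ring, key_for("some-content.mp3"))
print(owner)
```

Because the hash function spreads identifiers approximately evenly, each node in such a sketch ends up responsible for a roughly comparable share of the hash space.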
There are numerous different types of distributed hash table overlay networks. One type of peer to peer overlay network is known as a Chord network. The Chord overlay network protocol is described in detail in “Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications”, Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan, IEEE/ACM Transactions on Networking, Vol. 11, No. 1, pp. 17-32, February 2003, which is incorporated herein by reference. Another type of distributed hash table overlay network is Pastry, which is described in “Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems,” A. Rowstron and P. Druschel. IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), Heidelberg, Germany, pages 329-350, November, 2001, which is incorporated herein by reference.
A Chord overlay network may exhibit logarithmic properties arising from the “asymptotic complexity” of messaging. For example, if there are N nodes in a Chord ring and a first node wants to send a message to a second node, the first node typically has to communicate with some subset of the N nodes in order to locate the second node. In a Chord overlay network, the first node generally has to communicate with only a very small subset of all N nodes, specifically on the order of log2 N nodes. This property allows a Chord overlay network to have relatively fast messaging, even for a very large number N of nodes. However, a Chord overlay network can only guarantee this log2 N messaging property if the IDs of the nodes are completely randomly distributed around the Chord ring.
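The logarithmic lookup behavior described above may be illustrated with a simplified, single-process simulation of finger-table routing. This is a sketch under stated assumptions rather than the Chord protocol itself: the 16-bit identifier space, the count of 256 nodes, and the helper names `between`, `successor`, `build_fingers`, and `lookup` are all chosen for illustration, and node joins, departures, and failures are ignored.

```python
import math
import random
from bisect import bisect_left

M = 16            # identifier bits; kept small so the sketch runs quickly
RING = 2 ** M     # size of the identifier space

def between(x, a, b, inclusive_right=False):
    """True if x lies in the circular interval (a, b), or (a, b] if requested."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b   # interval wraps past zero

def successor(ids, key):
    """First node ID at or after `key`, wrapping around the ring."""
    return ids[bisect_left(ids, key % RING) % len(ids)]

def build_fingers(ids):
    """Finger i of node n points at the node responsible for n + 2**i."""
    return {n: [successor(ids, n + 2 ** i) for i in range(M)] for n in ids}

def lookup(fingers, start, key):
    """Return (node responsible for `key`, number of nodes contacted)."""
    node, contacted = start, 0
    while not between(key, node, fingers[node][0], inclusive_right=True):
        # Jump to the farthest known node that still precedes the key.
        node = next(f for f in reversed(fingers[node]) if between(f, node, key))
        contacted += 1
    return fingers[node][0], contacted + 1   # +1 for the responsible node itself

random.seed(1)
ids = sorted(random.sample(range(RING), 256))   # N = 256 randomly placed node IDs
fingers = build_fingers(ids)
trials = [(random.choice(ids), random.randrange(RING)) for _ in range(2000)]
results = [lookup(fingers, start, key) for start, key in trials]
average = sum(count for _, count in results) / len(results)
print(f"N={len(ids)}, log2 N={math.log2(len(ids)):.0f}, "
      f"average nodes contacted={average:.2f}")
```

In this sketch each jump at least halves the remaining distance to the target, so the number of nodes contacted can never exceed the number of identifier bits plus one, and with randomly placed node IDs it is typically much smaller.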
Although distributed hash table overlay network protocols, such as the Chord protocol, provide efficient distribution of a message to a single destination node, they do not allow for a single message to be efficiently distributed to multiple destination nodes, referred to as broadcasting (or multicasting) a message.
In one typical implementation, a node desiring to broadcast a message to all of the other nodes must send a message to each node separately. As each node only has direct knowledge of a limited number of nodes, a node initiating a broadcast message, referred to as an initiating node, must blindly send a separate message to every possible key value. For a distributed hash table network with a hash space of 2^160 key values (arising from the use of a 160-bit hash function such as SHA-1), this is infeasible.
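The scale of this problem may be appreciated from a short calculation. The send rate of one billion messages per second used below is a hypothetical and deliberately optimistic figure, chosen only to show the order of magnitude involved.

```python
HASH_BITS = 160
key_values = 2 ** HASH_BITS          # size of the hash space

MESSAGES_PER_SECOND = 10 ** 9        # hypothetical, deliberately optimistic rate
SECONDS_PER_YEAR = 365 * 24 * 3600

# Whole years needed to send one message to every possible key value.
years = key_values // MESSAGES_PER_SECOND // SECONDS_PER_YEAR

print(f"possible key values: {key_values:.3e}")
print(f"years required:      {years:.3e}")
```

Even at this assumed rate, the initiating node would need more than 10^31 years to address every key value.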
In another typical implementation, a flooding approach is used to distribute a broadcast message. An initiating node sends a message to all of the nodes directly connected with the initiating node in the overlay network. Upon receiving the message, each receiving node in turn forwards the message to any additional nodes directly connected with that receiving node in the overlay network. This implementation is inefficient, as some nodes receive the same message more than once. Moreover, this implementation consumes a large amount of network bandwidth and takes a large amount of time to complete.
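The duplicate deliveries inherent in flooding may be illustrated with a small simulation. The sketch assumes that each node forwards a message only the first time it is received and never sends it back to the node it came from; the four-node, fully connected topology and the function name `flood` are hypothetical.

```python
from collections import deque

def flood(adjacency, initiator):
    """Simulate flooding; return (copies received per node, total messages sent)."""
    copies = {node: 0 for node in adjacency}
    seen = {initiator}
    # Each queue entry is one message in flight: (sender, receiver).
    queue = deque((initiator, nb) for nb in adjacency[initiator])
    sent = 0
    while queue:
        sender, receiver = queue.popleft()
        sent += 1
        copies[receiver] += 1
        if receiver in seen:
            continue                      # duplicate copy: received but not re-forwarded
        seen.add(receiver)
        queue.extend((receiver, nb) for nb in adjacency[receiver] if nb != sender)
    return copies, sent

# Hypothetical overlay in which every node is directly connected to every other node.
nodes = ["A", "B", "C", "D"]
adjacency = {n: [m for m in nodes if m != n] for n in nodes}

copies, sent = flood(adjacency, "A")
duplicates = sent - (len(nodes) - 1)
print(copies, sent, duplicates)
```

In this four-node example, nine messages are sent in order to reach three recipients, so six of the nine are redundant copies.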
To reduce the bandwidth required by flooding broadcast messages, a modified flooding scheme assigns a time-to-live (TTL) value to each broadcast message. Each time a copy of a broadcast message is forwarded to an additional node, its TTL value is decremented. When the TTL value reaches zero, the broadcast message is no longer forwarded. Although this modified flooding scheme reduces the amount of wasted network bandwidth and the number of duplicate messages, it cannot ensure that the broadcast message will be routed to all nodes.
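The coverage limitation of TTL-limited flooding may be sketched as follows. For simplicity, the sketch tracks only which nodes are reached and assumes that a message with an initial TTL of t travels at most t hops from the initiating node; the exact cutoff convention, the six-node chain topology, and the function name `flood_with_ttl` are assumptions made for illustration.

```python
from collections import deque

def flood_with_ttl(adjacency, initiator, ttl):
    """Return the set of nodes reached by a flood limited to `ttl` hops."""
    reached = {initiator}
    queue = deque([(initiator, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if remaining == 0:
            continue                      # TTL exhausted: do not forward further
        for neighbor in adjacency[node]:
            if neighbor not in reached:   # only first deliveries are tracked here
                reached.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return reached

# Hypothetical overlay: six nodes connected in a chain 0-1-2-3-4-5.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

reached = flood_with_ttl(chain, initiator=0, ttl=3)
missed = set(chain) - reached
print(sorted(reached), sorted(missed))
```

With a TTL of 3, nodes 4 and 5 never receive the message. A TTL of 5 would reach every node in this chain, but the initiating node generally has no way of knowing in advance how large a TTL a given overlay requires.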
It is therefore desirable for a system and method to guarantee that each node in a peer to peer overlay network receives a broadcast message. It is further desirable that the system and method guarantee that each node in a peer to peer overlay network receives only one copy of a broadcast message, thereby ensuring that network bandwidth is efficiently utilized. It is further desirable that the system and method require minimal time and bandwidth resources from a node initiating a broadcast message. It is also desirable that the system and method enable broadcast messages to be selectively directed to portions of the overlay network with no additional network bandwidth overhead. It is also desirable for the system and method to deliver broadcast messages to all or a selected portion of the peer to peer overlay network within a minimal time period.
It is within this context that embodiments of the present invention arise.