Peer-to-peer (P2P) file-sharing networks may be used to distribute large amounts of data between users on a network. By way of example, BitTorrent is a commonly-used protocol for transferring large files on the Internet, and is estimated to account for about 25% to 35% of all Internet traffic.
In a typical P2P file-sharing scenario, a content file (e.g., movie or application) is seeded (disseminated) to one or more P2P clients running on one or more hosts connected to a network. In some P2P protocols, a tracking file (e.g., a “torrent” file in a BitTorrent network) is distributed that identifies the content file and the tracking hosts (also known as “trackers”) that can provide information on how to contact clients sharing the content file (e.g., the seeded P2P clients, as well as other clients that may have copies of fragments of the content file). Peers obtain various fragments of the content file and share these fragments with other peers until all peers interested in the content obtain copies of all fragments, and hence have a complete copy of the content. Sharing often continues even after the original seeded content file has been removed.
In the BitTorrent protocol, trackers keep track of clients (peers) that are interested in obtaining and hosting fragments of the content file. Each peer communicates with other peers to announce which fragments it can provide and to determine which fragments it can receive. Peers then exchange fragments in a “tit-for-tat” (TFT) sharing scheme that attempts to maintain parity between the amount of data received and the amount of data given. The peers involved in content-fragment exchanges for a particular content file are sometimes referred to collectively as the “swarm” for that content file.
Peers relate to one another differently depending upon whether they are seeking a content fragment or possess a content fragment. Each peer in the system is either a leecher, which is trying to obtain the complete content, or a seed, which has the complete content. Seeds and leechers alike, however, serve content to other leechers using certain performance-aware policies. Typically, each peer makes a persistent TCP connection to each of its neighbors, which in turn add the peer to their neighbor sets, and learns which pieces those neighbors have. Each neighbor also sends an update to its list of pieces whenever it obtains a new piece.
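The neighbor bookkeeping described above can be sketched as follows. This is a minimal illustration only: the `Peer` class, its field names, and its methods are hypothetical and are not taken from the BitTorrent specification, and the persistent TCP connections are modeled as direct in-memory references.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical peer model; all names here are illustrative assumptions."""
    name: str
    num_pieces: int
    have: set = field(default_factory=set)              # piece indices this peer holds
    neighbors: dict = field(default_factory=dict)       # neighbor name -> Peer
    neighbor_have: dict = field(default_factory=dict)   # neighbor name -> known pieces

    def is_seed(self) -> bool:
        # A seed holds every piece; any other peer is a leecher.
        return len(self.have) == self.num_pieces

    def connect(self, other: "Peer") -> None:
        # Each side adds the other to its neighbor set and learns its pieces.
        self.neighbors[other.name] = other
        other.neighbors[self.name] = self
        self.neighbor_have[other.name] = set(other.have)
        other.neighbor_have[self.name] = set(self.have)

    def receive_piece(self, index: int) -> None:
        # Store the new piece, then update every neighbor's view of this peer.
        self.have.add(index)
        for neighbor in self.neighbors.values():
            neighbor.neighbor_have[self.name].add(index)

    def interested_in(self, neighbor_name: str) -> bool:
        # A peer is interested in a neighbor that holds a piece it lacks.
        return bool(self.neighbor_have[neighbor_name] - self.have)


seed = Peer("seed", num_pieces=4, have={0, 1, 2, 3})
leecher = Peer("leecher", num_pieces=4)
seed.connect(leecher)
leecher.receive_piece(2)   # the seed's view of the leecher is updated
```

In this sketch the leecher is interested in the seed, while the seed, which already has every piece, is never interested in the leecher.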
In the BitTorrent protocol, a peer A typically uploads to only five interested leechers that are seeking content possessed by peer A. Of these five leechers, four (by default) are selected based on their attractive transfer rate (upload rate if peer A is a seed, or download rate if peer A is a leecher). By default, all neighbors are choked, which prevents the neighbors from requesting pieces. When a peer is ready to upload content to a neighbor, it sends that neighbor an unchoke message. The TFT mechanism in the BitTorrent protocol determines whether a particular peer is to be unchoked. For a leecher A to unchoke another leecher B, peer B needs to have uploaded content to peer A at an attractive rate. For a seed C to unchoke leecher B, peer B needs to have downloaded content from peer C at an attractive rate.
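The rate-based selection described above might look like the following sketch. The function name `select_unchoked`, its parameters, and the name-based tie-break are assumptions made for illustration; actual clients measure rates over a time window and differ in detail.

```python
def select_unchoked(rates, interested, slots=4):
    """Pick up to `slots` interested neighbors with the highest observed rate.

    `rates` maps a neighbor name to a transfer rate in bytes per second:
    the rate at which that neighbor uploaded to us if we are a leecher, or
    the rate at which it downloaded from us if we are a seed. Neighbors
    with no recorded rate are treated as 0. All names are hypothetical.
    """
    # Sort by descending rate; break ties by name so the result is deterministic.
    ranked = sorted(interested, key=lambda n: (-rates.get(n, 0.0), n))
    return ranked[:slots]


# Peer A's view of its neighbors. G is fast but not interested, so it stays choked.
rates = {"B": 50.0, "C": 10.0, "D": 80.0, "E": 30.0, "F": 70.0, "G": 99.0}
interested = {"B", "C", "D", "E", "F"}
unchoked = select_unchoked(rates, interested)
```

Every neighbor outside the returned list remains choked and cannot request pieces until a later round selects it.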
Besides the four leechers unchoked based on performance, one other leecher is picked randomly from the set of interested neighbors every 30 seconds. This random selection, called optimistic unchoking, enables the peer to explore its neighborhood, thereby improving the chance of picking the best-performing peers. Once unchoked, a leecher attempts to obtain the piece that is rarest in its neighborhood. This “rarest piece first” policy increases piece diversity among the neighbors, i.e., the “rarest piece” comes to be possessed by an increasing number of peers.
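Both policies in this paragraph can be sketched briefly. The function names and arguments below are hypothetical, the 30-second timer is left to the caller, and ties between equally rare pieces are broken by lowest index for determinism, which is an assumption of this sketch rather than something the description above specifies.

```python
import random
from collections import Counter


def optimistic_unchoke(interested, already_unchoked, rng=random):
    """Randomly pick one interested neighbor that is not already unchoked."""
    pool = sorted(set(interested) - set(already_unchoked))
    return rng.choice(pool) if pool else None


def rarest_piece(have, uploader_have, neighbor_have):
    """Choose the piece to request from an uploader that has unchoked us.

    `neighbor_have` maps each neighbor name to the set of pieces it holds.
    Among pieces the uploader has and we lack, return the one held by the
    fewest neighbors, or None if the uploader has nothing we need.
    """
    counts = Counter(p for pieces in neighbor_have.values() for p in pieces)
    wanted = set(uploader_have) - set(have)
    if not wanted:
        return None
    return min(wanted, key=lambda p: (counts[p], p))


# Piece 0 is held by three neighbors, piece 1 by two, piece 2 by one.
neighborhood = {"A": {0, 1, 2}, "B": {0, 1}, "C": {0}}
first_request = rarest_piece(set(), {0, 1, 2}, neighborhood)
extra_peer = optimistic_unchoke(["A", "B", "C"], ["A", "B"])
```

Once the requesting peer obtains piece 2 and announces it, that piece is no longer the rarest in the neighborhood, which is how the policy spreads scarce pieces.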
The tracker randomly assigns neighbors from the set of previously seen peers. By default, a maximum of 50 neighbors is defined, but the protocol allows the client to request more neighbors from the tracker. Further, it is possible that a peer receives connection requests from other peers that are not yet its neighbors. Thus, the neighborhood of a peer is typically a random subset of the overall peers in the swarm.
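A tracker's random neighbor assignment can be illustrated with the sketch below. The `Tracker` class, its `announce` method, and the `want` parameter are hypothetical names used only for this example; they do not reproduce the actual tracker request format.

```python
import random


class Tracker:
    """Hypothetical tracker that hands out random subsets of known peers."""

    def __init__(self, max_neighbors=50, rng=None):
        self.seen = []                      # peers that have announced so far
        self.max_neighbors = max_neighbors  # default cap described in the text
        self.rng = rng or random.Random()

    def announce(self, peer, want=None):
        # A client may ask for more (or fewer) neighbors than the default.
        limit = self.max_neighbors if want is None else want
        others = [p for p in self.seen if p != peer]
        # Uniform random sample without replacement from previously seen peers.
        neighbors = self.rng.sample(others, min(limit, len(others)))
        if peer not in self.seen:
            self.seen.append(peer)
        return neighbors


tracker = Tracker(rng=random.Random(0))
for i in range(100):
    tracker.announce(f"peer{i}")
assigned = tracker.announce("newcomer")           # default request
larger = tracker.announce("peer0", want=80)       # client asks for more
```

Because the sample ignores network location, the assigned neighbors are spread across the swarm regardless of which ISP each peer belongs to.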
In peer-to-peer networks where the neighbors from which clients download content are pre-determined, randomly decided neighbors may cause an Internet Service Provider (ISP) to incur increased expenses because of increased cross-ISP traffic. This increased cross-ISP traffic can also result in significantly increased power consumption costs. The report “ESTIMATING TOTAL POWER CONSUMPTION BY SERVERS IN THE U.S. AND THE WORLD” by J. G. Koomey, Staff Scientist, Lawrence Berkeley National Laboratory, incorporated herein by reference in its entirety, states: “Total power used by servers represented about 0.6% of total U.S. electricity consumption in 2005. When cooling and auxiliary infrastructure are included, that number grows to 1.2%, an amount comparable to that for color televisions. The total power demand in 2005 (including associated infrastructure) is equivalent (in capacity terms) to about five 1000 MW power plants for the U.S. and 14 such plants for the world. The total electricity bill for operating those servers and associated infrastructure in 2005 was about $2.7 B and $7.2 B for the U.S. and the world, respectively.” The power consumption documented in this report is constantly increasing with the additional traffic generated by file-sharing.