A peer-to-peer (also termed P2P) computer network is a network that relies primarily on the computing power and bandwidth of the participants in the computer network rather than concentrating computing power and bandwidth in a relatively low number of servers. P2P computer networks are typically used for connecting nodes of the computer network via largely ad hoc connections. The P2P computer network is useful for many purposes. Sharing content files containing, for example, audio, video and data is very common. Real time data, such as telephony traffic, is also passed using the P2P network.
A pure P2P network does not have the notion of clients or servers, but only equal peer nodes that simultaneously function as both “clients” and “servers” to the other nodes on the network. This model of network arrangement differs from the client-server model in which communication is usually to and from a central server. A typical example for a non P2P file transfer is an FTP server where the client and server programs are quite distinct. In the FTP server clients initiate the download/uploads and the servers react to and satisfy these requests from the clients.
Some networks and channels, such as Napster, OpenNAP, or IRC @find, use a client-server structure for some tasks (e.g., searching) and a P2P structure for other tasks. Networks such as Gnutella or Freenet use the P2P structure for all purposes, and are sometimes referred to as true P2P networks, although Gnutella is greatly facilitated by directory servers that inform peers of the network addresses of other peers.
One of the most popular file distribution programmes used in P2P networks is currently BitTorrent which was created by Bram Cohen. BitTorrent is designed to distribute large amounts of data widely without incurring the corresponding consumption in costly server and bandwidth resources. To share a file or group of files through BitTorrent, clients first create a “torrent file”. This is a small file which contains meta-information about the files to be shared and about the host computer (the “tracker”) that coordinates the file distribution. Torrent files contain an “announce” section, which specifies the URL of a tracker, and an “info” section which contains (suggested) names for the files, their lengths, the piece length used, and a SHA-1 hash code for each piece, which clients should use to verify the integrity of the data they receive.
The tracker is a server that keeps track of which seeds (i.e. a node with the complete file or group of files) and peers (i.e. nodes that do not yet have the complete file or group of files) are in a swarm (the expression for all of the seeds and peers involved in the distribution of a single file or group of files). Nodes report information to the tracker periodically and from time-to-time request and receive information about other nodes to which they can connect. The tracker is not directly involved in the data transfer and is not required to have a copy of the file. Nodes that have finished downloading the file may also choose to act as seeds, i.e. the node provides a complete copy of the file. After the torrent file is created, a link to the torrent file is placed on a website or elsewhere, and it is normally registered with the tracker. BitTorrent trackers maintain lists of the nodes currently participating in each torrent. The computer with the initial copy of the file is referred to as the initial seeder.
Using a web browser, users navigate to a site listing the torrent, download the torrent, and open the torrent in a BitTorrent client stored on their local machines. After opening the torrent, the BitTorrent client connects to the tracker, which provides the BitTorrent client with a list of clients currently downloading the file or files.
Initially, there may be no other peers in the swarm, in which case the client connects directly to the initial seeder and begins to request pieces. The BitTorrent protocol breaks down files into a number of much smaller pieces, typically a quarter of a megabyte (256 KB) in size. Larger file sizes typically have larger pieces. For example, a 4.37 GB file may have a piece size of 4 MB (4096 KB). The pieces are checked as they are received by the BitTorrent client using a hash algorithm to ensure that they are error free.
As further peers enter the swarm, all of the peers begin sharing pieces with one another, instead of downloading directly from the initial seeder. Clients incorporate mechanisms to optimize their download and upload rates. Peers may download pieces in a random order and may prefer to download the pieces that are rarest amongst it peers, to increase the opportunity to exchange data. Exchange of data is only possible if two peers have a different subset of the file. It is known, for example, in the BitTorrent protocol that a peer initially joining the swarm will send to other members of the swarm a BitField message which indicates an initial set of pieces of the digital object which the peer has available for download by other ones of the peers. On receipt of further ones of the pieces, the peer will send a Have message to the other peers to indicate that the further ones of the pieces are available for download.
The substantial increase in traffic over P2P networks in the past few years has increased the demand for P2P caches and also for alternative P2P management techniques. In particular there is a need to ensure that those pieces of the digital object required are preferably available with required access times. Furthermore there is a need to ensure that management techniques can ensure that bandwidth is used most effectively and cost-efficiently.