File systems are a well-known means of storing and organizing computer files and the data they contain. Distributed file systems provide a means of storing and organizing information on multiple computer systems that are connected to one another through a network. An important characteristic of a distributed file system is that it presents a unified view of the data and files stored therein, such that all of the data can be accessed without regard to the particular computer system, or plurality of computer systems, in the network on which the data is actually stored.
Another type of distributed system is a content delivery network (CDN). Unlike file systems, which are intended to permit both reading and writing of data, CDNs are intended as distributed systems for the rapid delivery of immutable content, such as audio-video content, usually via the Internet. This is accomplished by strategically placing copies of the content on computer systems which are located close (logically and/or physically) to users of the content and which are therefore able to deliver the content to those users quickly. An important characteristic of a CDN is that it is read-only: once content is placed in a CDN, the content cannot be changed; that is, only reads by users are accelerated and there are no writes. If there is even a minute change to the content, an entire copy of the revised content must be republished in the CDN by storing a fresh copy and redistributing it throughout the network.
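The read-only behavior described above can be illustrated with a minimal sketch. The class `ReadOnlyCDN`, its methods, and the edge names are hypothetical and chosen only for illustration; the sketch assumes that every publication ships a complete copy to every edge node, which simplifies how a real CDN distributes content.

```python
class ReadOnlyCDN:
    """Hypothetical read-only CDN: content is replicated whole to every edge."""

    def __init__(self, edge_names):
        self.edges = {name: {} for name in edge_names}  # edge -> {key: bytes}
        self.bytes_transferred = 0

    def publish(self, key: str, content: bytes) -> None:
        # Assumption: publishing always ships a complete copy to every edge.
        for store in self.edges.values():
            store[key] = bytes(content)
            self.bytes_transferred += len(content)

    def read(self, edge: str, key: str) -> bytes:
        # Reads are served locally by the edge closest to the user.
        return self.edges[edge][key]


cdn = ReadOnlyCDN(["edge-eu", "edge-us", "edge-asia"])
original = b"x" * 1000
cdn.publish("video", original)

# There is no write operation: a one-byte change means republishing
# the entire revised copy to all edges.
revised = b"y" + original[1:]
cdn.publish("video", revised)
```

In this sketch a one-byte revision costs the same network transfer as the original publication, which is the cost the paragraph above attributes to immutable content.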
BitTorrent is a very popular CDN algorithm for distributing content on peer-to-peer (P2P) networks. We consider BitTorrent a CDN algorithm because it provides no means of modifying content once that content is distributed in the system. In the BitTorrent algorithm, content is divided into chunks, and different computers (nodes) on the network download the chunks separately. Upon downloading a chunk, each node becomes a peer, that is, a node capable of serving that chunk to another node. Nodes that have all chunks for a given content item are called seeds. There are also tracker nodes that keep track of the peers and seeds for a given content item.
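The roles described above (chunks, peers, seeds, and trackers) can be sketched as follows. This is a simplified illustration and not the BitTorrent wire protocol: the names `Node`, `Tracker`, `announce`, `peers_for`, and `is_seed`, as well as the four-byte chunk size, are assumptions made for the example.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # bytes; unrealistically small, chosen only for illustration


def split_into_chunks(content: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Divide content into fixed-size chunks (the last one may be shorter)."""
    return [content[i:i + size] for i in range(0, len(content), size)]


@dataclass
class Node:
    name: str
    chunks: dict[int, bytes] = field(default_factory=dict)  # index -> data


@dataclass
class Tracker:
    """Hypothetical tracker: records which nodes hold which chunk indices."""
    total_chunks: int
    holders: dict[int, set[str]] = field(default_factory=dict)

    def announce(self, node: Node) -> None:
        # A node reports every chunk it can currently serve.
        for index in node.chunks:
            self.holders.setdefault(index, set()).add(node.name)

    def peers_for(self, index: int) -> set[str]:
        return self.holders.get(index, set())

    def is_seed(self, node: Node) -> bool:
        # A seed holds all chunks of the content item.
        return len(node.chunks) == self.total_chunks


content = b"immutable content"
pieces = split_into_chunks(content)
tracker = Tracker(total_chunks=len(pieces))

seed = Node("seed", dict(enumerate(pieces)))
tracker.announce(seed)

# After downloading one chunk, the downloader can already serve that chunk.
downloader = Node("downloader")
downloader.chunks[0] = seed.chunks[0]
tracker.announce(downloader)
```

After the last two lines, the tracker lists both nodes as sources for chunk 0, while only the seed can serve the remaining chunks.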
Because nodes can join and leave the network arbitrarily, the distribution of chunks available at any given moment varies. Chunks for popular content will generally be readily available on the network, as there will be many nodes that have downloaded such chunks (the number of such downloads is indicative of the relative popularity of the chunks). This organic increase in the availability of popular content gives BitTorrent very important read scalability: the more reads (requests for download) there are, the more servers (nodes that have completed downloads) there will be to fulfill the read requests. In other words, the system is capable of naturally balancing and increasing its serving capacity as popularity and the number of reads increase, in stark contrast to conventional centralized client-server systems, where an increase in the read load placed on a single server, or on a fixed plurality of servers, will slow the system down.
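The contrast in read scalability can be made concrete with a toy model. The model is an assumption made purely for illustration: time is divided into rounds, each serving node delivers one complete copy per round, and in the swarm case every node that finishes a download begins serving in the next round and never leaves. Real networks have churn and bandwidth limits that this sketch ignores.

```python
def rounds_fixed(servers: int, requests: int) -> int:
    """Rounds needed when a fixed pool of servers handles all requests."""
    return -(-requests // servers)  # ceiling division


def rounds_swarm(seeds: int, requests: int) -> int:
    """Rounds needed when every completed download becomes a new server."""
    rounds = 0
    serving = seeds
    remaining = requests
    while remaining > 0:
        served = min(serving, remaining)
        remaining -= served
        serving += served  # finished downloaders join the serving pool
        rounds += 1
    return rounds


fixed = rounds_fixed(servers=1, requests=1000)
swarm = rounds_swarm(seeds=1, requests=1000)
```

Under these assumptions the fixed server needs a number of rounds proportional to the number of requests, whereas the swarm's serving capacity roughly doubles each round, so the number of rounds grows only logarithmically.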
As noted above, however, CDNs like BitTorrent are useful only when dealing with immutable content. In situations where both reads and writes can be expected from multiple, distributed nodes, other forms of distributed networks are required.