Topology management in peer-to-peer file sharing clouds is a significant problem that needs to be addressed in order to increase the speed and ease with which all or most members of the cloud can receive content being shared.
In the past, large scale content distribution has been carried out using dedicated server farms providing infrastructure-based solutions. In this type of method, each client requiring content forms a dedicated high bandwidth connection to a server at a server farm and downloads content as required. This type of solution is costly for the content provider who must provide and maintain the server farm. FIG. 1 illustrates this type of solution having servers 1 and clients 2, each client having direct connections to one server. Not only is this type of solution costly for content providers but it is not robust in that failure at a server prevents content from being provided to many clients. In addition, the solution is not easily scalable because each server supports a limited number of clients.
More recently, a new paradigm for content distribution has emerged, based on a distributed architecture using a co-operative network in which nodes share their resources (storage, CPU, bandwidth).
Cooperative content distribution solutions are inherently self-scalable, in that the bandwidth capacity of the system increases as more nodes arrive: each new node requests service from, and at the same time provides service to, other nodes. Because each new node contributes resources, the capacity of the system grows as demand increases, so that the system can in principle scale to very large user populations. With cooperation, the source of the file, i.e. the server, does not need to increase its resources to accommodate the larger user population; this also provides resilience to “flash crowds”, huge and sudden surges of traffic that usually lead to the collapse of the affected server. Therefore, end-system cooperative solutions can be used to efficiently and quickly deliver software updates, critical patches, videos, and other large files to a very large number of users while keeping the cost at the original server low.
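The scaling argument above can be illustrated with a short, idealized calculation. The sketch below assumes every node can upload at its full rate and that uploaded data is always useful to some receiver; the function names and the numeric values are hypothetical and chosen only for illustration.

```python
def per_client_rate_server_only(server_up: float, n_clients: int) -> float:
    """Upper bound on each client's download rate when only the server uploads."""
    return server_up / n_clients


def per_client_rate_cooperative(server_up: float, peer_up: float, n_clients: int) -> float:
    """Idealized upper bound when every client also uploads at peer_up.

    Aggregate upload capacity is server_up + n_clients * peer_up, shared
    among n_clients downloaders. Real systems fall short of this bound.
    """
    return (server_up + n_clients * peer_up) / n_clients


if __name__ == "__main__":
    server_up, peer_up = 100.0, 1.0  # hypothetical rates in Mbit/s
    for n in (10, 1000, 100000):
        solo = per_client_rate_server_only(server_up, n)
        coop = per_client_rate_cooperative(server_up, peer_up, n)
        print(f"n={n:>6}: server-only {solo:.4f}, cooperative {coop:.4f}")
```

Under these assumptions the server-only rate tends toward zero as the population grows, whereas the cooperative rate never drops below the upload rate of a single peer.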
BitTorrent®, available from BitTorrent, Inc., 612 Howard Street, Suite 400, San Francisco, Calif. 94105, is an existing peer-to-peer file sharing protocol written by Bram Cohen and currently publicly available under an open source license. Under the BitTorrent® algorithm, a file for distribution is split into blocks or fragments.
These blocks are distributed to nodes in a cloud in a random order and can be reassembled on a requesting node. Each node downloads missing blocks from other nodes to which it is connected and also offers the blocks it already holds for upload to those nodes.
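The block-based exchange described above can be sketched as a toy simulation. This is not the BitTorrent® implementation or its block selection policy: it assumes every node is connected to every other node, picks missing blocks uniformly at random, and transfers one block per node per round. All function names are hypothetical.

```python
import random


def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Split the file contents into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def reassemble(blocks: dict[int, bytes], total: int) -> bytes:
    """Rebuild the file from blocks keyed by their index."""
    return b"".join(blocks[i] for i in range(total))


def simulate_swarm(data: bytes, block_size: int, num_peers: int, seed: int = 0) -> list[bytes]:
    """Return the file as reassembled by each peer after the exchange completes."""
    rng = random.Random(seed)
    blocks = split_into_blocks(data, block_size)
    total = len(blocks)
    # Node 0 is the original source and holds every block; peers start empty.
    have: list[dict[int, bytes]] = [dict(enumerate(blocks))]
    have += [{} for _ in range(num_peers)]

    while any(len(h) < total for h in have):
        for node in range(1, len(have)):
            missing = [i for i in range(total) if i not in have[node]]
            if not missing:
                continue
            # Request one missing block, chosen at random, from any node that has it.
            idx = rng.choice(missing)
            holders = [p for p in range(len(have)) if p != node and idx in have[p]]
            source = rng.choice(holders)  # never empty: node 0 holds all blocks
            have[node][idx] = have[source][idx]

    return [reassemble(h, total) for h in have[1:]]


if __name__ == "__main__":
    original = bytes(range(256)) * 4
    copies = simulate_swarm(original, block_size=64, num_peers=5)
    print(all(copy == original for copy in copies))
```

Because blocks arrive in a different random order at each peer, peers soon hold blocks that other peers lack, which is what allows them to serve one another rather than relying solely on the source.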
Despite their enormous potential and popularity, existing end-system cooperative schemes such as BitTorrent® can suffer from inefficiencies in some situations, which decrease their overall performance. Such inefficiencies are more pronounced in large and heterogeneous populations, during flash crowds, in environments with high churn, or where co-operative incentive mechanisms are in place. The present invention is concerned with ways in which network topology management and other methods can be used to reduce or alleviate some or all of these problems.