1. Field of the Invention
The present invention relates generally to file transfer in a network. In particular, the present invention is directed to streaming of files in a peer-to-peer network.
2. Description of Background Art
Content providers typically either stream audio, video, or other content directly from a server farm or employ a content distribution network (CDN), such as Akamai. A CDN comprises a set of servers or application-layer routers sprinkled about the Internet, and can thus solve problems that are not addressed by a single server farm. When a portion of the network becomes congested or experiences a localized failure, a CDN can shift users between servers or route between different application-layer routers. When the network becomes partitioned, subsets of the topology that contain a working server typically continue operating.
Although CDNs address critical problems, CDNs, like the traditional server-client model, require substantial infrastructure. For example, Akamai operates tens of thousands of servers all over the world. The number of servers scales linearly with the number of concurrent users.
In the last few years, a new paradigm has arisen in which end-systems form a peer-to-peer (P2P) network. An end-system acting as a peer contributes its bandwidth and storage to aid in the distribution of content among the peers. This paradigm has no inherent bandwidth scaling limit since arriving peers add more capacity to the system. Two recent examples of this paradigm are BitTorrent and Avalanche.
To encourage peers to contribute their bandwidth resources, BitTorrent uses a tit-for-tat incentive mechanism in which each peer sends to those peers that send the most to it. So that peers have content that other peers want, BitTorrent divides the file into pieces and distributes the pieces in rarest-first order. Avalanche ensures that peers have information of interest by using network coding. In either case, a file is not playable until the entire file has been received.
Conventional solutions to content distribution can be divided generally into one of four categories: infrastructure-only solutions, end system multicast, peer-to-peer file sharing, and peer-to-peer streaming.
Infrastructure-Only Solutions
The traditional server-client model divides nodes into servers 102 that provide the data and clients 104 that consume it, as shown in FIG. 1(a). A corporate network or an Internet Service Provider (ISP) may provide proxy caches to reduce latency and network load by storing content near the users on its network. A CDN may achieve similar goals by replicating content out to server farms that are sprinkled about the Internet topology, as shown in FIG. 1(b). For live content, a CDN cannot replicate content ahead of time, but the CDN can create an application-layer multicast tree that spans the data centers, with each data center then taking end-users as leaves, as shown in FIG. 1(c).
However, all of these strategies require infrastructure that grows linearly with the number of users. To see this, divide all nodes into either infrastructure nodes 202 or end-systems 204, as shown in FIG. 2. The infrastructure nodes include servers, proxy caches, and application-layer routers. Regardless of how the infrastructure nodes are arranged internally, in order to serve n concurrent users at a minimum bit rate r_o, the infrastructure must deliver n users' worth of content at an aggregate rate of n·r_o. Assuming each infrastructure node has fixed capacity, for large n, the number of infrastructure nodes must be at least linear in n, i.e., Ω(n).
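The linear bound can be illustrated numerically. The following is a minimal sketch, assuming every infrastructure node has the same fixed upstream capacity; the function name, parameter names, and example figures are hypothetical and do not appear in the text above.

```python
import math


def min_infrastructure_nodes(n_users: int, bit_rate_bps: int, node_capacity_bps: int) -> int:
    """Lower bound on the number of infrastructure nodes needed to serve
    n_users concurrently at bit_rate_bps each.

    Assumption (not from the source text): all nodes share one fixed
    capacity, node_capacity_bps.
    """
    if n_users < 0 or bit_rate_bps <= 0 or node_capacity_bps <= 0:
        raise ValueError("n_users must be >= 0 and rates must be positive")
    aggregate_rate = n_users * bit_rate_bps  # the n * r_o term
    return math.ceil(aggregate_rate / node_capacity_bps)


if __name__ == "__main__":
    # Illustrative figures only: 1 Mbps streams, 100 Mbps per node.
    for n in (1_000, 2_000, 4_000):
        print(n, min_infrastructure_nodes(n, 1_000_000, 100_000_000))
```

Doubling the number of users doubles the lower bound, which is the linear growth described above.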
Examples of the traditional server-client model include Real Networks' Helix server with the RealOne Player client, and Microsoft Windows Media Services with the Windows Media Player client. Examples of content distribution networks that may replicate content out to data centers before delivery include Akamai and Real Networks.
End-System Multicast
Alternatively, end-systems can arrange themselves into a tree as shown in FIG. 3(a) and forward content that they receive. This is conventionally known as end-system multicast (ESM). However, no proposed end-system multicast scheme has a fully decentralized incentive mechanism. The importance of such incentive mechanisms is underscored in two studies of peer-to-peer file sharing systems that lack incentive mechanisms: in Gnutella, as many as 70% of the peers share no files, and Napster at the height of its popularity had 20%-40% of its peers sharing few or no files.
End-system multicast is an example of the Peer-to-Peer (P2P) paradigm. With P2P, peers forward content to each other. Thus P2P networks leverage the peers' otherwise unused upstream bandwidth to add capacity to the system. For media encoded at a bit rate less than the peers' upstream capacity, each peer adds more system capacity than it consumes. In theory, a single peer or server could then serve any number of end-systems. However, to route content through an end-system requires passing the content through a potentially congested and lossy access network to forward through a node that may depart at any time. Thus, quality, rather than capacity, limits scale.
A peer exhibits peer reliance when it relies on one or more peers to provide content, but either does not allow enough time to contact, or by design is disallowed from contacting, a reliable infrastructure node such as a server to provide missing or damaged content should the peers fail. The most basic form of ESM exhibits peer reliance. FIG. 3(b) omits the network cloud and shows the end-systems arranged in a tree rooted at an infrastructure node. The server has capacity C, each end-system has capacity C, and the content passes through a maximum of k end-system hops to reach the end-systems. Even if the peers use a reliable transport protocol such as TCP to retransmit lost packets at each peer-to-peer hop, a peer may still depart unexpectedly, requiring its children to spend seconds, perhaps tens of seconds, to distinguish a burst loss from the departure of a parent and then to repair the tree. Of even more concern is that the departure of a single end-system near the root of the tree can result in a disruption for a significant fraction of the entire audience. Assuming that each node has probability p of unexpectedly departing within any given time interval, and that departures are independent, the probability that at least one node on the path to the source departs unexpectedly in that interval is 1 − (1 − p)^k, where k is the number of intermediate hops to the source. At some depth, the frequency of departures results in intolerable performance. Similar arguments can be made for packet loss. As shown in FIG. 3(c), this means that ESM also requires O(n) infrastructure when playback quality is taken into account.
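To make the depth limit concrete, the following is a minimal sketch that evaluates the departure probability for a path of k intermediate hops and searches for the deepest tree that stays within a chosen disruption tolerance. It assumes independent departures with the same per-node probability p. The function names and the tolerance parameter are hypothetical, introduced only for illustration.

```python
def path_disruption_probability(p: float, k: int) -> float:
    """Probability that at least one of k intermediate hops departs
    unexpectedly in a given interval: 1 - (1 - p)^k.

    Assumes independent departures with identical probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - p) ** k


def max_tolerable_depth(p: float, max_disruption: float) -> int:
    """Largest k for which the disruption probability stays at or below
    max_disruption (a hypothetical quality threshold)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if not 0.0 <= max_disruption < 1.0:
        raise ValueError("max_disruption must be in [0, 1)")
    k = 0
    # The probability increases with k and approaches 1, so the loop ends.
    while path_disruption_probability(p, k + 1) <= max_disruption:
        k += 1
    return k


if __name__ == "__main__":
    # Illustrative values only: 1% departure chance per node per interval.
    for hops in (1, 5, 10, 20):
        print(hops, round(path_disruption_probability(0.01, hops), 4))
    print(max_tolerable_depth(0.01, 0.05))
```

Under these assumptions, a fixed tolerance bounds the usable depth of the tree, so adding peers beyond that depth calls for more infrastructure-rooted trees rather than deeper ones.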
There are multiple ways of improving the robustness of ESM to packet loss and to unexpected end-system departures, each with trade-offs. For example, Splitstream and CoopNet construct multiple interior-node-disjoint multicast trees spanning the receivers, i.e., a node is an interior node on only one tree and is a leaf on all of the other trees. As a result, if a node fails, at most one tree is disrupted. All receivers still receive the remaining content, and if the media (audio, video) is encoded using Multiple Description Coding (MDC), the media is playable, albeit with some degradation in quality.
Unfortunately, MDC has high overhead, especially for video. It is difficult to make each description independently decodable without including a significant amount of redundant information, e.g., redundant motion vectors can consume 20% of the encoded bit rate. Furthermore, even though MDC increases robustness to packet loss, this merely increases the allowable depth of the tree, and thus does not change ESM from requiring O(n) infrastructure.
Peer-to-Peer File Sharing
BitTorrent contains one key contribution missing from prior peer-to-peer file sharing systems and proposed ESM systems: BitTorrent includes a decentralized incentive mechanism to encourage peers to share their upstream bandwidth. Using BitTorrent, the shared file is broken into pieces. Each peer opens connections with a set of peers desiring the same file, and each tells the others which pieces it has. Each peer then swaps pieces until it has a complete file. Upon obtaining the complete file, a peer that continues sharing is called a seed. To encourage peers to share their upstream capacity, BitTorrent peers engage in a rate-based tit-for-tat: each peer sends to the four peers that send the fastest to it. To allow peers to compete for those top four spots, each peer randomly admits one additional peer every thirty seconds and stops sending to the slowest.
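The rate-based tit-for-tat described above can be sketched as a single selection round. This is a simplified illustration, not the logic of any particular BitTorrent client: the function name, the peer identifiers, and the rate table are hypothetical, and the thirty-second timer is left to the caller.

```python
import random
from typing import Dict, List, Optional, Tuple


def choose_upload_targets(
    rates_from_peers: Dict[str, float],
    top_n: int = 4,
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], Optional[str]]:
    """Pick the peers to send to in one round.

    Returns (regular, trial): `regular` holds the top_n peers that send
    the fastest to us; `trial` is one additional peer chosen at random
    from the rest so that it can compete for a top spot, or None if no
    other peer exists.
    """
    rng = rng or random.Random()
    # Rank peers by the rate at which they currently send to us.
    ranked = sorted(rates_from_peers, key=lambda peer: rates_from_peers[peer], reverse=True)
    regular = ranked[:top_n]
    others = ranked[top_n:]
    trial = rng.choice(others) if others else None
    return regular, trial


if __name__ == "__main__":
    # Hypothetical observed download rates in kB/s.
    rates = {"a": 50.0, "b": 10.0, "c": 40.0, "d": 30.0, "e": 20.0, "f": 5.0}
    regular, trial = choose_upload_targets(rates, rng=random.Random(0))
    print(regular, trial)
```

A caller would rerun this selection periodically with fresh rate measurements; a trial peer that reciprocates quickly enough displaces the slowest of the regular targets in the next round.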
FIG. 4 illustrates a BitTorrent peer-to-peer network, comprised of two kinds of end-systems: downloaders 402 and seeds 404. Seeds have a complete copy of the file desired by the downloaders. The downloaders download from the seeds and trade the downloaded pieces with other downloaders.
Peer-to-Peer Streaming
End-system multicast is an example of peer-to-peer streaming. Recently, CoolStreaming introduced the notion of a data-driven overlay network for live media streaming. The data-driven overlay arranges peers into a graph. The availability of needed data determines the direction data flows through the graph. CoolStreaming has been demonstrated delivering live TV-quality streaming (450 Kbps). However, CoolStreaming provides no incentive mechanisms for users to contribute their upstream bandwidth.
SwarmStreaming is a product offered by Onion Networks Inc., of Minneapolis, Minn., which breaks media files into pieces and distributes the pieces across a peer-to-peer network. These pieces are then delivered between peers sufficiently in-order to allow playback before the entire file has been downloaded. However, SwarmStreaming does not provide incentives for users to provide their upstream capacity.
BitTorrent Assisted Streaming System (BASS) is not quite peer-to-peer streaming, but rather employs a hybrid P2P/server-client model that scales better than the traditional server-client model and avoids peer reliance, thus providing a similar user experience to the traditional server-client model. BASS is further described in Chris Dana, Danjue Li, David Harrison, and Chen-Nee Chuah, “Bass: Bittorrent assisted streaming system for video-on-demand,” in IEEE International Workshop on Media Signal Processing 2005, October 2005, incorporated by reference herein.
To enable a peer to begin playing before it has received the entire file, BASS introduces a server 502 or other infrastructure node that provides any piece that a peer could not download from BitTorrent in time for playback, as illustrated in FIG. 5. BitTorrent is not modified except that BASS will not download missing pieces for an already played portion of the file. Thus, BASS downloads pieces in rarest-first order from the interval [t, T], where t is the user's current playback time and T is the duration of the file.
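The restricted rarest-first selection can be sketched as follows. This is a minimal illustration under stated assumptions: the playback time t is assumed to have already been mapped to a piece index, ties are broken by lowest index for determinism, and the function and parameter names are hypothetical rather than taken from BASS or BitTorrent.

```python
from typing import Dict, Optional, Set


def next_piece_to_request(
    have: Set[int],
    availability: Dict[int, int],
    playback_piece: int,
    num_pieces: int,
) -> Optional[int]:
    """Choose the rarest missing piece at or after the playback position.

    have:           indices of pieces already downloaded
    availability:   piece index -> number of connected peers holding it
    playback_piece: index of the piece at the current playback time t
    num_pieces:     total number of pieces in the file (covers up to T)
    """
    candidates = [
        index
        for index in range(playback_piece, num_pieces)
        if index not in have and availability.get(index, 0) > 0
    ]
    if not candidates:
        return None  # nothing useful is available from peers right now
    # Rarest first; the lowest index breaks ties (an assumption made here).
    return min(candidates, key=lambda index: (availability[index], index))


if __name__ == "__main__":
    # Hypothetical swarm state for a five-piece file.
    availability = {0: 1, 1: 5, 2: 2, 3: 7, 4: 2}
    print(next_piece_to_request({1}, availability, playback_piece=1, num_pieces=5))
```

In the example, piece 0 is the rarest overall but lies before the playback position, so it is never requested. A piece that is still missing when playback reaches it would be fetched from the server 502 instead.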