Peer-to-peer (P2P) content distribution over the Internet is rapidly increasing. Measurement studies of popular P2P systems have shown that P2P traffic makes up the largest portion of Internet traffic. Most of this P2P traffic is content provided by the P2P users themselves. These users can join the P2P network via the Internet by connecting to active peers in the P2P network.
One problem with this increase in P2P traffic, however, is that this type of traffic is detrimental to Internet Service Providers (ISPs) because it increases an ISP's cost of doing business. The ISP passes this increased cost on to users in the form of higher prices for Internet access, which makes users unhappy. Thus, P2P traffic is a double-edged sword for the ISP, which wants to keep prices low to attract customers but serves users who want to exchange content using P2P content distribution.
The Internet consists of thousands of ISPs, which operate at different scales and serve different roles. Some ISPs provide Internet access to end-users and businesses, while others provide access to ISPs themselves. The relationships between ISPs can be summarized into 3 categories. First, there can be a customer-provider relationship (also called a transit relationship), which refers to one ISP purchasing Internet access from another ISP and paying for the bandwidth usage. Second, there can be a sibling relationship, which refers to the inter-connection among several ISPs belonging to the same organization. Even though each ISP might be managed separately from the perspective of network administration, traffic exchange among them does not involve any payment. Third, there can be a peering relationship, which refers to ISPs pairing with each other. Peering ISPs can exchange traffic directly that would otherwise have to go through their providers. This is a common relationship adopted to lower ISPs' payments to their own providers. To a certain extent, the traffic exchanged between two peering ISPs is free. However, when the traffic becomes highly asymmetric, one party will start charging the other based on bandwidth usage. Based on the ISP relationships, ISPs can be grouped together to form economic entities, whereby no payment is involved for traffic within an entity but traffic crossing entity boundaries does incur payment. Based on the sibling and peering relationships, such economic entities can be formed at two levels: (1) a sibling entity includes all ISPs that are siblings to each other; and (2) a peering entity includes not only all siblings, but also all ISPs that are peering with each other.
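The two levels of economic entity described above can be sketched with a small union-find grouping. This is only an illustrative sketch: the ISP names, the link list, and the `build_entities` helper are hypothetical, and treating peering as transitive when merging ISPs into one peering entity is an assumption made here for simplicity.

```python
from collections import defaultdict


def build_entities(isps, links, kinds):
    """Group ISPs into entities joined by links whose kind is in `kinds`.

    Hypothetical helper: links are (isp_a, isp_b, kind) tuples, and
    connectivity is treated as transitive (an assumption of this sketch).
    """
    parent = {isp: isp for isp in isps}

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, kind in links:
        if kind in kinds:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for isp in isps:
        groups[find(isp)].add(isp)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical ISPs and relationships, for illustration only.
ISPS = ["A", "B", "C", "D", "E"]
LINKS = [
    ("A", "B", "sibling"),
    ("B", "C", "peering"),
    ("C", "D", "transit"),   # customer-provider: never merged into an entity
    ("D", "E", "sibling"),
]

sibling_entities = build_entities(ISPS, LINKS, {"sibling"})
peering_entities = build_entities(ISPS, LINKS, {"sibling", "peering"})
print(sibling_entities)
print(peering_entities)
```

In this sketch, traffic between two ISPs in the same returned group would be treated as free of payment, while traffic between groups (for example over the transit link between C and D) would incur payment.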
One reason that P2P traffic increases an ISP's cost of doing business is that peers in a P2P network randomly choose logical neighbors without any knowledge of the underlying physical topology. In a P2P system, all participating peers form an abstract, logical network, called an overlay network, over a physical network. Random neighbor selection can cause a topology mismatch between the logical P2P overlay network and the underlying physical network. This mismatch can decrease traffic to nearby peers and increase traffic to distant peers, which incurs additional transit costs for the ISP.
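To give a rough sense of how much traffic random neighbor selection can push across entity boundaries, the following sketch computes the expected fraction of overlay links that connect peers in different entities when each peer picks neighbors uniformly at random. The function name and the peer counts are hypothetical, and uniform selection over all other peers is an assumption of this sketch, not a model of any particular P2P system.

```python
def cross_entity_fraction(peers_per_entity):
    """Expected fraction of random peer pairs that span two entities.

    Assumes each neighbor is drawn uniformly from all other peers.
    `peers_per_entity` is a list of peer counts, one per entity.
    """
    total = sum(peers_per_entity)
    if total < 2:
        return 0.0
    # Ordered pairs of distinct peers that share an entity.
    same_entity_pairs = sum(n * (n - 1) for n in peers_per_entity)
    return 1.0 - same_entity_pairs / (total * (total - 1))


# Hypothetical swarm: 100 peers spread over three entities.
print(round(cross_entity_fraction([50, 30, 20]), 4))
```

Under these assumptions, the more evenly the peers are spread over many entities, the larger the share of links that cross entity boundaries.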
Another reason that P2P traffic increases an ISP's cost of doing business is that a P2P network without locality awareness can increase traffic congestion on the ISP's backbone. In the Internet, there is usually more bandwidth available at the local level than at the global level. By way of example, multiple computers within a home are usually connected through a high-bandwidth in-home network. In comparison, the outgoing bandwidth, which is usually served by a cable modem or ADSL modem, is typically one or two orders of magnitude lower than that of the in-home network. Multiple homes in a neighborhood are often connected through a switch with abundant intra-neighborhood bandwidth, while the outgoing link to the ISP gateway router may be provisioned with rather limited bandwidth. If locality information is not used in the P2P application, the majority of the traffic will flow inefficiently over long distances. In this case, the backbone will be congested and the quality of service experienced by the end users will be reduced.
While P2P content distribution can be deployed to significantly lower the bandwidth costs of content providers, a significant amount of traffic will cross entity boundaries if the deployment does not consider the economics of ISPs. Such a deployment represents one extreme, which focuses solely on minimizing the bandwidth costs of the content providers. The other extreme is to restrict P2P traffic so that it is contained within entity boundaries, for example, by forbidding peers from sharing content with peers that are from a different ISP. This, however, might not fully utilize the potential of P2P. When an entity contains few peers, sharing becomes difficult, and the content provider's bandwidth usage increases accordingly. For practical P2P content distribution, it is important to strike a balance between these two extremes. In doing so, deployments will provide significant reductions in bandwidth costs to content providers and will improve the quality of service experienced by the end users without generating unacceptable levels of traffic across ISP boundaries.
The Internet can be viewed as a hierarchy with multiple levels. A first step is to identify the locality neighborhood of each computer. The bottom level of the hierarchy is the in-home or in-corporation network behind a single Internet IP address. Multiple computers within a home or a small corporation may be linked to the Internet through a network address translator (NAT) device. In-home locality can be identified when multiple computers are found to share the same external IP address.
The next level of Internet locality is the Internet subnet neighborhood. Peers that have the same subnet gateway and subnet mask are identified as being in the same subnet neighborhood. These are usually peers connected to the same local switch, with abundant intra-neighborhood bandwidth.
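The two lowest locality levels described above can be sketched as a simple comparison between two peers. This is a minimal sketch under stated assumptions: the `Peer` record, its field names, and the `locality` function are hypothetical, the addresses come from documentation ranges, and the gateway and mask are assumed to describe the peer's public-side subnet (private gateways behind different NAT devices could otherwise match by coincidence).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    """Hypothetical peer record used only for this illustration."""
    name: str
    external_ip: str  # public address seen by other peers
    gateway: str      # gateway of the public-side subnet (assumed)
    netmask: str      # mask of the public-side subnet (assumed)


def locality(a, b):
    """Return the closest locality level shared by peers a and b."""
    if a.external_ip == b.external_ip:
        return "in-home"   # same external IP: behind the same NAT device
    if a.gateway == b.gateway and a.netmask == b.netmask:
        return "subnet"    # same subnet gateway and subnet mask
    return "remote"        # no shared low-level locality


p1 = Peer("p1", "203.0.113.10", "203.0.113.1", "255.255.255.0")
p2 = Peer("p2", "203.0.113.10", "203.0.113.1", "255.255.255.0")
p3 = Peer("p3", "203.0.113.20", "203.0.113.1", "255.255.255.0")
p4 = Peer("p4", "198.51.100.7", "198.51.100.1", "255.255.255.0")

print(locality(p1, p2), locality(p1, p3), locality(p1, p4))
```

The checks are ordered from the most local level outward, so a pair of peers is always labeled with the closest neighborhood it shares.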
At a higher level of the hierarchy, the Internet can be viewed as a network of autonomous systems. An autonomous system (AS) is a subnetwork under separate administrative control that has a common routing policy to the Internet. Within an AS, routing is done through an interior gateway protocol (IGP). Across ASes, routing is done through the Border Gateway Protocol (BGP). Examples of ASes are the networks of big companies or universities, national research networks, local Internet service providers (ISPs), and international backbone providers. Currently, there are more than 15,000 ASes on the Internet. Each AS is connected to one or several other ASes with direct, physical links. A unique AS number (ASN) is allocated to each AS for use in routing.
The Internet Protocol (IP) address of a peer can be mapped to one or more ASNs for the peer. If the network is multi-homed, an IP range can be mapped to multiple ASes. The ASN can be used to determine to which Internet Service Provider the peer belongs. Typically, each AS maintains its own policies. The ISP controls the traffic within each AS and sets up policies for traffic leaving the AS (such as which paths to take). Current P2P content distribution schemes use peer overlay constructions that consider each AS a separate entity. Relationships between the peers typically are not accounted for when building the peer overlay.
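The mapping from a peer's IP address to one or more ASNs can be sketched as a lookup in a prefix table. The table contents, the `asns_for_ip` function, and the choice of longest-prefix matching are assumptions made for illustration; the prefixes and ASNs are taken from ranges reserved for documentation, and a real deployment would derive the table from routing data.

```python
import ipaddress

# Hypothetical prefix-to-ASN table (documentation prefixes and ASNs).
PREFIX_TO_ASNS = {
    "192.0.2.0/24": [64496],
    "198.51.100.0/24": [64497, 64498],   # multi-homed range: two ASNs
    "198.51.100.128/25": [64499],        # more specific prefix
}

TABLE = [(ipaddress.ip_network(p), asns) for p, asns in PREFIX_TO_ASNS.items()]


def asns_for_ip(ip):
    """Return the ASNs of the most specific prefix covering `ip`.

    Longest-prefix matching is an assumption of this sketch.
    Returns an empty list when no prefix in the table covers the address.
    """
    addr = ipaddress.ip_address(ip)
    matches = [(net, asns) for net, asns in TABLE if addr in net]
    if not matches:
        return []
    _, asns = max(matches, key=lambda m: m[0].prefixlen)
    return list(asns)


print(asns_for_ip("198.51.100.5"))    # address in the multi-homed range
print(asns_for_ip("198.51.100.200"))  # address in the more specific prefix
```

An overlay builder could combine the returned ASNs with ISP relationship information to decide whether two peers fall within the same economic entity.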