Peer-to-peer (P2P) technology has been a hot topic in the field of Internet applications in recent years. It provides a new and efficient way for Internet users to share resources. Node selection is a key issue in a P2P network. When a data node wants to exchange a resource (i.e., a data item) with other data nodes, it may issue a request to a server in the P2P network. The server finds the data nodes having this resource, selects some of them, and returns them to the requester. The requester may then download the desired resource directly from these nodes.
Existing P2P systems usually select nodes at random from the nodes having a resource. Such a method cannot make efficient use of the network. For example, a tremendous amount of network traffic goes through the backbone of the Internet Service Provider (ISP), which imposes a large transmission load on the backbone. Alternatively, network traffic is frequently transmitted across ISPs, which results in a lot of cross-network (cross-ISP) traffic. Furthermore, such a method also degrades the quality and performance of the service provided, because even if there are nodes with low communication delay in the P2P network, the random node selection mechanism may select other nodes having high communication delay.
To solve this problem, a new P2P system based on location information has been proposed. When selecting nodes, this system preferentially selects “adjacent” nodes, which avoids the drawbacks of the random node selection mechanism. Thus, the network is used efficiently, cross-network traffic is reduced, and application performance is improved.
In Chinese Patent Application Publication CN101018172A entitled “Method for Optimizing P2P Transmission within Metropolitan Area Network” published on Aug. 15, 2007 (Document 1), an optimized method for P2P applications in a metropolitan area network is disclosed. In Document 1, topology servers and indexing servers are added in an attempt to confine P2P traffic to the edge of the network, so as to reduce the transmission load on the backbone and hence avoid network congestion caused by P2P.
In “P4P: Provider Portal for Applications” by Haiyong Xie, Y. Richard Yang, Arvind Krishnamurthy, Yanbin Liu, and Avi Silberschatz, SIGCOMM 2008 (Document 2), a new architecture called P4P is proposed, which enables more efficient cooperation between a P2P application and the ISP in controlling network traffic. This mechanism can reduce the cost to the ISP while maintaining or even improving the performance of an existing P2P application.
In Chinese Patent Application Publication CN101237467A entitled “Mobile P2P Network Resource Discovering Method Introducing Vector Locating” published on Aug. 6, 2008 (Document 3), a mobile P2P network resource discovering method which introduces vector locating is disclosed. In this method, a polar coordinate locating theory is introduced to divide the entire cellular network by home region, distance and direction and establish a new routing table containing location vector information, to thereby enable quick and accurate locating for a cellular network and bi-directional look-up of resources.
FIG. 1 shows the structure of the metropolitan area network described in Document 1. As a data node in a P2P network, a user computer generally first accesses a building switch, then connects to a cell switch, then to an access layer switch or router (referred to as “access switch” hereinafter), then to an aggregation layer switch or router (referred to as “aggregation switch” hereinafter), and finally to a core switch or router. The core network of the metropolitan area network generally consists of one or more core switches or routers. Typically, a building switch constitutes a subnet.
The optimized method for P2P transmission within the metropolitan area network in Document 1 includes: 1) when a P2P node wants to download a resource (this node is called the requesting P2P node), querying a P2P indexing server for a list of P2P nodes having this resource (these nodes are called resource P2P nodes); 2) finding, among the resource P2P nodes, the nodes most adjacent on the network to the requesting P2P node; and 3) downloading, by the requesting P2P node, the resource from the one or more most adjacent resource P2P nodes.
The algorithm for selecting adjacent nodes for the requesting P2P node is as follows: first, nodes attached to the same building switch as the requesting P2P node are selected; then nodes attached to the same cell switch; then nodes attached to the same access switch; and finally nodes attached to the same aggregation switch. If the number of selected nodes is still less than the requested number, all the nodes having the resource are considered adjacent nodes. This method keeps P2P traffic as far from the core network as possible, at the edge of the network, and thus reduces the P2P traffic flowing through backbone networks such as core networks and aggregation networks.
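The level-by-level selection described above can be sketched as follows. This is only an illustration, not the implementation of Document 1: the function name `select_adjacent`, the node names, and the representation of each node's position as a tuple of (aggregation, access, cell, building) switch codes are assumptions. The way the fallback keeps the already selected nodes first and then fills up from the remaining holders is likewise an assumed reading of "all the nodes having the resource are considered adjacent nodes".

```python
from typing import Dict, List, Tuple

# Assumed representation: (aggregation, access, cell, building) switch codes.
Location = Tuple[int, int, int, int]


def select_adjacent(requester: Location,
                    holders: Dict[str, Location],
                    wanted: int) -> List[str]:
    """Pick up to `wanted` resource holders, widening the search level by level."""
    selected: List[str] = []
    # depth 4 = same building switch, 3 = same cell switch,
    # 2 = same access switch, 1 = same aggregation switch.
    for depth in (4, 3, 2, 1):
        for node, loc in holders.items():
            if node not in selected and loc[:depth] == requester[:depth]:
                selected.append(node)
        if len(selected) >= wanted:
            return selected[:wanted]
    # Fallback (assumed reading): every holder counts as adjacent, so the
    # list is filled up from the remaining holders in arbitrary order.
    rest = [node for node in holders if node not in selected]
    return (selected + rest)[:wanted]


# Hypothetical holders of one resource.
holders = {
    "a": (2, 3, 5, 4),  # same building switch as the requester
    "b": (2, 3, 5, 5),  # same cell switch
    "c": (2, 3, 6, 1),  # same access switch
    "d": (2, 4, 1, 1),  # same aggregation switch
    "e": (7, 1, 1, 1),  # elsewhere in the metropolitan area network
}
print(select_adjacent((2, 3, 5, 4), holders, 3))
```

Within one level the order of the returned nodes is simply the iteration order of `holders`; the method gives no basis for ranking nodes that share the same level.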
To find adjacent nodes, the switches or routers in the metropolitan area network need to be coded. As shown in FIG. 1, s1 is the code for an aggregation switch, with a value from 1 to n1; s2 is the code for an access switch, with a value from 1 to n2; s3 is the code for a cell switch, with a value from 1 to n3; and s4 is the code for a building switch, with a value from 1 to n4. The codes of the switches or routers traversed on the way from the core network to each computer constitute the location vector (location information) of the computer: S=(s1, s2, s3, s4).
The calculation method for finding adjacent nodes is as follows. The location vectors of two nodes are defined as S=(s1, s2, s3, s4) and S′=(s1′, s2′, s3′, s4′) respectively. Then, the distance vector D between these two nodes is:
D = (d1, d2, d3, d4)
  = (s1, s2, s3, s4) − (s1′, s2′, s3′, s4′)
  = (s1 − s1′, s2 − s2′, s3 − s3′, s4 − s4′),
where when si=si′, di=0, and when si≠si′, di=1.
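As a small illustration of this definition, the component-wise rule (di=0 when the codes match, di=1 otherwise) can be written as below. The function name `distance_vector` is an assumption made for this sketch; the two location vectors are the ones listed for the two subnets in Table 1.

```python
from typing import Tuple

Location = Tuple[int, int, int, int]


def distance_vector(s: Location, s_prime: Location) -> Tuple[int, ...]:
    """d_i is 0 where the switch codes match and 1 where they differ."""
    return tuple(0 if a == b else 1 for a, b in zip(s, s_prime))


# The two nodes differ only in their building switch code.
print(distance_vector((2, 3, 5, 4), (2, 3, 5, 5)))  # (0, 0, 0, 1)
```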
The method for comparing the sizes of two distance vectors D=(d1, d2, d3, d4) and D′=(d1′, d2′, d3′, d4′) is defined as:
when d1=d1′, d2=d2′, d3=d3′, and d4=d4′, D=D′;
when any one of the following holds, D>D′:
d1=1 and d1′=0; or
d1=d1′, d2=1, and d2′=0; or
d1=d1′, d2=d2′, d3=1, and d3′=0; or
d1=d1′, d2=d2′, d3=d3′, d4=1, and d4′=0.
The smaller the distance vector D is, the more adjacent the two nodes are on the network.
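The comparison rule amounts to a lexicographic ordering of the distance vectors, with d1 as the most significant component. A minimal sketch, in which the function name `compare` and its return convention (1, 0, -1) are assumptions:

```python
from typing import Tuple

Vec = Tuple[int, int, int, int]


def compare(d: Vec, d_prime: Vec) -> int:
    """Return 1 if D > D', -1 if D < D', and 0 if D = D'."""
    # The first component that differs decides the result.
    for a, b in zip(d, d_prime):
        if a != b:
            return 1 if a > b else -1
    return 0


print(compare((1, 0, 0, 0), (0, 1, 1, 1)))  # 1: d1 dominates the lower levels
```

Because the components are only 0 or 1, this ordering coincides with Python's built-in tuple comparison, so a list of distance vectors could also be ranked with `sorted()` directly.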
The location information table in the topology server stores the topology information of the metropolitan area network, as shown in Table 1.
TABLE 1
Location Information Table

Subnet IP Address    Aggregation Switch Code s1    Access Switch Code s2    Cell Switch Code s3    Building Switch Code s4
. . .                . . .                         . . .                    . . .                  . . .
10.30.11.65/26       2                             3                        5                      4
10.30.11.129/26      2                             3                        5                      5
. . .                . . .                         . . .                    . . .                  . . .
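One way a topology server might resolve a computer's IP address to its location vector using such a table is sketched below with Python's standard `ipaddress` module. The names `LOCATION_TABLE` and `lookup_location` and the linear scan are assumptions for illustration; Document 1 does not specify how the lookup is performed. Only the two rows shown in Table 1 are included.

```python
import ipaddress
from typing import Optional, Tuple

Location = Tuple[int, int, int, int]

# The two rows visible in Table 1: subnet -> (s1, s2, s3, s4).
LOCATION_TABLE = [
    ("10.30.11.65/26", (2, 3, 5, 4)),
    ("10.30.11.129/26", (2, 3, 5, 5)),
]


def lookup_location(ip: str) -> Optional[Location]:
    """Return the location vector of the subnet containing `ip`, if any."""
    addr = ipaddress.ip_address(ip)
    for subnet, location in LOCATION_TABLE:
        # strict=False because the table entries have host bits set
        # (e.g. 10.30.11.65/26 denotes the network 10.30.11.64/26).
        if addr in ipaddress.ip_network(subnet, strict=False):
            return location
    return None


print(lookup_location("10.30.11.70"))   # (2, 3, 5, 4)
print(lookup_location("10.30.11.130"))  # (2, 3, 5, 5)
```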
In Document 1, the location information of a computer in a P2P network is expressed by the specific physical location of the computer in the metropolitan area network: the switches or routers in the metropolitan area network are coded, and the codes of the switches and routers traversed on the way from this computer to the core network constitute the location vector of the computer. Such an expression will be called hierarchical-coding-based location information expression hereinafter.
This hierarchical-coding-based location information expression is limited in terms of both accuracy and scalability.
Accuracy means whether the nodes returned by the system are indeed the nodes “adjacent” on the network, and whether the node selection thus made can not only improve application performance but also make more efficient use of the network. For example, in the hierarchical coding method employed in Document 1, when a plurality of nodes are on the same level of the tree, the distances between these nodes and the requesting node cannot be compared. In the example shown in FIG. 1, if a node under an access switch requests 10 nodes having a data item, and there are 4 nodes under this access switch having this data item, then these 4 nodes are selected as adjacent nodes. There are still 96 nodes having this data item under the aggregation switch to which the access switch belongs, and the distances between these 96 nodes and the requesting node cannot be compared by the hierarchical coding method, although some of these nodes are more adjacent on the network to the requesting node than others.
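The example can be reproduced with a short sketch. All location vectors below are hypothetical; they are chosen only so that 4 holders share the requester's access switch and 96 further holders share only its aggregation switch. The helper `shared_levels` is likewise an assumption: it counts how many leading switch codes two location vectors have in common, which is the only information the hierarchical coding provides.

```python
from collections import Counter
from typing import Tuple

Location = Tuple[int, int, int, int]


def shared_levels(a: Location, b: Location) -> int:
    """Number of leading switch codes that two location vectors share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


requester: Location = (1, 1, 1, 1)
# 4 holders under the requester's access switch (other cell switches).
near = [(1, 1, c, 1) for c in range(2, 6)]
# 96 holders under the same aggregation switch but other access switches.
far = [(1, a, c, 1) for a in range(2, 10) for c in range(1, 13)]

tiers = Counter(shared_levels(requester, loc) for loc in near + far)
print(tiers[2], tiers[1])  # 4 96
```

The 6 nodes still needed must be drawn from a single tier of 96 candidates that the coding cannot tell apart, even though their actual network distances to the requester may differ.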
Scalability means whether the system can be conveniently extended to a network of a larger scale or even the entire Internet. The hierarchical-coding-based location information expression method is suitable for a metropolitan area network in which topology information is completely known (i.e., a metropolitan area network in which it is known through which switches or routers each computer is connected to the core network). However, it is not easy to extend this method to a larger network or even the entire Internet, because in a larger network it is very difficult to know topology information of all parts of the network.
In addition, hierarchical coding typically requires deciding in advance how many levels are used to express location information and what information each level expresses. For example, in the method of Document 1, location information is expressed by four levels: aggregation switch, access switch, cell switch, and building switch. This also restricts the flexibility and scalability of the method.