An overlay network is a network that forms and configures virtual links on top of existing links, depending on the purpose of its higher layer. That is, an overlay network is a higher-layer network configured regardless of the topology of a lower layer in a computer network. An example is a P2P (Peer to Peer) network, which is configured regardless of the topology of the underlying IP network.
In the overlay network, a flexible network with high fault tolerance can easily be configured without requiring a central server. Nodes (peers) freely join or leave the network, and the network topology changes dynamically. All the nodes have equal authority, and each node can access data stored in any node connected to the network.
In the overlay network, when a node has information (IP addresses) only on its adjacent nodes, data is searched for by relaying a query to the adjacent nodes. This search method is well suited to ad hoc operation because no network topology needs to be maintained for the search, and it can also achieve fault tolerance by utilizing replication. However, it requires a longer search time and lacks scalability. On the other hand, when each node has information on all the nodes in the network, the search time is short, but the method is not appropriate in terms of scalability and ad hoc characteristics. Thus, how to search for the node which stores requested data is an important issue in the overlay network (P2P network).
As an efficient (high-speed) technique to search a large amount of data, a technique is employed in which each node on the overlay network holds a Distributed Hash Table (DHT) as route information. In the distributed hash table, the data are placed on a hash space, and each node is in charge of a certain range of the space and maintains and manages the data in that range. When a node searches for data by a key, the key is converted into a hash value by a hash function and mapped onto the hash space, and the target node and data are then obtained.
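As a rough illustration of the mapping described above, the following Python sketch hashes a key onto a small hash space and looks up the node in charge of the resulting position. The 4-bit space, the use of SHA-1, the node IDs, and the helper names `hash_id` and `responsible_node` are all assumptions made for this example, and the rule that a node is in charge of the positions up to and including its own ID is only one possible convention.

```python
import hashlib
from bisect import bisect_left

BITS = 4                  # assumed size of the hash space: 2**4 = 16 positions
SPACE = 1 << BITS

def hash_id(key: str) -> int:
    """Map a key onto the hash space (SHA-1 reduced modulo SPACE; an assumption)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % SPACE

def responsible_node(node_ids, ident):
    """Return the first node ID at or after ident, wrapping around the ring."""
    ring = sorted(node_ids)
    pos = bisect_left(ring, ident)
    return ring[pos % len(ring)]

nodes = [3, 6, 9, 10, 13, 15]     # hypothetical node IDs
ident = hash_id("Foo")
print(ident, responsible_node(nodes, ident))
```

Because every node applies the same hash function, any node can compute the position of a key locally; only the step of finding the node in charge of that position requires routing.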
A plurality of search algorithms (routing algorithms) using the distributed hash table have been proposed, including Chord and Koorde. Chord is an algorithm that achieves a search efficiency of O(log N) hops by performing a search similar to a binary search on the annular hash space, based on a finger table (a table in which information on the nodes at ID distances 1, 2, 4, 8, . . . from the own node is registered). Koorde is an algorithm that uses a de Bruijn graph for routing instead of the finger table of Chord. The data search method of Koorde is described in non-patent document 1. Hereinafter, the data search method described in non-patent document 1 will be explained.
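As a small illustration of the finger table mentioned above, the following Python sketch lists the hash-space positions at offsets 1, 2, 4, 8, . . . from a node's own ID. The function name `finger_targets` and the 4-bit hash space are assumptions for the example, and the step of finding which actual node is in charge of each position is omitted.

```python
def finger_targets(node_id, bits):
    """Positions node_id + 2**k (mod 2**bits) for k = 0 .. bits-1."""
    space = 1 << bits
    return [(node_id + (1 << k)) % space for k in range(bits)]

print(finger_targets(13, 4))  # [14, 15, 1, 5]
```

The table holds one entry per bit of the ID, which is why the amount of routing state per node grows with log N in Chord, whereas Koorde keeps it constant.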
(1) The First Related Art:
FIG. 1 is a flow chart which shows a data search procedure of the first related art. FIG. 2 is an illustration which shows the de Bruijn graph. In the first related art, which node in the graph shown in FIG. 2 manages the data given a logical identifier Key is determined according to the procedure described in the flow chart shown in FIG. 1.
The graph shown in FIG. 2 is referred to as a de Bruijn graph. The de Bruijn graph D_k(n) is a graph whose nodes are the strings X of length n made of elements of the set Z_k = {0, 1, . . . , k−1}, and in which each string X has an arc to every string obtained by shifting X to the left by one position and adding an element x_0 ∈ Z_k at the least significant position. In the example shown in FIG. 2, k is two and n is four.
That is, in the de Bruijn graph, in case there are 2^n nodes having a logical identifier (node ID) of n bits, links are provided from each node m to the nodes (2m mod 2^n) and (2m+1 mod 2^n). For example, from a node whose logical identifier of 3 bits is xyz (the ID of the node is xyz), one-way links are provided to a node whose logical identifier is yz0 (the ID of the node is yz0) and a node whose logical identifier is yz1 (the ID of the node is yz1).
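The two outgoing links of a node can be computed directly from its ID. The following Python sketch assumes the binary case (k = 2) described above; the function name `de_bruijn_links` is hypothetical.

```python
def de_bruijn_links(m, bits):
    """Return the IDs reached from node m: (2m mod 2**bits, 2m+1 mod 2**bits)."""
    space = 1 << bits
    return (2 * m) % space, (2 * m + 1) % space

# 3-bit example: xyz = 101 links to yz0 = 010 and yz1 = 011
print([format(t, "03b") for t in de_bruijn_links(0b101, 3)])  # ['010', '011']
```

Multiplying by two and reducing modulo 2^n is the same operation as dropping the most significant bit and shifting the remaining bits to the left.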
In Koorde, routing on the de Bruijn graph is performed utilizing these one-way links. For example, when a message is transferred from a node whose logical identifier of 3 bits is xyz to a node whose logical identifier of 3 bits is abc, the message is routed through the nodes with the logical identifiers xyz, yza, zab and abc, in this order. That is, the message is transferred along the logical identifiers obtained by sequentially shifting to the left the string xyzabc, which is the concatenation of the logical identifier of the initial node and that of the terminal node.
As an example of searching for the node managing the logical identifier of data, a case is described where a node 213, whose logical identifier is 13 (binary notation 1101) in FIG. 2, searches for the node storing the data whose logical identifier is 9 (binary notation 1001).
In the processing shown in FIG. 1, the node 213 searches for “1001”, which is the value of the logical identifier Key, using a key shift kshift. The initial value of the key shift is assumed to be the same as the logical identifier Key of the data, that is, “1001”. First, the node 213 determines whether its own logical identifier m is the same as the logical identifier Key (step S101). Because the logical identifier of the node 213 is “1101”, which is different from the value “1001” of the logical identifier Key (No of step S101), the node 213 sets, as a node t, the logical identifier given by (2m+b) mod N, using the first bit b of the key shift kshift and the size N of the logical identifier space (here 16, because the identifiers have 4 bits) (step S103). Specifically, using the first bit 1 of the key shift “1001” and the logical identifier space size 16, the node t becomes (2×13+1) mod 16 = 11 (binary notation 1011). As described above, in binary notation this calculation is the same as shifting m to the left by 1 bit and placing the first bit of the key shift in the least significant bit.
Then the node 213 shifts the key shift kshift “1001” to the left by 1 bit, changing its value to “0010” (step S104). Then the node 211, whose logical identifier is 11 (binary notation 1011), searches for the logical identifier Key value “1001” with the key shift kshift value “0010” (step S105). This corresponds to moving the processing shown in FIG. 1 (steps S101-S105) to the node 211: the processing of steps S101-S105 is performed in the node 211 as a subroutine of step S105 performed in the node 213.
Because the logical identifier “1011” of the node 211 itself is different from the logical identifier Key value “1001” of the data (No of step S101), the value of the node t becomes “0110” ((2×11+0) mod 16 = 6) based on zero, which is the first bit of the key shift “0010”, and the logical identifier space size 16 (step S103), and the value of the key shift kshift becomes “0100” (step S104). The processing then moves from the node 211 to a new node 206 whose logical identifier is “0110” (step S105). Thereby, the node 206 starts to search for the logical identifier Key value “1001” with the key shift kshift value “0100”.
Because the logical identifier “0110” of the node 206 itself is different from the logical identifier Key value “1001” of the data (No of step S101), the value of the node t becomes “1100” ((2×6+0) mod 16 = 12) based on zero, which is the first bit of the key shift “0100”, and the logical identifier space size 16 (step S103), and the value of the key shift kshift becomes “1000” (step S104). The processing then moves from the node 206 to a new node 212 whose logical identifier is “1100” (step S105). Thereby, the node 212 starts to search for the logical identifier Key value “1001” with the key shift kshift value “1000”.
Because the logical identifier “1100” of the node 212 itself is different from the logical identifier Key value “1001” of the data (No of step S101), the value of the node t becomes “1001” ((2×12+1) mod 16 = 9) based on one, which is the first bit of the key shift “1000”, and the logical identifier space size 16 (step S103), and the value of the key shift kshift becomes “0000” (step S104). The processing then moves from the node 212 to a new node 209 whose logical identifier is “1001” (step S105). Thereby, the node 209 starts to search for the logical identifier Key value “1001” with the key shift kshift value “0000”.
Because the logical identifier “1001” of the node 209 itself is the same as the logical identifier Key value “1001” of the data (Yes of step S101), the node 209 determines that it is the node managing the data having the corresponding logical identifier. Then, the node 209 returns, to the node 212, a message indicating that the own node has the logical identifier “1001”, as the result of step S105 called as a subroutine in the node 212 (step S106).
In step S106 of the node 212, this result is further returned to the node 206 which has called the subroutine. In step S106 of the node 206, this result is returned to the node 211 which has called the subroutine. In step S106 of the node 211, this result is returned to the node 213 which has called the subroutine. Thus, this result about the target node 209 is finally returned to the node 213 which has performed a search.
By performing the above procedures, as shown in the graph of FIG. 2, while the degree (the number of other nodes that each node links to) remains constant (two), the node managing the Key of particular data can be found among a total of N nodes with log N transfers (hops).
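The search procedure of the first related art can be sketched in Python as follows, assuming a complete de Bruijn graph in which a node exists at every logical identifier. The function names and the `path` list used to record the visited nodes are assumptions for illustration; the comments refer to the steps of FIG. 1 as described above.

```python
def first_bit(value, bits):
    """Most significant bit of a bits-wide value."""
    return (value >> (bits - 1)) & 1

def lookup(m, key, kshift, bits, path):
    """Recursive search starting at node m; returns the ID of the managing node."""
    space = 1 << bits
    path.append(m)
    if m == key:                        # S101: own ID equals Key
        return m                        # result is passed back to the caller (S106)
    b = first_bit(kshift, bits)
    t = (2 * m + b) % space             # S103: next node t
    kshift = (kshift << 1) % space      # S104: shift kshift to the left
    return lookup(t, key, kshift, bits, path)   # S105: continue at node t

path = []
owner = lookup(0b1101, 0b1001, 0b1001, 4, path)
print(owner, [format(p, "04b") for p in path])
# 9 ['1101', '1011', '0110', '1100', '1001']
```

After n shifts every bit of the starting ID has been replaced by a bit of Key, so the recursion ends within n hops for n-bit identifiers.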
(2) The Second Related Art:
In the first related art, the theoretical algorithm using the de Bruijn graph, independent of an overlay network, is explained. However, because the nodes in the Chord hash space actually exist at intervals (i.e., nodes do not exist at all of the logical identifiers 0-15 when the size of the logical identifier space is 16 (two to the fourth power)), the de Bruijn graph, in which nodes exist at all logical identifiers, is not directly applicable. Thus, in Koorde, an ID on the de Bruijn graph is treated as a virtual ID on the Chord hash space, and each node is in charge of the IDs in a certain section. Non-patent document 1 also describes such a configuration, in which Koorde is applied to the overlay network.
FIG. 3 is a block diagram which shows a system configuration of the second related art. FIG. 4 is a flow chart which shows a data search procedure in the second related art. FIG. 5 is an explanatory drawing which shows an example of a message flow between peers in the second related art.
As shown in FIG. 3, the data search system in the second related art includes a plurality of peers 310, 320 and 330, each having an address on a network 300. Note that each of the peers 310, 320 and 330 corresponds to a node in the first related art. The peers 310, 320 and 330 have different logical identifiers. More particularly, each of the peers 310, 320 and 330 includes a peer logical identifier storage 318, a local data storage 311, a routing table 312, message transferring means 313, communication means 314, left bit shift means 315, first bit acquisition means 316 and registration/search executing means 317.
The peer logical identifier storage 318 stores the logical identifier that distinguishes the peer 310 from other peers in the overlay network. The local data storage 311 stores the data which the peer 310 manages among the data shared by the peers in the overlay network.
The routing table 312 stores a logical identifier of another peer and an address on the network 300 which is necessary to access the peer (cf. a table 500 shown in FIG. 5). For example, an Internet Protocol (IP) address may be used as this address.
A d node and an s node are stored in the routing table 312. When the logical identifier of a peer is m, the d node is the node (peer) that is the “predecessor” of the logical identifier 2m. Here, the “predecessor” of the logical identifier 2m is the node (peer) which exists first in the counterclockwise direction as viewed from the logical identifier 2m in the ring shown in FIG. 5. This predecessor is in charge of managing, as its responsibility domain, the data in the hash space up to the logical identifier 2m. Here, when the size of the logical identifier space is N, the logical identifier 2m means 2m mod N in modular arithmetic. In the example shown in FIG. 5, the logical identifier of a peer 516 is 13 (binary notation 1101), and its d node is a peer (node) 515 whose logical identifier is 10 (binary notation 1010).
Also, when the logical identifier of a peer is m, the s node is the node (peer) that is the “successor” of the logical identifier m. Here, the “successor” of the logical identifier m is the node which exists first in the clockwise direction as viewed from the logical identifier m in the ring shown in FIG. 5. In the example shown in FIG. 5, the logical identifier of the peer 516 is 13 (binary notation 1101), and its s node is a peer (node) 517 whose logical identifier is 15 (binary notation 1111). The size N of the logical identifier space is 16 in this example; when a calculated value such as 2m reaches or exceeds N, it is processed as 2m mod N using modular arithmetic.
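The selection of the d node and the s node described above can be sketched in Python as follows. The ring of peer IDs is partly hypothetical: 10, 13 and 15 appear in the FIG. 5 example, while the remaining IDs and all helper names are assumptions for illustration. The predecessor is treated as inclusive (a peer located exactly at 2m mod N counts as the predecessor of 2m), which is an interpretation based on the example in which the d node of the peer 13 is the peer 10.

```python
from bisect import bisect_right

N = 16                              # size of the logical identifier space
RING = [3, 6, 9, 10, 13, 15]        # sorted peer IDs; partly hypothetical

def successor(ident):
    """First peer strictly clockwise from ident."""
    return RING[bisect_right(RING, ident) % len(RING)]

def predecessor(ident):
    """First peer at ident or counterclockwise from it (index -1 wraps around)."""
    return RING[bisect_right(RING, ident) - 1]

def routing_entries(m):
    """Return (d node, s node) for the peer whose logical identifier is m."""
    return predecessor((2 * m) % N), successor(m)

print(routing_entries(13))  # (10, 15)
```

Each peer holds only these two entries regardless of how many peers exist, which corresponds to the constant degree of the de Bruijn graph.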
The left bit shift means 315 performs left bit shift processing based on the Koorde algorithm (processing to shift the bit string of m by 1 bit to the left, and to give the first bit of the key to the least significant bit). The first bit acquisition means 316 acquires the first bit of the key shift, and calculates a virtual node i based on the acquired first bit and the logical identifier space.
The registration/search executing means 317 registers and searches for data. The message transferring means 313 transfers a registration request for data and a search message (search request). The communication means 314 transmits the search message to, and receives it from, other peers via the network 300.
The data search system with such a configuration in the related art operates as follows.
When a registration/search request for data is provided from the outside (e.g., an external user interface), the registration/search executing means 317 of a certain peer m (e.g., the peer 516 with the logical identifier 1101 shown in FIG. 5) calculates, using a hash function, the logical identifier where the data is stored. Here, a logical identifier “key” and a key shift “kshift” are both set to this calculated logical identifier, and a virtual node i is set to the value obtained by adding 1 to the logical identifier of the peer. For example, assume that the hash value is 0111 (binary notation) when searching for the registration destination of data called “Foo”. The registration/search executing means 317 calls the message transferring means 313 using these values as arguments.
In the message transferring means 313, each processing of the flow chart shown in FIG. 4 is performed. In the example above, first, the value “0111” of the logical identifier key, the value “0111” of the key shift kshift and the value “1110” of the virtual node i are provided to the message transferring means 313.
The message transferring means 313 determines whether the logical identifier key is greater than m and less than or equal to the logical identifier of the successor (step S401). When the logical identifier key is within this range (Yes of step S401), it is determined that the successor is in charge of the logical identifier key. In this case, the search result corresponding to the search request is returned to the peer being the search origin (step S402). When it is not within this range (No of step S401), the message transferring means 313 determines whether the virtual node i is within the same range (greater than m and less than or equal to the logical identifier of the successor) (step S403).
When the virtual node i is not within this range (No of step S403), the message transferring means 313 calls the successor of the peer m using the same arguments (the logical identifier key, the key shift kshift, the virtual node i) as those used in the processing in the peer m (step S407). Thereby, the successor performs the processing (steps S401-S407) to search for the logical identifier key with the key shift kshift and the virtual node i.
When the virtual node i is within this range (Yes of step S403), the first bit acquisition means 316 of the peer m calculates (2i+b) mod N, using the current virtual node i and the first bit b of the key shift kshift, and sets the virtual node i to the calculation result (step S404). Here, N is the size of the logical identifier space. For example, when i is “1110” and b is zero, the new virtual node i is (2×14+0) mod 16 = 12, that is, “1100”.
Then the left bit shift means 315 shifts the key shift kshift to the left by 1 bit (step S405), and the peer which is the d node of the peer m is called using the logical identifier key, the shifted key shift kshift, and the updated virtual node i as arguments (step S406). Thereby, the peer being the d node searches for the logical identifier key with the key shift kshift and the virtual node i (steps S401-S407).
In the example above, as shown in FIG. 5, first, the message transferring means 313 in the peer 516 whose logical identifier is “1101” is called with the value “0111” of the logical identifier key, the value “0111” of the key shift kshift and the value “1110” of the virtual node i (521 of FIG. 5). In this case, because the virtual node i exists within the range (1101, 1111] (Yes of step S403), the virtual node i is calculated as “1100” (step S404), the key shift kshift is converted into “1110” (step S405), and a message is transferred to the peer 515 being the d node (step S406; 522 of FIG. 5).
In the peer 515, because the virtual node i “1100” exists within the range (1010, 1101] (Yes of step S403), the virtual node i is calculated as “1001” (step S404), the key shift kshift is converted into “1100” (step S405), and a message is transferred to the peer 512 being the d node (step S406; 523 of FIG. 5).
In the peer 512, because neither the logical identifier key value “0111” (No of step S401) nor the virtual node i “1001” (No of step S403) exists within the range of the peer 512 (greater than its own logical identifier and less than or equal to that of its successor), a message is transferred to a peer 513 being the successor of the peer 512 (step S407). In this peer 513, because the logical identifier key value “0111” exists within the range (0110, 1001] (Yes of step S401), it is determined that the successor of the peer 513 manages the logical identifier key value “0111”. The message transferring means 313 of the peer 513 returns the search result to the peer 516 being the search origin (step S402).
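The message flow of this example can be reproduced with the following Python sketch of the procedure of FIG. 4. Several points are assumptions for illustration: the ring of peer IDs is partly hypothetical (in particular, the logical identifier of the peer 512 is not stated in the text, and 3 is chosen only so that it is the predecessor of 2×10 mod 16 = 4), the helper names are invented, and the virtual node is updated as (2i+b) mod N, which is the form that yields the values “1100” and “1001” given in the walkthrough.

```python
from bisect import bisect_right

BITS = 4
N = 1 << BITS                       # size of the logical identifier space
RING = [3, 6, 9, 10, 13, 15]        # sorted peer IDs; 3 is a hypothetical ID for peer 512

def successor(ident):
    """First peer strictly clockwise from ident (the s node when ident is a peer ID)."""
    return RING[bisect_right(RING, ident) % len(RING)]

def predecessor(ident):
    """First peer at ident or counterclockwise from it."""
    return RING[bisect_right(RING, ident) - 1]

def in_range(x, lo, hi):
    """True if x lies in the ring interval (lo, hi], allowing wrap-around."""
    if lo < hi:
        return lo < x <= hi
    return x > lo or x <= hi

def lookup(m, key, kshift, i, path):
    """Search performed at peer m; returns the ID of the peer in charge of key."""
    path.append(m)
    succ = successor(m)
    if in_range(key, m, succ):                   # S401
        return succ                              # S402: the successor is in charge
    if in_range(i, m, succ):                     # S403
        b = (kshift >> (BITS - 1)) & 1           # first bit of kshift
        i = (2 * i + b) % N                      # S404: update the virtual node
        kshift = (kshift << 1) % N               # S405: shift kshift to the left
        d = predecessor((2 * m) % N)             # d node of peer m
        return lookup(d, key, kshift, i, path)   # S406
    return lookup(succ, key, kshift, i, path)    # S407

path = []
start = 0b1101                                   # peer 516
owner = lookup(start, 0b0111, 0b0111, (start + 1) % N, path)
print(owner, path)  # 9 [13, 10, 3, 6]
```

The recorded path corresponds to the peers 516, 515, 512 and 513, and the returned value is the successor of the peer 513, which is in charge of the key “0111”.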
As described above, even in case the number of existing nodes is less than the size of the logical identifier space as shown in FIG. 2, this second related art achieves a constant degree and a logarithmic hop count. Note that the constant degree means that the number of addresses of other peers recognized by each peer is constant regardless of the total number of peers, and the logarithmic hop count means that the number of hops from a peer transferring a message to a destination peer is O(log n) when the total number of the peers is n.
Non-patent document 1: M. Frans Kaashoek and David R. Karger, “Koorde: A simple degree-optimal distributed hash table”