In recent years, the number of applications utilizing network-based data has been increasing in the fields of social networking services (SNSs), customer relationship management, network management, bioengineering, transportation, and so on. The network-based data is data indicating elements and relationships between the elements. Examples of the network-based data include data indicating human relationships, relationships between molecules, and the so-called networks, such as the Internet, a communication network, a traffic network, and a transportation network.
The network-based data may also be represented by a graph including vertices corresponding to respective elements and edges connecting the vertices, the edges corresponding to relationships between related ones of the elements. For example, in a graph of network-based data indicating a human relationship, each human may be represented by a vertex as one element, and a relationship between the humans may be represented as an edge connecting the vertices.
When a set of vertices included in a graph G is represented by V, and a set of edges is represented by E, the graph G is represented as G=(V, E). Also, in a graph, two vertices connected through one edge are said to be adjacent to each other. When, in a sequence of vertices v_0, v_1, ..., v_n, the vertex v_{i-1} and the vertex v_i are adjacent to each other for every i (1 ≤ i ≤ n), the sequence of vertices is referred to as a “path”, and the length thereof is n. In other words, the length of a path is the number of edges included in the path. The vertex v_0 is referred to as a “start point” of the path, and the vertex v_n is referred to as an “end point” of the path. Of the paths between two vertices, the path having the shortest length is referred to as a “shortest path”, and the length of the shortest path between two vertices is referred to as a “distance”.
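The definitions above can be illustrated with a minimal Python sketch. The vertex names, the example edge set, and the helper names `build_adjacency`, `is_path`, and `path_length` are assumptions introduced only for illustration; they do not appear in the related art.

```python
# Hypothetical example graph G = (V, E); the vertex names are illustrative only.
V = {"a", "b", "c", "d"}
E = {("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")}

def build_adjacency(vertices, edges):
    """Return a dict mapping each vertex to the set of its adjacent vertices."""
    adj = {v: set() for v in vertices}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)  # undirected graph: each edge is usable in both directions
    return adj

def is_path(adj, seq):
    """True if every consecutive pair v_{i-1}, v_i in seq is adjacent."""
    return all(seq[i] in adj[seq[i - 1]] for i in range(1, len(seq)))

def path_length(seq):
    """Length of a path: the number of edges, i.e. number of vertices minus one."""
    return len(seq) - 1

adj = build_adjacency(V, E)
long_path = ["a", "b", "c", "d"]   # a path of length 3 from "a" to "d"
short_path = ["a", "c", "d"]       # a path of length 2 from "a" to "d"
```

In this example graph, no edge connects "a" and "d" directly, so `short_path` is a shortest path and the distance between "a" and "d" is 2.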
Graphs like those described above are classified into undirected graphs, in which the edges are not directed, and directed graphs, in which the edges are directed. For example, when all roads can be traveled in both directions, the road network may be represented as an undirected graph. However, a road network including a one-way road can be represented only by a directed graph. The definitions described above are predicated on an undirected graph.
Graphs of network-based data as described above may be grouped into a weighted graph in which each edge is weighted and an unweighted graph in which each edge is not weighted. For example, a railway network may be represented as a weighted graph by associating stations with respective vertices, connecting the vertices corresponding to the adjacent stations by using edges, and giving each edge a weight corresponding to the distance between the stations. Also, for network-based data representing a human relationship, when attention is paid to only the presence/absence of a relationship and the intimacy of the relationship is not considered, the human relationship may be represented by an unweighted graph. Not only a positive value but also a negative value may be given to the weight.
Meanwhile, there are increasing demands for data analysis involving, for example, extracting information important for business, management, research, and so on from the network-based data. There are also demands for data analysis for graphs representing network-based data. For example, determining the shortest path between two vertices in a graph is important for data analysis for the graph.
Three examples in which an unweighted undirected graph is effective will be described below. For example, for network-based data representing a human relationship, a graph is conceivable in which people who are friends with each other in an SNS, people who exchange email in-house, or the like are connected to each other by edges. Now, consider a case in which one person who is represented as a vertex in the graph wishes to contact another person who has no direct relationship with that person. In this case, when the shortest path in the graph between the vertices representing the two people is known, it is possible to contact the intended person with the least time and effort through the acquaintances corresponding to the vertices on the path.
Also, in a graph representing a computer network, when the shortest path between vertices is known, communication may be performed between apparatuses corresponding to the vertices through the path with which the number of communications is the smallest.
Also, by way of example, consider a graph in which vertices correspond to respective states of a Rubik's Cube (registered trademark) and the states between which a transition may be made by a single turn are connected by an edge. In this case, the shortest path from the vertex corresponding to one state to the vertex corresponding to the final state (the state in which each of the faces has a single color) represents an optimal solution (a minimum number of moves).
As technology for determining the shortest path between vertices in a graph as described above, there have been proposed a method for representing shortest paths between vertices in a graph as a shortest-path tree, a system for determining a shortest path in a weighted graph, and a system for determining shortest paths in an unweighted graph.
First, a description will be given of a method for representing shortest paths as a shortest-path tree. It has been known that the shortest paths from one vertex v to all vertices other than the vertex v may be represented as a shortest-path tree.
Herein, a graph including some of the vertices and edges included in another graph is referred to as a “subgraph”. The set of vertices in a subgraph is a subset of the set of vertices in the original graph, and the set of edges in a subgraph is also a subset of the set of edges in the original graph. When there is a path between any two vertices in a subgraph, the subgraph is said to be connected. Also, a path whose start point and end point match each other is referred to as a “cycle”.
Under the above definitions, a tree is a connected subgraph that does not include a cycle. In general, the number of edges included in a tree is n−1, where n is the number of vertices included in the tree. A tree may also be depicted in a form like a tree turned upside down, in such a manner that one vertex is located at the top, the group of vertices connected thereto is located therebelow, and the group of vertices connected to those vertices is located further therebelow. In this case, the vertex at the top is called a root. A vertex that does not have any vertex therebelow is called a leaf. A vertex w connected directly below a vertex v is called a child of the vertex v, and the vertex v is referred to as a parent of the vertex w. In addition, the largest distance from a leaf to the root of a tree is called the depth of the tree. In a tree, the path from a leaf to the root is uniquely determined.
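As a hedged illustration of these tree properties, the following Python sketch checks whether an undirected graph is a tree (connected, with n−1 edges) and computes the depth of a rooted tree. The function names `is_tree` and `tree_depth`, the child-to-parent dictionary representation, and the example data are assumptions made for this sketch.

```python
from collections import deque

def is_tree(vertices, edges):
    """True if the undirected graph is connected and has exactly n - 1 edges."""
    vertices = list(vertices)
    if not vertices:
        return False  # assumption: the empty graph is not treated as a tree here
    if len(edges) != len(vertices) - 1:
        return False
    adj = {v: [] for v in vertices}
    for u, w in edges:
        adj[u].append(w)
        adj[w].append(u)
    # Breadth-first traversal from an arbitrary vertex to test connectivity.
    seen = {vertices[0]}
    queue = deque([vertices[0]])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(vertices)

def tree_depth(parent):
    """Depth of a rooted tree stored as a child -> parent dict (root maps to None)."""
    best = 0
    for v in parent:
        d = 0
        while parent[v] is not None:  # climb toward the root, counting edges
            v = parent[v]
            d += 1
        best = max(best, d)
    return best

# Hypothetical example: root "r" with children "a" and "b"; "c" is a child of "a".
example_vertices = ["r", "a", "b", "c"]
example_edges = [("r", "a"), ("r", "b"), ("a", "c")]
example_parent = {"r": None, "a": "r", "b": "r", "c": "a"}
```

A connected graph with exactly n−1 edges cannot contain a cycle, which is why the sketch does not need a separate cycle test.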
A shortest-path tree representing the shortest paths from the vertex v to all vertices is referred to as a “shortest-path tree rooted at the vertex v” and is denoted by T(v). In other words, in the shortest-path tree T(v), with respect to an arbitrary vertex w included in T(v), the path from the vertex v to the vertex w in T(v) is a shortest path. It is also known that the path from the vertex v to the vertex w in T(v) is unique, because of the properties of a tree.
In the shortest-path tree, the shortest path from the vertex v to the vertex w may be determined by starting at the vertex w, repeatedly moving to the parent of the current vertex until the vertex v is reached, recording each traversed vertex in a list, and then reading the list in the opposite direction. The shortest-path tree has no redundant vertices and is thus very effective as a method for representing shortest paths.
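The parent-following procedure described above may be sketched as follows. The child-to-parent dictionary, the function name `path_from_tree`, and the example tree are assumptions made for illustration only.

```python
def path_from_tree(parent, v, w):
    """Return the path from the root v to the vertex w in a shortest-path tree.

    parent is a child -> parent dict in which the root maps to None
    (a representation assumed for this sketch).
    """
    path = [w]
    while path[-1] != v:
        p = parent.get(path[-1])
        if p is None:
            raise ValueError("w is not reachable from v in this tree")
        path.append(p)   # record each traversed vertex from w toward v
    path.reverse()       # view the recorded list in the opposite direction
    return path

# Hypothetical shortest-path tree T("v"): v -> a -> b -> w, and v -> c.
example_tree = {"v": None, "a": "v", "b": "a", "w": "b", "c": "v"}
```

Each vertex stores only a single parent reference, so the whole tree needs one entry per vertex regardless of how many shortest paths it encodes.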
One known example of a system for determining a shortest path in a weighted graph is Dijkstra's algorithm for determining shortest paths from one vertex to all vertices, and the maximum amount of calculation (order) thereof is O(|E| + |V| log |V|). Here, |A| represents the number of elements included in a set A. That is, |E| is the number of edges, and |V| is the number of vertices. When Dijkstra's algorithm is executed for each vertex in order to calculate the shortest paths from all vertices to all vertices, the maximum amount of calculation is O(|V|(|E| + |V| log |V|)).
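A minimal sketch of Dijkstra's algorithm is given below, assuming non-negative edge weights and an adjacency-list input. It uses Python's binary heap (`heapq`), so this particular sketch runs in O((|E| + |V|) log |V|); the O(|E| + |V| log |V|) bound relies on a Fibonacci heap, which is not used here. The function name `dijkstra` and the example network are hypothetical.

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest paths for non-negative weights.

    adj maps each vertex to a list of (neighbor, weight) pairs.
    Returns (dist, parent); parent encodes a shortest-path tree rooted at source.
    """
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry: a shorter route to u was already found
        for w, weight in adj[u]:
            nd = d + weight
            if w not in dist or nd < dist[w]:
                dist[w] = nd
                parent[w] = u
                heapq.heappush(heap, (nd, w))
    return dist, parent

# Hypothetical railway-like network; weights stand for distances between stations.
rail = {
    "A": [("B", 4), ("C", 1)],
    "B": [("A", 4), ("C", 2), ("D", 5)],
    "C": [("A", 1), ("B", 2), ("D", 8)],
    "D": [("B", 5), ("C", 8)],
}
rail_dist, rail_parent = dijkstra(rail, "A")
```

Here the shortest route from "A" to "B" goes through "C" (total weight 3) rather than over the direct edge of weight 4.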
Another known system for determining a shortest path in a weighted graph is the Floyd-Warshall algorithm for determining shortest paths from each of all vertices to all the vertices, and the maximum amount of calculation thereof is O(|V|^3).
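The triple loop that gives the Floyd-Warshall algorithm its cubic cost can be sketched as follows. The sketch assumes vertices numbered 0 to n−1, directed weighted edges given as (u, w, weight) triples, and no negative cycles; the function name and the example data are hypothetical.

```python
import math

def floyd_warshall(n, weighted_edges):
    """All-pairs shortest distances; returns an n x n matrix of distances."""
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, w, weight in weighted_edges:
        dist[u][w] = min(dist[u][w], weight)  # keep the lightest parallel edge
    # After iteration k, dist[i][j] is the shortest distance from i to j that
    # uses only intermediate vertices drawn from 0..k.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Hypothetical directed example with four vertices.
fw_edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5)]
fw_dist = floyd_warshall(4, fw_edges)
```

Unreachable pairs keep the value `math.inf`; in this example, vertex 0 cannot be reached from vertex 3.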
Another known system for determining shortest paths in a weighted graph is the Bellman-Ford algorithm for determining shortest paths from one vertex to all vertices. When a graph is sparse (when the number of edges is relatively small compared with the number of vertices), the Bellman-Ford algorithm is said to be slower in calculating the shortest paths than Dijkstra's algorithm, but has an advantage in that it is possible to determine the paths even when a weight is negative. The maximum amount of calculation of the Bellman-Ford algorithm is O(|E| |V|). When the Bellman-Ford algorithm is executed for each vertex in order to calculate the shortest paths from all vertices to all vertices, the maximum amount of calculation is O(|E| |V|^2).
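A minimal Bellman-Ford sketch follows, showing how repeated relaxation of every edge tolerates negative weights. It assumes vertices numbered 0 to n−1 and directed edges given as (u, w, weight) triples, and it raises an error when a negative cycle is reachable from the source, since shortest paths are then undefined. All names and example data are hypothetical.

```python
import math

def bellman_ford(n, weighted_edges, source):
    """Single-source shortest distances that tolerate negative edge weights."""
    dist = [math.inf] * n
    dist[source] = 0
    # A shortest path has at most n - 1 edges, so n - 1 passes suffice.
    for _ in range(n - 1):
        changed = False
        for u, w, weight in weighted_edges:
            if dist[u] + weight < dist[w]:
                dist[w] = dist[u] + weight
                changed = True
        if not changed:
            break  # early exit: no distance improved during this pass
    # One extra pass: any further improvement implies a negative cycle.
    for u, w, weight in weighted_edges:
        if dist[u] + weight < dist[w]:
            raise ValueError("negative cycle reachable from source")
    return dist

# Hypothetical directed example containing one negative-weight edge.
bf_edges = [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)]
bf_dist = bellman_ford(4, bf_edges, 0)
```

In this example the route 0, 2, 1 (total weight 2) beats the direct edge from 0 to 1 (weight 4) because of the negative edge.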
As a system for determining a shortest path in an unweighted graph, there is a breadth-first search system for determining a shortest path by traversing adjacent vertex lists with breadth-first search. The adjacent vertex list is a list of vertices adjacent to one vertex. The number of edges that connect to a vertex v is referred to as the degree of the vertex v. That is, the number of vertices included in the adjacent vertex list of a vertex is equal to the degree of that vertex. The maximum amount of calculation in the breadth-first search is O(|E| + |V|). When the breadth-first search system is executed for each vertex in order to calculate the shortest paths from all vertices to all vertices, the maximum amount of calculation is O(|E| |V| + |V|^2). A shortest-path tree may be naturally formed by applying the breadth-first search to an unweighted graph.
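The breadth-first search described above, including the shortest-path tree it forms as a by-product, may be sketched as follows. The adjacent vertex lists are modeled as a Python dictionary; the function names `bfs_tree` and `all_pairs_bfs` and the example graph are assumptions made for illustration.

```python
from collections import deque

def bfs_tree(adj, source):
    """Breadth-first search over adjacent vertex lists in an unweighted graph.

    Returns (dist, parent); parent is a shortest-path tree rooted at source.
    """
    dist = {source: 0}
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:           # traverse the adjacent vertex list of u
            if w not in dist:      # first visit to w yields its shortest distance
                dist[w] = dist[u] + 1
                parent[w] = u
                queue.append(w)
    return dist, parent

def all_pairs_bfs(adj):
    """Run one search per vertex, giving the O(|E| |V| + |V|^2) all-pairs case."""
    return {v: bfs_tree(adj, v)[0] for v in adj}

# Hypothetical unweighted undirected graph.
sns = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
sns_dist, sns_parent = bfs_tree(sns, "a")
sns_all = all_pairs_bfs(sns)
```

Vertex "d" can be reached through either "b" or "c" at the same distance; the tree records whichever neighbor is dequeued first, which is "b" for the list order used here.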
The above-described systems may be applied to both a directed graph and an undirected graph. There are two proposed systems that are applicable to calculation of shortest paths in an unweighted undirected graph, that is, a system for solving the shortest path problem of all vertices to all vertices without using a matrix and a system for solving the shortest path problem by using a matrix. The maximum amount of calculation in the system for solving the shortest path problem without using a matrix is as follows.
a) O(|E| |V| / log |V|) for |E| > |V| (log |V|)^2
b) O(|E| |V| log log |V| / log |V|) for |E| > |V| log log |V|
c) O(|V|^2 (log log |V|)^2 / log |V|) for |E| ≤ |V| log log |V|
It is also reported that the maximum amount of calculation in the system for solving the shortest path problem by using a matrix is O(|V|^2.376).
Examples of related art include the following Non-Patent Documents:
“Spanning Tree - Wikipedia”, Internet URL: http://ja.wikipedia.org/wiki/%E5%85%A8%E5%9F%9F%E6%9C%A8, searched online on Jul. 1, 2014;
“Dijkstra's Algorithm - Wikipedia”, Internet URL: http://ja.wikipedia.org/wiki/%E3%83%80%E3%82%A4%E3%82%AF%E3%82%B9%E3%83%88%E3%83%A9%E6%B3%95, searched online on Jul. 1, 2014;
“Breadth-First Search - Wikipedia”, Internet URL: http://ja.wikipedia.org/wiki/%E5%B9%85%E5%84%AA%E5%85%88%E6%8E%A2%E7%B4%A2, searched online on Jul. 1, 2014;
T. M. Chan, “All-Pairs Shortest Paths for Unweighted Undirected Graphs in o(mn) Time”, SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm, 2006;
R. Seidel, “On the all-pairs-shortest-path problem in unweighted undirected graphs”, J. Comput. Sys. Sci., 51:400-403, 1995;
Z. Galil and O. Margalit, “All pairs shortest distances for graphs with small integer length edges”, Inf. Comput., 134:103-139, 1997; and
Z. Galil and O. Margalit, “All pairs shortest paths for graphs with small integer length edges”, J. Comput. Sys. Sci., 54:243-254, 1997.