1. Field of the Invention
The present invention relates to a method for updating the betweenness centrality of a graph and, more particularly, to a method for updating the betweenness centrality of an unweighted graph, which comprises vertices and edges with no weights, when an edge of the graph is updated.
2. Discussion of Related Art
In general, betweenness centrality is a measure of the relative importance of a vertex in a graph, and it is widely used in network analyses such as social network analysis, biological graph analysis, and road network analysis. For example, in social network analysis, a vertex with higher centrality can be viewed as more important than a vertex with lower centrality. The betweenness centrality of a vertex measures the extent to which the vertex participates in the shortest paths of the graph.
There are many previous works on the betweenness centrality problem. The concept of betweenness centrality was first proposed in literature 1, but the definition proposed in literature 10 is more widely used. More recently, many variants of the definition have been proposed in literature 6. Literature 5 improves the computation time of the betweenness centrality based on a modified breadth-first search algorithm and the dependency of a vertex, and it is the fastest known algorithm that computes the exact betweenness centralities of all the vertices in a graph. As the computation of shortest paths between all pairs of vertices is time-consuming, literature 22 proposes another definition of betweenness centrality, which is based on a random walk. In literature 22, each vertex has a probability of visiting its neighbor vertices. Also, literature 7, literature 2, and literature 12 propose approximation algorithms for computing the betweenness centrality. Literature 23 and literature 25 adopt the betweenness centrality for detecting communities in a social network.
Although many works on calculating the betweenness centrality exist and the betweenness centrality is one of the major measures used in analyzing social network graphs, none of these works addresses the problem of updating the betweenness centrality.
Applying the previous algorithms to find influential users or detect communities over frequently updated graphs, such as a social network graph, is inefficient. This is because calculating the betweenness centralities of all users in the graph involves computing the shortest paths between all pairs of users in the graph. In all previous works, recomputation for all the vertices is inevitable whenever a new edge is inserted into the graph. This recomputation is clearly time-consuming. As the number of edges in a social network graph increases over time [literature 19], the need for updating the betweenness centrality is evident.
It is difficult to update the betweenness centrality, because even a single edge insertion or a single edge deletion can change many shortest paths in the graph. These changes in turn cause updates to the betweenness centralities of many vertices in the graph. It is easy to see that when an edge (vi, vj) is inserted into a graph, the shortest path between vi and vj changes. In addition, every shortest path that included the original shortest path from vi to vj also changes.
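The effect of a single edge insertion can be illustrated with a minimal sketch, which is not taken from any of the cited works. It assumes an undirected, unweighted graph stored as a dictionary that maps each vertex to a set of neighbors; the helper names `bfs_counts` and `pairs_affected_by_insertion` are hypothetical. A vertex pair is treated as affected when either its shortest-path length or its number of shortest paths changes.

```python
from collections import deque


def bfs_counts(adj, s):
    """Return (dist, sigma) for source s: shortest-path lengths and
    the number of distinct shortest paths to every reachable vertex."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:            # w is discovered for the first time
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:   # v precedes w on a shortest path
                sigma[w] += sigma[v]
    return dist, sigma


def pairs_affected_by_insertion(adj, u, v):
    """List the unordered vertex pairs whose shortest-path length or
    shortest-path count changes when the edge (u, v) is inserted."""
    before = {s: bfs_counts(adj, s) for s in adj}
    new_adj = {x: set(ns) for x, ns in adj.items()}   # copy, then insert
    new_adj[u].add(v)
    new_adj[v].add(u)
    after = {s: bfs_counts(new_adj, s) for s in new_adj}

    vertices = list(adj)
    affected = []
    for i, s in enumerate(vertices):
        for t in vertices[i + 1:]:
            old = (before[s][0].get(t), before[s][1].get(t))
            new = (after[s][0].get(t), after[s][1].get(t))
            if old != new:
                affected.append((s, t))
    return affected


# Illustrative input (an assumption): a path 0-1-2-3-4-5.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < 6} for i in range(6)}
affected = pairs_affected_by_insertion(path, 0, 5)
```

On this six-vertex path, inserting the single edge (0, 5) affects 6 of the 15 vertex pairs: three pairs become closer, and three keep their distance but gain a second shortest path.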
Prior art research on the computation of betweenness centrality will be described in more detail below.
The computation of betweenness centrality has been gaining importance in social network analyses and is widely used in many applications. The earliest measures quantifying the idea of betweenness centrality were introduced by Anthonisse et al. [literature 1] and Freeman [literature 10]. Freeman's original method of finding betweenness centrality is based on counting geodesic paths for all pairs of vertices in a graph.
Following Freeman's work, variations of centrality measures are proposed. Everette et al. [literature 17] propose a group betweenness measure which can be applied to groups and classes as well as individuals. Freeman et al. [literature 11] extend Freeman's work [literature 10] to introduce a new measure of centrality based on the concept of network flows, which considers both shortest and certain non-shortest paths. Newman [literature 22] proposes a measure of betweenness centrality based on random walks of any length instead of shortest paths.
Currently, the fastest known algorithm for computing the exact betweenness centralities of all the vertices [literature 5] requires O(|V||E|) time on unweighted graphs and O(|V||E| + |V|^2 log |V|) time on weighted graphs. Traditionally, betweenness centrality was determined by first computing the lengths and numbers of shortest paths between all pairs of vertices, and then summing the pair-dependencies of all pairs [literature 10]. The pair-dependency of a pair s, t∈V on an intermediary vertex v∈V is defined as the ratio of the number of shortest paths between s and t that pass through v to the number of all shortest paths between s and t. Brandes [literature 5] points out the weakness of this approach, arguing that it computes more information than needed, and presents a faster algorithm based on aggregating path counts from different source vertices in the network.
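The aggregation of path counts from each source vertex can be sketched as follows for the unweighted case. This is a minimal illustration of the dependency-accumulation idea attributed to literature 5, not a reproduction of that work. It assumes an undirected, unweighted graph given as a dictionary of neighbor lists, and the function name `brandes_betweenness` is hypothetical.

```python
from collections import deque


def brandes_betweenness(adj):
    """Betweenness centrality of every vertex of an undirected,
    unweighted graph; adj maps each vertex to its neighbors."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Phase 1: breadth-first search from s, counting shortest paths.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}      # number of shortest s-v paths
        sigma[s] = 1
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        order = []                       # vertices by non-decreasing distance
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Phase 2: accumulate dependencies, farthest vertices first.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]

    # In an undirected graph every pair (s, t) is visited from both ends.
    return {v: c / 2.0 for v, c in bc.items()}


# Illustrative input (an assumption): the path a-b-c.
scores = brandes_betweenness({"a": ["b"], "b": ["a", "c"], "c": ["b"]})
```

Each source vertex costs one breadth-first search plus one backward pass over the recorded predecessors, which is where the O(|V||E|) bound for unweighted graphs comes from. In the three-vertex path, only the pair (a, c) has an intermediary, so `scores["b"]` is 1.0 and the other two scores are 0.0.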
Although Brandes' algorithm is a substantial improvement over the initial betweenness centrality computation algorithm, many researchers have argued that it is still too costly for large graphs. To overcome this limitation, researchers have proposed approximation algorithms that estimate the betweenness centrality, claiming that a good approximation is an acceptable alternative to the exact betweenness centrality value as long as it can be computed quickly.
Brandes et al. [literature 7] propose a heuristic estimation method for betweenness centrality computation and conduct experiments with various strategies for selecting the source vertices to assess the quality of the estimation. Bader et al. [literature 3] present a parallel algorithm for computing betweenness centrality, optimized for scale-free sparse graphs. Bader et al. [literature 2] also suggest an algorithm that computes the betweenness centrality of a single vertex faster than computing the betweenness centralities of all vertices.
Geisberger et al. [literature 12] suggest a bisection scaling algorithm for approximating a variant of betweenness centrality. Makarychev [literature 21] suggests a linear time approximation algorithm to find the ordering of the vertices that maximizes the number of satisfied betweenness constraints.
Betweenness centrality is used in diverse applications across many different disciplines. It provides an understanding of the extent to which a vertex contributes to the flow of information. It is mainly used for finding the most prominent vertices in complex networks, whether they are individuals in social networks, elements in biological networks, intersections or junctions in transportation networks, physical elements in computer networks, or documents on the World Wide Web.
For example, Leydesdorff [literature 20] demonstrates that betweenness centrality is an indicator of the interdisciplinarity of scientific journals, and del Sol et al. [literature 8] use betweenness centrality to identify the most central residues in protein-protein complex structures. Jin et al. [literature 15] demonstrate an application of parallel betweenness centrality to detect potentially harmful nodes in an electrical grid. The electrical grid is an interconnected network for delivering electricity from suppliers to consumers.
Holme [literature 13] studies the relationship between betweenness centrality and the density of a traffic model, and Lammer et al. [literature 18] use betweenness centrality to approximate the importance of a road or a junction and investigate the scaling laws associated with urban road networks in Germany. In many applications, the network structures are typically not static. As a network evolves, its graph constantly changes over time, which implies that there is a strong need for an efficient algorithm to update betweenness centrality.
Betweenness centrality is also used in community detection. Newman et al. [literature 23] propose a divisive community detection technique which iteratively removes edges with the highest betweenness centrality value from the network. Pinney et al. [literature 25] suggest an alternative community detection algorithm in which the network decomposition is based on vertex betweenness instead of edge betweenness. Newman et al. [literature 23] discuss a weakness in the existing algorithms which is a high computation cost associated with iterative recalculation of all-pair shortest paths when the edges are removed.
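The divisive technique and its recomputation cost can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of literature 23 as published. It assumes an undirected, unweighted graph stored as a dictionary of neighbor sets, and the names `edge_betweenness`, `components`, and `split_once` are hypothetical. The edge betweenness is recomputed from scratch after every removal, which is exactly the cost discussed above.

```python
from collections import deque


def edge_betweenness(adj):
    """Edge betweenness of an undirected, unweighted graph, keyed by
    frozenset({u, v}); computed with one BFS per source vertex."""
    eb = {}
    for s in adj:
        dist, sigma, preds, order = {s: 0}, {s: 1}, {s: []}, []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], sigma[w], preds[w] = dist[v] + 1, 0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):        # farthest vertices first
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                edge = frozenset((v, w))
                eb[edge] = eb.get(edge, 0.0) + share
                delta[v] += share
    return {e: c / 2.0 for e, c in eb.items()}   # each pair seen twice


def components(adj):
    """Connected components as a list of vertex sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps


def split_once(adj):
    """Remove highest-betweenness edges until the number of connected
    components increases; return (components, removed_edges)."""
    g = {v: set(ns) for v, ns in adj.items()}    # work on a copy
    start = len(components(g))
    removed = []
    while len(components(g)) == start:
        eb = edge_betweenness(g)                 # full recomputation
        if not eb:                               # no edges left
            break
        edge = max(eb, key=eb.get)
        u, v = edge
        g[u].discard(v)
        g[v].discard(u)
        removed.append(edge)
    return components(g), removed


# Illustrative input (an assumption): two triangles joined by edge (2, 3).
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
         3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
communities, removed = split_once(graph)
```

In this example, the bridge (2, 3) carries all nine shortest paths between the two triangles, so it is the first edge removed and the graph separates into the two triangles.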
As observed in many applications, the dynamic nature of many real-life networks is clear evidence that efficiently updating betweenness centrality is an important issue. However, no literature at present deals with the problem of efficiently updating betweenness centrality in a dynamic network environment.