Centrality is widely used to measure the relative importance of nodes within a graph. For example, who are the best-connected people in a social network? Who is critical for transmitting information in a terrorist network? Which proteins in a protein-interaction network are most important for the lethality of a cell? In general, the concept of centrality has played an important role in the understanding of various kinds of networks by researchers from computer science, network science, sociology, and the recently emerging field of ‘computational social science.’
Traditionally, centrality has been studied on networks of relatively small size. In the past few years, however, the proliferation of digital data collection has produced very large graphs, such as the web, online social networks, user preferences, and online communications. Many of these networks reach billions of nodes and edges and require terabytes of storage.
Centrality in very large graphs poses two key challenges.
First, some definitions of centrality have inherently high computational complexity. For example, shortest-path or random-walk betweenness has complexity of at least O(n^3), where n is the number of nodes in the graph. Furthermore, some of the faster estimation algorithms rely on operations that are not amenable to parallelization, such as all-sources breadth-first search. Finally, it may not be straightforward, or even possible, to develop accurate approximation schemes. In summary, centrality measures should ideally be designed with scalability in mind from the outset. Traditionally this has not always been the case, but the recent availability of very large networks creates a clear need for scalable measures.
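To make the cost concrete, the following is a minimal Python sketch of shortest-path betweenness on an unweighted graph, written in the style of Brandes' accumulation algorithm. It is an illustration only and is not taken from this text; the function name `betweenness` and the adjacency-dictionary input format are assumptions chosen for the example. The outer loop runs one full breadth-first search per node, which is the all-sources pattern described above.

```python
from collections import deque


def betweenness(adj):
    """Shortest-path betweenness for an unweighted graph.

    `adj` maps each node to a list of its neighbors (hypothetical input
    format). Scores count ordered (source, target) pairs, so for an
    undirected graph each unordered pair contributes twice.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:  # one BFS per source node
        dist = {s: 0}                   # BFS distance from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        order = []                      # nodes in order of discovery
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:       # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating pair dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Usage: a three-node path a - b - c, where only b lies between other nodes.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = betweenness(path)
```

Every iteration of the outer loop needs its own distance, path-count, and predecessor tables for the whole graph, which hints at why this pattern is awkward to distribute when the graph itself does not fit on one machine.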
Second, even if a centrality measure is designed to avoid expensive or non-parallelizable operations, developing algorithms for it that are efficient, scalable, and accurate is still necessary and far from straightforward.
Clever solutions are required to address these challenges.