Graph analysis is an important type of data analytics in which the underlying data set is modeled as a graph. Because such a graph representation captures relationships between data entities, applying graph analysis procedures can give the user valuable insight into the original data set. Examples of popular graph analysis procedures are Community Detection, PageRank, Finding Shortest Paths, and Link Prediction.
Many graph data sets are too large to fit in a single machine's address space. Instead, a graph instance is distributed among the computing nodes (or “cluster devices”) of a cluster. In this scenario, graph analysis is performed by exploiting one or more CPUs of each cluster device while the cluster devices communicate with each other through a high-bandwidth network.
A graph algorithm may be expressed as multiple iterations of computation kernels. A kernel may look like the following pseudo-code:
foreach(n: G.nodes)      // for every vertex n in graph G
  foreach(t: n.nbrs)     // for every neighbor vertex t of n
    n.foo += t.bar       // sum up t.bar into n.foo
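For reference, the kernel can be sketched in Python for a graph that fits in a single address space. The dictionary-based layout and the name `sum_neighbor_bar` are assumptions made for illustration only, as is the choice to start every `foo` value at zero; the pseudo-code itself does not specify a data structure or an initial value.

```python
def sum_neighbor_bar(nbrs, bar):
    """Single-machine sketch of the kernel.

    nbrs: maps each vertex to a list of its neighbor vertices (hypothetical layout)
    bar:  maps each vertex to its 'bar' value
    Returns a dict holding the accumulated 'foo' value per vertex.
    """
    foo = {n: 0 for n in nbrs}      # assumption: foo starts at zero
    for n in nbrs:                  # for every vertex n in graph G
        for t in nbrs[n]:           # for every neighbor vertex t of n
            foo[n] += bar[t]        # sum up t.bar into n.foo
    return foo


# Small illustrative graph: a -> b, a -> c, b -> c
nbrs = {"a": ["b", "c"], "b": ["c"], "c": []}
bar = {"a": 1, "b": 2, "c": 3}
foo = sum_neighbor_bar(nbrs, bar)
```

On one machine every `bar[t]` lookup is a local memory read. In a distributed setting the same lookup may refer to a vertex held by another cluster device.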
A straightforward implementation of the above pseudo-code in a distributed environment can be challenging because the pattern requires one cluster device to pull, or read, data from other cluster devices. In one approach, each time a first cluster device requires a neighbor of a graph node from a second cluster device, the first cluster device sends a request message to the second cluster device. As a result, many messages are generated and passed between cluster devices. Furthermore, multiple graph nodes assigned to the first cluster device may share a common neighbor, in which case the first cluster device may generate and send a separate request message for each graph node that has the common neighbor. For example, if ten graph nodes that are assigned to the first cluster device are connected to a particular graph node assigned to the second cluster device, then the first cluster device sends ten request messages to the second cluster device for the same data item.
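The ten-message scenario can be made concrete with a small counting sketch. It models only the per-edge request approach described above and performs no actual network communication. The `owner` mapping, the edge-list layout, and the name `count_naive_requests` are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def count_naive_requests(edges, owner, local_device):
    """Count request messages sent by local_device under the per-edge approach.

    edges: (n, t) pairs meaning vertex n reads data of neighbor t
    owner: maps each vertex to the cluster device it is assigned to (hypothetical)
    Returns a Counter keyed by (remote_device, remote_vertex).
    """
    requests = Counter()
    for n, t in edges:
        # A message is needed only when n is local and its neighbor t is remote.
        if owner[n] == local_device and owner[t] != local_device:
            requests[(owner[t], t)] += 1
    return requests


# Ten graph nodes on device 0 all share the neighbor "x" held by device 1.
owner = {f"n{i}": 0 for i in range(10)}
owner["x"] = 1
edges = [(f"n{i}", "x") for i in range(10)]

requests = count_naive_requests(edges, owner, local_device=0)
total_messages = sum(requests.values())   # messages actually sent
distinct_items = len(requests)            # distinct remote data items requested
```

Comparing `total_messages` with `distinct_items` shows how many of the requests ask for a data item that has already been requested.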
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.