Graph analysis is a type of data analysis where the data set is modeled as a graph. Graph analysis is used to identify arbitrary relationships between data entities. By applying certain graph analysis algorithms to a graph, a user may be able to discover non-obvious insights about the data set, because the analysis may consider even indirect relationships between data entities.
Many different data sets can be represented as graphs. For example, friendship relationships in a social network naturally form a graph. Real-world graphs, such as social network graphs, exhibit different characteristics than classic graphs, such as trees, meshes, and hyper-cubes. As an example of such a characteristic, real-world graphs show a power-law degree distribution. This means that most vertices in the graph have only a small number of edges, while a few vertices have an extremely large number of edges. For example, according to the degree distribution of Twitter's follower graph, about 96% of all vertices have fewer than 100 edges, while about 0.01% of all vertices are connected to 25% of all edges in the graph, with roughly one hundred vertices having more than 10^6 (one million) edges. These types of vertices are referred to as super high-degree vertices.
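The kind of skew described above can be measured directly from an edge list. The following is a minimal sketch, not taken from any particular graph system: the function name degree_stats, its parameters, and the toy graph are all hypothetical, and the "top share" is computed over edge endpoints (each undirected edge contributes one endpoint to each of its two vertices).

```python
from collections import Counter


def degree_stats(edges, low_degree_cutoff, top_fraction):
    """Summarize degree skew for an undirected edge list of (u, v) pairs.

    Returns (low_share, top_share):
      low_share: fraction of vertices with degree below low_degree_cutoff
      top_share: fraction of all edge endpoints held by the highest-degree
                 top_fraction of vertices (at least one vertex)
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    degrees = sorted(degree.values(), reverse=True)
    n = len(degrees)
    low_share = sum(1 for d in degrees if d < low_degree_cutoff) / n
    top_k = max(1, int(n * top_fraction))
    top_share = sum(degrees[:top_k]) / sum(degrees)
    return low_share, top_share


# Toy graph (an assumption for illustration): hub vertex 0 is connected to
# vertices 1..9, plus one extra edge between vertices 1 and 2.
edges = [(0, i) for i in range(1, 10)] + [(1, 2)]
low_share, top_share = degree_stats(edges, low_degree_cutoff=3, top_fraction=0.1)
print(low_share, top_share)
```

In this toy graph, nine of the ten vertices have fewer than three edges, while the single hub holds 9 of the 20 edge endpoints.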
Graph analysis programs are parallelized by exploiting their inherent vertex-parallelism. In other words, a certain function is applied to every vertex in the graph in parallel. Often the “vertex function” iterates over all the edges of a vertex. Graph processing systems may make use of this vertex-parallelism. A graph processing workload may be distributed across multiple server nodes that make up a cluster of server nodes. By distributing the workload over multiple server nodes, each server node is able to perform graph processing on a separate “chunk” of vertices.
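Vertex-parallelism on a single node can be sketched as follows. This is an illustrative example only: the adjacency-list representation, the names vertex_function and run_vertex_parallel, and the choice of summing neighbor values are assumptions, not details of any specific system. In standard CPython builds the global interpreter lock prevents a real speedup for CPU-bound work like this, so the sketch shows the structure of the computation rather than its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def vertex_function(graph, values, v):
    """Hypothetical vertex function: iterate over all edges of v and
    sum the values stored at its neighbors."""
    total = 0
    for neighbor in graph[v]:
        total += values[neighbor]
    return total


def run_vertex_parallel(graph, values, num_threads=4):
    """Apply vertex_function to every vertex using a pool of threads."""
    vertices = list(graph)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = pool.map(lambda v: vertex_function(graph, values, v), vertices)
        return dict(zip(vertices, results))


# Adjacency lists for a three-vertex graph (illustrative data).
graph = {0: [1, 2], 1: [0], 2: [0]}
values = {0: 10, 1: 1, 2: 2}
print(run_vertex_parallel(graph, values))
```

The cost of one call to the vertex function is proportional to the number of edges of that vertex, which is why the degree distribution matters for load balance.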
Many types of server nodes are equipped to process multiple threads at one time, using multiple hardware threads and multiple software threads for each processor running the graph processing program. This allows each server node to implement vertex-parallelism efficiently on its assigned chunk of vertices. However, exploiting vertex-parallelism may lead to serious performance issues when applied to real-world graph instances. Because a vertex function iterates over all edges belonging to a vertex, the extreme skew of the degree distribution leads to poor load balancing between threads. That is, one thread deals with the super high-degree vertices while most of the other threads deal only with low-degree vertices. Such poor load balancing adversely affects the overall performance of a server node and could completely negate the positive effects of parallelization.
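The load imbalance can be made concrete with a small calculation. The sketch below assumes a static assignment of contiguous, equal-sized ranges of vertices to threads and measures each thread's work as the number of edges its vertices own; the function name per_thread_work and the example degrees are hypothetical.

```python
def per_thread_work(degrees, num_threads):
    """Statically split vertices into contiguous, equal-sized ranges and
    return the number of edges each thread must iterate over."""
    n = len(degrees)
    per_thread = (n + num_threads - 1) // num_threads  # ceiling division
    return [sum(degrees[i:i + per_thread]) for i in range(0, n, per_thread)]


# One super high-degree vertex followed by seven low-degree vertices.
degrees = [1000] + [2] * 7
work = per_thread_work(degrees, num_threads=4)
print(work)                   # edges visited by each thread
print(sum(work) / max(work))  # upper bound on speedup over a single thread
```

With these illustrative degrees, the per-thread work is [1002, 4, 4, 4], so the achievable speedup with four threads is only about 1.01 because the first thread performs almost all of the edge visits.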
One approach to address the issue of extreme degree distribution skew is to apply chunking and work stealing. In this scheme, the vertices of a graph are partitioned into multiple chunks (or sets), where each chunk has the same (or a similar) number of vertices. Each thread picks up one chunk and processes the vertices belonging to that chunk. When a thread finishes its chunk, the thread either grabs a new chunk from its own work queue or, if that queue is empty, “steals” a chunk from another thread that still has unprocessed chunks in its respective chunk queue. Although this approach somewhat reduces the load balancing problem, it is not a perfect solution. For example, if a graph contains a super high-degree vertex to which 60% of all the vertices in the graph are connected, then the chunk that contains the super high-degree vertex will still cause significant workload imbalance, because the edges of a single vertex are not divided among threads.
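The behavior of chunking with work stealing can be illustrated with a simplified, deterministic simulation. This is a sketch under stated assumptions rather than an implementation of any particular runtime: time is counted in edge visits, chunks are dealt round-robin to per-thread queues, an idle thread steals from the tail of the longest remaining queue, and the name simulate_work_stealing is hypothetical.

```python
import heapq
from collections import deque


def simulate_work_stealing(degrees, chunk_size, num_threads):
    """Simulate chunked processing with work stealing.

    Returns (finish_time, busy), where finish_time is when the last chunk
    completes and busy[t] is the total edge visits performed by thread t.
    """
    # Partition vertices into equal-sized chunks; a chunk costs the sum
    # of the degrees of its vertices.
    costs = [sum(degrees[i:i + chunk_size])
             for i in range(0, len(degrees), chunk_size)]
    queues = [deque() for _ in range(num_threads)]
    for i, cost in enumerate(costs):
        queues[i % num_threads].append(cost)

    busy = [0] * num_threads
    idle = [(0, t) for t in range(num_threads)]  # (time thread is free, id)
    heapq.heapify(idle)
    finish_time = 0
    while idle:
        now, t = heapq.heappop(idle)
        if queues[t]:
            cost = queues[t].popleft()       # take from own queue
        else:
            victim = max(range(num_threads), key=lambda i: len(queues[i]))
            if not queues[victim]:
                continue                     # no work anywhere; thread retires
            cost = queues[victim].pop()      # steal from the victim's tail
        busy[t] += cost
        finish_time = max(finish_time, now + cost)
        heapq.heappush(idle, (now + cost, t))
    return finish_time, busy


degrees = [1000] + [2] * 15  # one super high-degree vertex
finish_time, busy = simulate_work_stealing(degrees, chunk_size=2, num_threads=4)
print(finish_time, busy)
```

In this example the other threads finish their own chunks quickly and one of them steals the remaining small chunk, yet the total running time is still dictated by the single chunk that contains the super high-degree vertex.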
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.