The present disclosure relates generally to distributed graph processing, and more specifically, to subgraph-based distributed graph processing.
Analysis of graph, or network, data is relatively complex for large datasets. To meet the challenge of processing large amounts of graph data, a number of distributed graph processing systems have emerged, such as Pregel™ and GraphLab™. Such graph processing systems divide input graphs into partitions and employ a vertex-based programming model to support iterative graph computation. In a vertex-based graph processing system, each vertex contains information about itself and all of its outgoing edges, and computation is performed at the level of a single vertex. For example, in Pregel, a common vertex-centric computation involves receiving messages from other vertices, updating the state of the vertex and its associated edges, and sending messages to other vertices. In GraphLab, a vertex may read or update its own data or the data of its neighbor vertices. In the vertex-centric model, a vertex therefore has limited information: each vertex knows only about its own neighborhood, and information is propagated through neighbor vertices one hop at a time.
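The one-hop-at-a-time propagation described above can be illustrated with a minimal sketch of a vertex-centric, message-passing loop. This is not the Pregel or GraphLab API; the function name `run_vertex_centric`, the dictionary-based graph representation, and the "keep the largest value seen" update rule are assumptions chosen only for illustration.

```python
from collections import defaultdict


def run_vertex_centric(out_edges, initial_values):
    """Toy vertex-centric loop (hypothetical, not a real framework API).

    out_edges maps each vertex to a list of its outgoing neighbors.
    Each vertex keeps the largest value it has seen so far.
    Returns the final values and the number of supersteps executed.
    """
    values = dict(initial_values)

    # Superstep 0: every vertex sends its own value along its outgoing edges.
    inbox = defaultdict(list)
    for vertex, neighbors in out_edges.items():
        for neighbor in neighbors:
            inbox[neighbor].append(values[vertex])
    supersteps = 1

    # Later supersteps: a vertex sees only the messages addressed to it,
    # updates its own state, and forwards a change to its direct neighbors.
    while inbox:
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                values[vertex] = best
                for neighbor in out_edges.get(vertex, []):
                    outbox[neighbor].append(best)
        inbox = outbox
        supersteps += 1

    return values, supersteps


# Directed chain 0 -> 1 -> 2 -> 3, with the largest value at vertex 0.
chain = {0: [1], 1: [2], 2: [3], 3: []}
final_values, steps = run_vertex_centric(chain, {0: 9, 1: 1, 2: 1, 3: 1})
```

In this sketch, the value held by vertex 0 needs one superstep per edge to reach vertex 3, because no vertex can see beyond its immediate neighbors. The number of supersteps therefore grows with the length of the path the information must travel.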