Many types of data analysis on large data sets call for, or would benefit from, a graph-oriented analysis. A graph is a data structure comprising a collection of data objects called vertices and a collection of vertex-to-vertex connections called edges. Data in which objects have relationships with other objects are naturally analyzed in graph format.
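By way of illustration only, a graph of this kind may be sketched as a set of vertices and a list of edges, from which an adjacency map is derived. The vertex names, the edge list, and the helper `build_adjacency` are hypothetical and are not drawn from any particular embodiment. The sketch further assumes undirected edges.

```python
# Hypothetical example data: vertices are data objects, edges are relationships.
vertices = {"alice", "bob", "carol", "dave"}
edges = [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("alice", "carol")]


def build_adjacency(vertices, edges):
    """Map each vertex to the set of vertices it shares an edge with.

    Assumption: edges are undirected, so each edge is recorded in both directions.
    """
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


adjacency = build_adjacency(vertices, edges)
```

In this sketch, looking up `adjacency["carol"]` returns every vertex that has a relationship with `carol`, which is the basic operation on which graph-oriented analyses build.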
In conventional graph-oriented analyses, computations follow an iterative and propagative procedure. The conventional computation begins with an initial set of active vertices and edges. In each iteration, a subset of the vertices and edges that are adjacent to the active set is selected to become the active set for the next iteration. Thus, the computation conceptually travels through the graph, walking step by step from a vertex to an adjacent vertex.
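The iterative procedure described above may be sketched, under stated assumptions, as follows. The function `propagate`, the sample graph, and the selection rule are all hypothetical. In particular, the sketch selects every not-yet-visited neighbor of the active set (a breadth-first rule), whereas a conventional system may select any subset of the adjacent vertices and edges.

```python
def propagate(adjacency, start):
    """Return the list of active vertex sets, one per iteration.

    Assumed selection rule: the next active set is every vertex adjacent to
    the current active set that has not been active before.
    """
    active = set(start)
    visited = set(start)
    history = [active]
    while True:
        next_active = set()
        for u in active:
            for v in adjacency.get(u, ()):
                if v not in visited:
                    next_active.add(v)
        if not next_active:
            break  # nothing adjacent remains to activate
        visited |= next_active
        history.append(next_active)
        active = next_active
    return history


# Hypothetical directed graph: each key lists the vertices it points to.
sample_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
steps = propagate(sample_graph, {"a"})
```

Each entry of `steps` is the active set for one iteration, so the length of `steps` indicates how many step-by-step walks the computation performed before no adjacent vertices remained.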
In many applications, a drawback of conventional graph data computation is the very large number of computational steps. A typical computation needs to consider each possible path from a source vertex to one or more destination vertices. As the path length or the total number of vertices increases, the number of paths grows at an even faster rate. Because of the high number of paths to consider when processing a large data set, conventional graph data computational systems may be too slow.
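The growth in the number of paths can be illustrated with a small sketch that counts the simple paths (paths that visit no vertex twice) between two vertices. The helpers `count_simple_paths` and `complete_graph` are hypothetical. The fully connected graph is used only as an assumed worst case, and real data sets will generally have fewer paths.

```python
def count_simple_paths(adjacency, source, target, visited=None):
    """Count paths from source to target that visit no vertex more than once."""
    if visited is None:
        visited = {source}
    if source == target:
        return 1
    total = 0
    for v in adjacency[source]:
        if v not in visited:
            visited.add(v)
            total += count_simple_paths(adjacency, v, target, visited)
            visited.remove(v)  # backtrack so other branches may use v
    return total


def complete_graph(n):
    """Assumed worst case: every vertex is connected to every other vertex."""
    return {u: [v for v in range(n) if v != u] for u in range(n)}


# Number of simple paths from vertex 0 to vertex n - 1 for n = 2..7 vertices.
counts = {n: count_simple_paths(complete_graph(n), 0, n - 1) for n in range(2, 8)}
```

Under these assumptions, graphs of 4, 5, 6, and 7 vertices yield 5, 16, 65, and 326 paths respectively, so each added vertex multiplies the path count several times over.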
In view of the foregoing, a need exists for an improved system for distributed computation of graph data that overcomes the aforementioned obstacles and deficiencies of conventional graph-oriented analysis systems.
It should be noted that the figures are not drawn to scale and that elements of similar structures or functions are generally represented by like reference numerals for illustrative purposes throughout the figures. It also should be noted that the figures are only intended to facilitate the description of the preferred embodiments. The figures do not illustrate every aspect of the described embodiments and do not limit the scope of the present disclosure.