Large data graphs are used in many applications, such as genealogy and social networking. For example, a graph may represent a family tree (or a plurality of connected family trees), with each node in the graph representing a person in the tree, and with lines or connections (“edges”) between nodes in the graph representing relationships between the people those nodes represent. Often, data is associated with each node that not only identifies a person but also provides information relating to the person (e.g., in a family tree, data at a node may include information pertaining to the person's birth, death, marriage, education, occupation, address, etc., as well as records, documents and photos concerning the person). Similarly, a graph representing a social network may use nodes to represent people and use connections or edges to represent “friendships” with other people.
A graph can be useful in representing and understanding the relationships between nodes, and in retrieving data that is stored at any individual node. However, because of the amount of data stored in a large data graph, analyzing the graph can be difficult and time-consuming. As an example, if one wanted to determine the relationship between two people without knowing much about the relationships in advance, traversing a data graph to determine the connection between those two people would require the analysis of large amounts of data. A system might need to start by accessing the two nodes representing the two people of interest, and then access all the nodes (and their data) directly or indirectly connected to those two initial nodes, extending across a large portion of the graph, until a common link is found. Traversing the graph for such a purpose requires a significant amount of processing and time.
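By way of illustration only, the traversal described above may be sketched as a breadth-first search that expands outward from both starting nodes until the two searches reach a common node. The sketch below is an assumption made for clarity and is not taken from any particular system: the function name find_connection, the adjacency-dictionary representation of the graph, and the sample names are all hypothetical, and edges are assumed to be undirected (each relationship is listed under both of its nodes).

```python
from collections import deque


def find_connection(graph, start, goal):
    """Return a list of node ids linking start to goal, or None if unconnected.

    `graph` is a hypothetical adjacency dictionary: node id -> list of
    neighboring node ids. Edges are assumed to be listed in both directions.
    """
    if start not in graph or goal not in graph:
        return None
    if start == goal:
        return [start]

    # For each side, remember how every visited node was reached.
    parents_a = {start: None}
    parents_b = {goal: None}
    frontier_a = deque([start])
    frontier_b = deque([goal])

    def expand(frontier, parents, other_parents):
        # Visit one full level of neighbors; report the first node that the
        # opposite search has already visited (the "common link").
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for neighbor in graph.get(node, ()):
                if neighbor not in parents:
                    parents[neighbor] = node
                    if neighbor in other_parents:
                        return neighbor
                    frontier.append(neighbor)
        return None

    while frontier_a and frontier_b:
        meet = expand(frontier_a, parents_a, parents_b)
        if meet is None:
            meet = expand(frontier_b, parents_b, parents_a)
        if meet is not None:
            # Walk back from the common link to start, then forward to goal.
            path = []
            node = meet
            while node is not None:
                path.append(node)
                node = parents_a[node]
            path.reverse()
            node = parents_b[meet]
            while node is not None:
                path.append(node)
                node = parents_b[node]
            return path
    return None


# Hypothetical sample data: a small family graph plus one isolated person.
family = {
    "carol": ["alice"],
    "alice": ["carol", "bob"],
    "bob": ["alice", "dave"],
    "dave": ["bob", "erin"],
    "erin": ["dave"],
    "frank": [],
}
connection = find_connection(family, "carol", "erin")
```

Even in this simplified form, every node reachable within the searched depth is visited and recorded before a common link is found, which is the source of the processing cost noted above when the graph is large.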