Online social networking services provide users with a mechanism for defining, and memorializing in a digital format, their relationships with other people. This digital representation of real-world relationships is frequently referred to as a social graph. Many social networking services utilize a social graph to facilitate electronic communications and the sharing of information between their users or members. For instance, the relationship between two members of a social networking service, as defined in the social graph of the social networking service, may determine the access and sharing privileges that exist between the two members. As such, the social graph in use by a social networking service may determine the manner in which two members of the social networking service can interact with one another via the various communication and sharing mechanisms supported by the social networking service.
One of the challenges in maintaining social graphs is understanding the relationships and connections developed by members of the social networking service. In particular, exploring and enumerating relationships within member-to-member social graphs often poses sizeable computational problems. One such computational problem is enumerating relationships among members that are connected by two degrees (i.e., through at least one intervening member that is common to a first member and a second member). In social networking, members are said to be first-degree connections when the members are directly connected without any intervening connections (e.g., a direct connection between Alice and Bob). Two members are said to be second-degree connections when there is at least one intervening member between them (e.g., Alice and Bob are directly connected; Bob and Charles are directly connected; Alice and Charles are thus second-degree connections).
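The first-degree and second-degree definitions above can be illustrated with a minimal Python sketch. The function names `build_adjacency` and `second_degree` are hypothetical, connections are assumed to be undirected, and the choice to exclude the member itself and its direct connections from the result is an assumption made for illustration, not something the text specifies.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (member, member) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def second_degree(adjacency, member):
    """Return members reachable through exactly one intervening member.

    Assumption: the member itself and its first-degree connections
    are excluded from the result.
    """
    first = adjacency.get(member, set())
    reachable = set()
    for intermediary in first:
        reachable |= adjacency.get(intermediary, set())
    return reachable - first - {member}


# Alice-Bob and Bob-Charles are direct (first-degree) connections.
graph = build_adjacency([("Alice", "Bob"), ("Bob", "Charles")])
print(second_degree(graph, "Alice"))  # {'Charles'}
```

Even in this small form, the work per member grows with the sum of the degrees of its direct connections, which hints at why the enumeration becomes expensive on large graphs.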
As social graphs represent members as nodes and connections as edges, exploring social graphs is often performed using graph processing techniques. For example, there are two typical approaches to enumerating second-degree connections within a social graph. The first involves treating the enumeration as a “join” problem and using traditional join processing approaches. However, this first approach is vulnerable to skew (e.g., a small number of highly connected members can account for a disproportionate share of the join work). The second approach involves partitioning the social graph along its vertices, and then sorting the edges within each partition by their corresponding source vertex (e.g., source node). For each source vertex n, and for every pair of neighbors [ni, nj] in its list of first-degree connections, this approach involves generating [ni, n, nj] as a candidate second-degree “hop”. Under these two approaches, once the second-degree connections have been enumerated, performing a GROUP BY operation over them involves shuffling the edges or repartitioning the social graph, which is typically a resource-intensive and computationally expensive step.
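The second approach can be sketched in a few lines of Python. This is a single-process illustration under stated assumptions, not a distributed implementation: the helper names `partition_edges`, `candidate_hops`, and `group_by_endpoints` are hypothetical, partitioning by a CRC32 hash of the source vertex is an assumed scheme, and the in-memory dictionary stands in for the shuffle that a real GROUP BY would require.

```python
import zlib
from collections import defaultdict
from itertools import combinations, groupby


def partition_edges(edges, num_partitions):
    """Partition directed edges by source vertex and sort each partition."""
    partitions = [[] for _ in range(num_partitions)]
    for a, b in edges:
        # Store both directions so every member appears as a source vertex.
        for src, dst in ((a, b), (b, a)):
            index = zlib.crc32(src.encode("utf-8")) % num_partitions
            partitions[index].append((src, dst))
    for partition in partitions:
        partition.sort()  # sorts by source vertex, then destination
    return partitions


def candidate_hops(partition):
    """Yield [ni, n, nj] for every pair of neighbors of each source vertex n."""
    for n, group in groupby(partition, key=lambda edge: edge[0]):
        neighbors = sorted({dst for _, dst in group})
        for ni, nj in combinations(neighbors, 2):
            yield (ni, n, nj)


def group_by_endpoints(partitions):
    """Group candidate hops by their endpoint pair (ni, nj).

    Hops for the same pair can originate in different partitions, so in a
    distributed setting this step is where data would be shuffled.
    """
    grouped = defaultdict(list)
    for partition in partitions:
        for ni, n, nj in candidate_hops(partition):
            grouped[(ni, nj)].append(n)
    return grouped


edges = [("Alice", "Bob"), ("Bob", "Charles"),
         ("Alice", "Dave"), ("Dave", "Charles")]
for pair, via in group_by_endpoints(partition_edges(edges, 3)).items():
    print(pair, sorted(via))
```

In this example the pair (Alice, Charles) is reached through both Bob and Dave, whose edges may sit in different partitions, so collecting all intermediaries for one pair requires moving data between partitions. A source vertex with d neighbors also produces d(d-1)/2 candidate hops, which is why highly connected members dominate the cost.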