Counting triangles in a graph has become an increasingly important task in many domains. A single triangle in a graph indicates that three nodes in the graph are all related to one another. For example, A is connected to B and C, and B is also connected to C.
Knowing the number of triangles in a graph is, by itself, useful in some applications. More often, an accurate triangle count serves as a building block for other graph analysis or graph mining tasks. For example, given the number of triangles in a graph, one can compute the clustering coefficient of the graph, which is a measure of the “community-ness” of the graph. The clustering coefficient is calculated by dividing the number of closed triangles by the sum of the number of closed triangles and the number of open triangles. A “closed triangle” is one where all three nodes are related to each other (e.g., A is connected to B and C, and B and C are also connected to each other), while an “open triangle” is one where two nodes are connected to a third node but not to each other (e.g., A is connected to B and C, but B and C are not connected to each other).
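As an illustration only, the calculation described above can be sketched in a few lines of Python. The sketch assumes an undirected graph supplied as a list of node pairs and reads the definition literally: each closed triangle is counted once, and each open triangle is counted once per pair of unconnected neighbors of a shared node. The function names count_triangles_and_open and clustering_coefficient are hypothetical and are not taken from any particular system. Other conventions weight each closed triangle by three; that variant is not shown here.

```python
from itertools import combinations


def count_triangles_and_open(edges):
    """Return (closed_triangles, open_triangles) for an undirected graph.

    `edges` is an iterable of (u, v) node pairs; self-loops are ignored and
    duplicate edges collapse because neighbors are stored in sets.
    """
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue  # a self-loop cannot be part of a triangle
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    closed_wedges = 0   # neighbor pairs that ARE connected to each other
    open_triangles = 0  # neighbor pairs that are NOT connected to each other
    for neighbors in adjacency.values():
        for a, b in combinations(neighbors, 2):
            if b in adjacency[a]:
                closed_wedges += 1
            else:
                open_triangles += 1

    # Each closed triangle is seen once from each of its three nodes.
    closed_triangles = closed_wedges // 3
    return closed_triangles, open_triangles


def clustering_coefficient(edges):
    """closed / (closed + open), following the definition given in the text."""
    closed, opened = count_triangles_and_open(edges)
    total = closed + opened
    return closed / total if total else 0.0


# A, B, and C form a closed triangle; D hangs off C, creating two open triangles
# (A-C-D and B-C-D).
example_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
closed, opened = count_triangles_and_open(example_edges)
coefficient = clustering_coefficient(example_edges)
```

This brute-force enumeration examines every pair of neighbors of every node, so its cost grows quickly with node degree. It is meant only to make the definitions concrete, not to suggest an approach suitable for large graphs.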
However, the actual computation time for counting (or estimating) the number of triangles in a graph is very large when implemented in conventional systems. For example, one implementation involves over 1,000 instances of Hadoop nodes and requires more than six hours to compute the number of triangles in a particular Twitter graph. Such hardware requirements and latency are unacceptable.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.