Graphs may be used to model many different kinds of data. In a graph database, a graph has one or more vertices connected by one or more edges. Each vertex has a value and an identifier. A graph database may represent a directed or an undirected graph. In a directed graph database, each edge specifies a source vertex and a target vertex. In an undirected graph database, each edge specifies a pair of vertices with no order or name implied. An undirected graph may be modeled in a directed graph database by representing each undirected edge as a pair of directed edges with the source and target vertices reversed. Edges in a graph database may or may not be labeled to specify the kind of relation that the edge indicates between its vertices.
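By way of illustration only, the data model described above may be sketched as follows. The names Edge, DirectedGraph, add_vertex, add_edge, and add_undirected_edge are hypothetical and chosen for this example; they are not taken from any particular graph database. The sketch assumes string identifiers and treats the edge label as optional.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Edge:
    """A directed edge; the label is optional (hypothetical structure)."""
    source: str
    target: str
    label: Optional[str] = None


@dataclass
class DirectedGraph:
    """Minimal directed graph: vertex identifier -> value, plus a set of edges."""
    vertices: dict = field(default_factory=dict)
    edges: set = field(default_factory=set)

    def add_vertex(self, vertex_id, value):
        self.vertices[vertex_id] = value

    def add_edge(self, source, target, label=None):
        # Both endpoints must already exist as vertices.
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must already be vertices")
        self.edges.add(Edge(source, target, label))

    def add_undirected_edge(self, a, b, label=None):
        # Model an undirected edge as two directed edges with
        # source and target reversed.
        self.add_edge(a, b, label)
        self.add_edge(b, a, label)


g = DirectedGraph()
g.add_vertex("v1", "Alice")
g.add_vertex("v2", "Bob")
g.add_vertex("v3", "Acme")
g.add_edge("v1", "v3", "works_at")        # directed, labeled
g.add_undirected_edge("v1", "v2", "knows")  # stored as two directed edges
```

In this sketch the single undirected "knows" relation occupies two entries in the edge set, one per direction, while the directed "works_at" relation occupies one.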
A query over a graph database is accomplished by sub-graph pattern matching, followed by projecting the desired values out of the matched pattern to form the result. A sub-graph is a collection of vertices and edges that is a subset of the vertices and edges contained in the graph represented in a graph database. A sub-graph pattern is an abstract representation of a sub-graph in which some vertices and/or edges have been specified as variables that must be satisfied by any sub-graph that matches the sub-graph pattern. A projection of values from a sub-graph that matches a sub-graph pattern is a selection of the vertices that satisfy the variables in the sub-graph pattern.
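As a non-limiting example, pattern matching followed by projection may be sketched as below. The sketch assumes that edges are stored as (source, label, target) tuples and that a pattern term beginning with "?" is a variable; both conventions, and the function names is_variable, match_pattern, and project, are assumptions made for this example. The matcher is a naive backtracking search that allows two variables to bind to the same vertex, and it is not intended to represent any particular query engine.

```python
def is_variable(term):
    """A pattern term starting with '?' is treated as a variable (assumed convention)."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(edges, pattern, binding=None):
    """Yield each variable binding under which every pattern edge occurs in `edges`."""
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for edge in edges:
        extended = dict(binding)
        consistent = True
        for p_term, g_term in zip(first, edge):
            if is_variable(p_term):
                # Bind the variable, or check it against an earlier binding.
                if extended.setdefault(p_term, g_term) != g_term:
                    consistent = False
                    break
            elif p_term != g_term:
                consistent = False
                break
        if consistent:
            yield from match_pattern(edges, rest, extended)


def project(bindings, variables):
    """Select the values bound to the requested variables from each match."""
    return [tuple(b[v] for v in variables) for b in bindings]


edges = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "works_at", "acme"),
    ("carol", "works_at", "acme"),
]

# Pattern: ?x knows ?y, and ?y knows ?z. Project only ?x and ?z.
pattern = [("?x", "knows", "?y"), ("?y", "knows", "?z")]
results = project(match_pattern(edges, pattern), ["?x", "?z"])
```

Here the only matching sub-graph is alice, bob, carol joined by two "knows" edges, so the projection contains the single pair ("alice", "carol"); the intermediate vertex bound to ?y is matched but not projected.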
To reduce the time required to perform a query over a graph, indexing of the edges between vertices can be combined with indexing of the values represented by the vertices. Current approaches use such indices to find and aggregate simple pattern matches, which can then be combined to generate the results of complex query patterns; however, this approach can be difficult to scale and must be performed largely serially. Accordingly, for at least these reasons, it is desired to provide improved techniques for graph pattern matching and querying.
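For illustration, the index-based approach described above may be sketched as follows. The sketch assumes a value index keyed by vertex value and an edge index keyed by (source, label); the names value_index, edge_index, and two_hop are hypothetical. Each index lookup yields a simple single-edge match, and the nested loops combine those matches one step at a time, which reflects the serial character noted above.

```python
from collections import defaultdict

vertex_values = {"v1": "Alice", "v2": "Bob", "v3": "Carol"}
edges = [("v1", "knows", "v2"), ("v2", "knows", "v3"), ("v1", "knows", "v3")]

# Value index: vertex value -> set of vertex identifiers holding that value.
value_index = defaultdict(set)
for vertex_id, value in vertex_values.items():
    value_index[value].add(vertex_id)

# Edge index: (source, label) -> set of target vertex identifiers.
edge_index = defaultdict(set)
for source, label, target in edges:
    edge_index[(source, label)].add(target)


def two_hop(start_value, label):
    """Combine two single-edge index lookups into a two-edge path match."""
    results = set()
    for start in value_index.get(start_value, ()):
        # First simple match: edges leaving the start vertex.
        for mid in edge_index.get((start, label), ()):
            # Second simple match depends on the result of the first.
            for end in edge_index.get((mid, label), ()):
                results.add((start, mid, end))
    return results


paths = two_hop("Alice", "knows")
```

In this sketch the second lookup cannot begin until the first has produced a candidate intermediate vertex, so the work for a longer pattern grows as a chain of dependent lookups.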