Graph analysis is a subfield of data analysis that generally encompasses systems and methods for analyzing datasets modelled as graphs. The graphs that are analyzed typically organize the underlying dataset into a set of nodes or vertices connected by edges, each of which may have a particular direction. A graph captures fine-grained, arbitrary relationships between different data entities within a dataset. Graphs can be used to model a wide variety of systems and relationships including, without limitation, communication networks, linguistic structures, social networks, data hierarchies, and other physical or virtual systems. By analyzing the relationships captured by a graph, data scientists, applications, or other users can obtain valuable insights about the original dataset.
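For illustration only, the following minimal sketch shows one way a dataset might be organized as vertices connected by directed edges. The class name `DirectedGraph`, its methods, and the sample vertex names are hypothetical and are not drawn from any particular graph processing system; an adjacency-set representation is assumed purely for brevity.

```python
from collections import defaultdict


class DirectedGraph:
    """Hypothetical minimal directed graph stored as adjacency sets."""

    def __init__(self):
        self._out = defaultdict(set)  # vertex -> set of successor vertices
        self._vertices = set()        # every vertex seen as a source or target

    def add_edge(self, src, dst):
        """Add a directed edge src -> dst, registering both endpoints."""
        self._vertices.update((src, dst))
        self._out[src].add(dst)

    def neighbors(self, vertex):
        """Return a copy of the vertices reachable by one outgoing edge."""
        return set(self._out.get(vertex, ()))

    def vertices(self):
        """Return a copy of the vertex set."""
        return set(self._vertices)


# Illustrative data: a tiny "follows" relationship in a social network.
g = DirectedGraph()
g.add_edge("alice", "bob")
g.add_edge("bob", "carol")
g.add_edge("alice", "carol")
```

In this sketch, direction matters: `g.neighbors("alice")` contains "bob", whereas `g.neighbors("bob")` does not contain "alice".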
Graph analysis is often performed in an exploratory manner. For instance, a data scientist may apply different analysis algorithms to the dataset (or a subset of it) in an ad hoc manner until some valuable insight about the dataset is obtained. In order to support such exploratory use cases, some traditional database management systems (DBMSs) and specialized graph processing systems provide command-line front-ends through which users may submit database queries and procedures. According to one such approach, a general shell application is used to submit Structured Query Language (SQL) statements and Procedural Language/Structured Query Language (PL/SQL) blocks to a database server. Such generalized shell applications allow a user to query and perform standard database operations on graph objects, but generally do not provide any specialized support for performing graph analysis operations. Therefore, the interactivity and operability of such generalized shell applications are significantly limited.
According to another approach, a specialized shell application may be configured to support domain-specific graph languages, such as Gremlin. These shells allow users to submit commands specifically tailored for graph analysis. For instance, the graph language may support pre-defined graph operations for manipulating graph objects via graph traversals. These shells are typically built upon general interpreter frameworks and rely on the type-checking capability of the baseline systems. Generalized type-checking does not capture the nuances of a graph analysis environment, which may lead to unintentional and potentially costly errors on the part of the user. As an example, the user may apply a particular algorithm, intended for a bipartite graph, to a non-bipartite graph. If such a mismatch is left unchecked, data may become corrupted or otherwise unreliable during graph analysis. Consequently, the user may incorrectly interpret the data and/or overlook potentially useful insights.
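The bipartite example can be made concrete with a short sketch of the kind of graph-specific check that generalized type-checking does not perform. It is offered only as an illustration under stated assumptions: the graph is undirected and given as a dictionary mapping each vertex to its neighbors, and the names `is_bipartite`, `GraphTypeError`, and `run_bipartite_only` are hypothetical rather than part of Gremlin or any existing shell.

```python
from collections import deque


def is_bipartite(adjacency):
    """Return True if the undirected graph can be 2-colored (BFS coloring).

    `adjacency` is assumed to map each vertex to an iterable of neighbors,
    with every edge listed in both directions.
    """
    color = {}
    for start in adjacency:
        if start in color:
            continue  # already colored as part of an earlier component
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency.get(u, ()):
                if v not in color:
                    color[v] = 1 - color[u]  # put v on the opposite side
                    queue.append(v)
                elif color[v] == color[u]:
                    return False  # edge inside one side: odd cycle
    return True


class GraphTypeError(Exception):
    """Hypothetical error for a graph that violates an algorithm's precondition."""


def run_bipartite_only(algorithm, adjacency):
    """Run `algorithm` only after confirming the graph is bipartite."""
    if not is_bipartite(adjacency):
        raise GraphTypeError("algorithm requires a bipartite graph")
    return algorithm(adjacency)


# Illustrative inputs: a 4-cycle is bipartite, a triangle is not.
square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
```

Under these assumptions, `run_bipartite_only(len, square)` proceeds normally, whereas the same call with `triangle` raises `GraphTypeError` before the algorithm touches the data.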
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.