Graph analysis is a recently popularized methodology in data analytics. In graph analysis, a dataset is represented as a graph where data entities become vertices, and relationships between them become edges of the graph. Through this graph representation, it may be tractable to analyze fine-grained relationships between data entities.
In practice, however, data scientists may find it convenient to (temporarily and/or contextually) mutate the graph into a different form for the sake of analysis or display. For instance, when analyzing a graph composed of phone calls between people, the original graph may have many edges between the same pair of vertices, as each edge may represent one phone call. However, the data scientist may want to aggregate all the phone calls between the same pair of people into a single edge (i.e., to simplify a graph that has multi-edges).
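The phone-call example can be made concrete with a minimal Python sketch. It assumes call records arrive as (caller, callee, duration) tuples and that calls are treated as undirected; the sample records, the function name `collapse_multi_edges`, and the property names `call_count` and `total_duration` are hypothetical and are not taken from any particular framework.

```python
from collections import defaultdict

# Hypothetical call records: (caller, callee, duration in seconds).
# Each record is one edge, so the same pair of people may appear many times.
calls = [
    ("alice", "bob", 120),
    ("bob", "alice", 45),
    ("alice", "bob", 300),
    ("alice", "carol", 60),
]


def collapse_multi_edges(edges):
    """Aggregate all edges between the same pair of people into one edge."""
    merged = defaultdict(lambda: {"call_count": 0, "total_duration": 0})
    for src, dst, duration in edges:
        # Assumption: direction is ignored, so (bob, alice) and
        # (alice, bob) map to the same key.
        key = tuple(sorted((src, dst)))
        merged[key]["call_count"] += 1
        merged[key]["total_duration"] += duration
    return dict(merged)


simple_graph = collapse_multi_edges(calls)
```

Here the three calls between alice and bob become a single edge that carries a count and a summed duration. Counting and summing are only one possible choice of aggregation; other analyses might keep the maximum, the average, or the most recent call instead.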
Unfortunately, current graph processing frameworks are not very good at handling contextual graph mutations. For example, systems like Neo4j and GraphX do not support graph mutation at all, and a user needs to expressly rebuild the graph model. In other frameworks, graph mutations are generally inconvenient as well, due to a rigid application programming interface (API) and rigid semantics for edge properties. For example, the semantics of edge properties may be undefined when multi-edges are collapsed into one.
Although one network analysis package, igraph, provides some functionality for simplification, that functionality is restricted. The user may only merge properties. Selecting edges based on a criterion is unsupported. Furthermore, API invocation may be unwieldy (unreadable and error prone) because the user must write out all parameters in a long list whenever calling the mutation method.
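The difference between merging properties and selecting an edge can be illustrated with a small Python sketch. It deliberately uses plain dictionaries rather than the API of igraph or any other library; the property names `duration` and `day` and the helper names `merge_properties` and `select_edge` are hypothetical.

```python
# Hypothetical multi-edges between one pair of vertices.
# Each dictionary holds the properties of one edge.
parallel_edges = [
    {"duration": 120, "day": "Mon"},
    {"duration": 300, "day": "Tue"},
    {"duration": 45, "day": "Fri"},
]


def merge_properties(edges):
    """Merge: combine property values into a new edge.

    Only "duration" is combined here, because "day" has no obvious
    merged value. None of the original edges survives intact.
    """
    return {"duration": sum(e["duration"] for e in edges)}


def select_edge(edges, key):
    """Select: keep one existing edge, chosen by a caller-supplied criterion."""
    return max(edges, key=key)


merged = merge_properties(parallel_edges)
longest = select_edge(parallel_edges, key=lambda e: e["duration"])
```

Merging yields a synthetic edge with a summed duration, whereas selection returns the longest call together with all of its original properties. A facility that offers only merging cannot express the second result.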