Graph analysis is a recently popularized way of analyzing associative data. By modeling the data set as a logical graph, it considers not only the properties of entities but also the relationships between them. Typically, a user applies several graph algorithms to such a graph data model.
When modeling the data set as a graph, the user may want to adopt different types of graphs, as naturally suggested by the kind of underlying data. For example, modeling may produce a directed graph (where the two vertices of an edge are distinguished as source and destination) or an undirected graph (where there is no such distinction).
Most graph algorithms are designed for a generic kind of graph, although some are defined only for specific graph types (e.g. directed graphs). When a graph algorithm originally designed for a general graph is applied to a specific type of graph, there is an opportunity for performance optimization based on manual redesign. For example, a weakly connected component algorithm designed for directed graphs can be redesigned as a connected component algorithm for undirected graphs.
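The redesign mentioned above can be sketched as follows. This is only an illustrative sketch, not a specific implementation: the function names and the dictionary-based adjacency representation (`out_nbrs`, `in_nbrs`, `nbrs`) are assumptions made for the example. The directed version must follow edges in both directions to ignore edge orientation, whereas the undirected variant iterates a single neighbor list.

```python
from collections import deque


def weakly_connected_components(out_nbrs, in_nbrs):
    """Directed graph: label each vertex with a representative of its
    weakly connected component. Edge direction is ignored, so both the
    outgoing and the incoming neighbors of a vertex are visited."""
    comp = {}
    for start in out_nbrs:
        if start in comp:
            continue
        comp[start] = start
        queue = deque([start])
        while queue:
            v = queue.popleft()
            # Two neighbor lists have to be traversed per vertex.
            for w in list(out_nbrs[v]) + list(in_nbrs[v]):
                if w not in comp:
                    comp[w] = start
                    queue.append(w)
    return comp


def connected_components(nbrs):
    """Undirected graph: the same traversal, manually specialized to a
    single neighbor list per vertex."""
    comp = {}
    for start in nbrs:
        if start in comp:
            continue
        comp[start] = start
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in nbrs[v]:
                if w not in comp:
                    comp[w] = start
                    queue.append(w)
    return comp


# Hypothetical example data: directed edges a->b, c->b, d->e.
out_nbrs = {"a": ["b"], "b": [], "c": ["b"], "d": ["e"], "e": []}
in_nbrs = {"a": [], "b": ["a", "c"], "c": [], "d": [], "e": ["d"]}
# The same edges viewed as undirected.
nbrs = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": ["e"], "e": ["d"]}

directed_result = weakly_connected_components(out_nbrs, in_nbrs)
undirected_result = connected_components(nbrs)
```

The two functions are nearly identical apart from the neighbor iteration, which shows how a hand-written specialization duplicates most of the original algorithm.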
However, maintaining specialized variants of an algorithm for different graph types introduces costs. Manual redesign is error prone, and it also leads to code duplication and dual maintenance.
An alternative is to forgo dedicated optimization and instead use a unified (e.g. polymorphic) application programming interface (API) for all graph types for activities such as iterating over the neighbors of a vertex. A polymorphic API can mask differences in implementations of a graph algorithm, such as different ways of accessing graph edge properties or different ways of iterating over neighbors. However, using a polymorphic API introduces a significant and unnecessary runtime overhead.
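A unified interface of this kind might look like the following minimal sketch. The class and method names (`Graph`, `DirectedGraph`, `UndirectedGraph`, `vertices`, `neighbors`, `max_degree`) are hypothetical and chosen only for illustration. The algorithm is written once against the abstract interface, and every neighbor access is resolved at run time to whichever concrete graph type is in use.

```python
from abc import ABC, abstractmethod


class Graph(ABC):
    """Hypothetical unified API shared by all graph types."""

    @abstractmethod
    def vertices(self):
        """Return an iterator over all vertices."""

    @abstractmethod
    def neighbors(self, v):
        """Return an iterator over the neighbors of vertex v."""


class DirectedGraph(Graph):
    def __init__(self, edges):
        self._out = {}
        for src, dst in edges:
            self._out.setdefault(src, []).append(dst)
            self._out.setdefault(dst, [])  # make sure dst is a known vertex

    def vertices(self):
        return iter(self._out)

    def neighbors(self, v):
        # In this sketch, "neighbors" of a directed graph are out-neighbors.
        return iter(self._out[v])


class UndirectedGraph(Graph):
    def __init__(self, edges):
        self._adj = {}
        for u, v in edges:
            self._adj.setdefault(u, []).append(v)
            self._adj.setdefault(v, []).append(u)

    def vertices(self):
        return iter(self._adj)

    def neighbors(self, v):
        return iter(self._adj[v])


def max_degree(graph):
    """Generic algorithm: works for any Graph, but each neighbors() call
    is dispatched dynamically on the concrete graph type."""
    return max(sum(1 for _ in graph.neighbors(v)) for v in graph.vertices())


edges = [("a", "b"), ("c", "b")]
directed_max = max_degree(DirectedGraph(edges))      # counts out-edges only
undirected_max = max_degree(UndirectedGraph(edges))  # counts all incident edges
```

The algorithm code is shared, which avoids duplication, but the indirection on every neighbor access sits in the innermost loop of the algorithm, which is where the runtime overhead described above would arise.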
Furthermore, using polymorphism may limit the optimizations that a domain specific language (DSL) compiler can perform on a graph algorithm. Because a generic graph type is broader, it carries less information, such as metadata. With less information available, the compiler can make fewer assumptions and, thus, has fewer optimizations available.