The present invention relates generally to data processing and visualization and, more particularly, to the analysis of massive multi-digraphs.
Massive data sets bring with them a series of special computational and visual challenges. Many of these data sets can be modeled as very large but sparse multi-digraphs with sets of edge attributes that represent particular characteristics of the application at hand. Geographic information systems, telecommunication traffic, and Internet data are examples of the type of data that can be so modeled. Understanding the structure of the underlying multi-digraph is essential for storage organization and information retrieval. Unfortunately, traditional methods of dealing with graphs fail when the graphs are very large, for at least two reasons. First, the substantial gap between CPU speeds and external memory access speeds causes a severe input/output bottleneck that often requires the use of an external memory algorithm. Second, there is a screen bottleneck when visualizing massive graphs, caused by the simple fact that the amount of information that can be displayed at once is ultimately limited by the number of available pixels and the speed at which information is digested by the user.
In accordance with the invention, the input/output and screen bottlenecks may be dealt with in a unified manner by extracting subgraphs out of very large multi-digraphs. In accordance with a feature of the invention, a hierarchy of multi-digraph layers may be constructed on top of the input multi-digraph. Each layer represents a multi-digraph obtained from an equivalence relation defined on the edge set of the input multi-digraph. Each layer edge represents an equivalence class of edges at the previous layer. In accordance with a further feature of the invention, each subgraph is small enough to be represented visually in a variety of novel ways. Unlike conventional visual graph representations that draw graphs as nodes and edges, the disclosed visualization techniques more easily display dense subsets of edges. Where the vertex sets are hierarchically labeled, it is advantageous to provide hierarchical browsing where representations can be chosen automatically based on the properties of the data.
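By way of illustration only, the layering described above may be sketched as follows in Python. The sketch assumes that the equivalence relation on edges is induced by a hierarchical labeling of the vertices, so that two edges are equivalent when their endpoints share the same parent labels, and that each edge carries a single numeric weight that is summed per class. The names build_layer, build_hierarchy, and parents_per_level are hypothetical and are not taken from the disclosure; other equivalence relations and attribute aggregations are equally possible.

```python
from collections import defaultdict


def build_layer(edges, parent):
    """Collapse one layer of a multi-digraph into the next layer.

    edges  : list of (tail, head, weight) tuples; parallel edges are allowed.
    parent : dict mapping each vertex to its label at the next layer
             (assumed hierarchical labeling, not mandated by the text).

    Two edges fall in the same equivalence class when they share the pair
    (parent[tail], parent[head]). Each class becomes one edge of the new
    layer, weighted by the sum of its members (an assumed aggregation).
    """
    classes = defaultdict(list)
    for tail, head, weight in edges:
        classes[(parent[tail], parent[head])].append((tail, head, weight))

    layer_edges = [
        (p_tail, p_head, sum(w for _, _, w in members))
        for (p_tail, p_head), members in classes.items()
    ]
    return layer_edges, dict(classes)


def build_hierarchy(edges, parents_per_level):
    """Stack layers on top of the input multi-digraph, one per parent map."""
    layers = [list(edges)]
    for parent in parents_per_level:
        layer_edges, _ = build_layer(layers[-1], parent)
        layers.append(layer_edges)
    return layers


# Toy input: four edges, two of them parallel (a1 -> b1).
edges = [("a1", "b1", 3), ("a1", "b1", 2), ("a2", "b1", 1), ("b1", "a1", 4)]
level_1 = {"a1": "A", "a2": "A", "b1": "B"}
level_2 = {"A": "root", "B": "root"}

layers = build_hierarchy(edges, [level_1, level_2])
for depth, layer in enumerate(layers):
    print(depth, layer)
```

In this toy input, the three edges from a1 and a2 to b1 collapse into a single layer edge from A to B, and the class membership returned by build_layer identifies the subgraph of the previous layer that a given layer edge stands for, which is the unit that could then be extracted and displayed.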
These and other advantages of the invention will be apparent to those of ordinary skill in the art by reference to the following detailed description and the accompanying drawings.