Graphs provide a powerful formalism and visual representation for modeling objects and their relationships. Informally, a graph is simply a collection of vertices or nodes, pairs of which are connected by edges. More formally, a graph is a set of vertices with an adjacency relation between vertices. The edges may be undirected (i.e., symmetric) or directed (i.e., asymmetric). In addition, weights may be attached to the nodes, in which case the graph is called a network.
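As a minimal sketch of the definition above, a graph can be held as a map from each vertex to the set of vertices adjacent to it. The function name build_graph and the sample vertex names are assumptions made for illustration only; an undirected edge is stored in both directions (symmetric) and a directed edge in one direction (asymmetric).

```python
from collections import defaultdict

def build_graph(edges, directed=False):
    """Return an adjacency map {vertex: set of adjacent vertices} from (u, v) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v]  # touch v so that it appears as a vertex even with no outgoing edge
        if not directed:
            adjacency[v].add(u)  # symmetric relation for undirected edges
    return dict(adjacency)

# Hypothetical sample data.
edges = [("a", "b"), ("b", "c")]
undirected = build_graph(edges)
directed = build_graph(edges, directed=True)
```

In the undirected graph, "b" is adjacent to both "a" and "c"; in the directed graph, "b" points only to "c".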
One type of graph is a k-partite graph, defined as a graph whose vertices can be partitioned into k disjoint sets so that no two vertices within the same set are adjacent. See Deo, N., 1974, entitled Graph theory with applications to engineering and computer science, P. 168-169, Prentice-Hall, Inc: Englewood Cliffs, N. J. A special case occurs when k=2; this type of graph is called a bipartite graph. In a bipartite graph there are two sets, each node is a member of exactly one set, and all of a node's connections are to nodes in the other set. Bipartite graphs are important in social network analysis, where they are called two-mode networks, affiliation networks, or actor networks. See Borgatti, A. & Everett, M. G., 1997, entitled Network analysis of 2-mode data, available online at http://www.analytictech.com/borgatti/2mode.htm. Commonly an affiliation network is used to see the relationships between a group of people via a set of events in which they participate. When modeling or graphing these relationships, one set of nodes, or mode, is the people. The other set of nodes, or mode, is the events. Whenever a person participates in an event, there is an edge connecting the two. Affiliation networks express the social relationships of the people involved, so properties of the people and the events can be derived from them: for example, which event was attended by the most people, which person went to the most events, and which people and events are most central, i.e., do the best job of tying together the group.
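The following is a minimal sketch of an affiliation network and two of the properties mentioned above. The attendance records and all names in it are hypothetical sample data, not taken from any cited system. Each (person, event) pair is one edge of the bipartite graph, and the counts are simply the degrees of the nodes in each mode.

```python
from collections import defaultdict

# Hypothetical attendance records: each (person, event) pair is one edge.
attendance = [
    ("ann", "picnic"), ("ann", "concert"),
    ("bob", "picnic"),
    ("cat", "picnic"), ("cat", "concert"), ("cat", "lecture"),
]

events_of = defaultdict(set)  # person -> events that person attended
people_at = defaultdict(set)  # event -> people who attended it
for person, event in attendance:
    events_of[person].add(event)
    people_at[event].add(person)

# The graph is bipartite as long as no name is used in both modes.
modes_are_disjoint = not (set(events_of) & set(people_at))

# Degree-based properties: the largest event and the most active person.
most_attended_event = max(people_at, key=lambda e: len(people_at[e]))
most_active_person = max(events_of, key=lambda p: len(events_of[p]))
```

Measures of centrality other than simple degree would require additional analysis that this sketch does not attempt.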
A key problem in analyzing social networks is acquiring and storing the data. Often this data is accumulated manually by having people fill out surveys summarizing their participation in events; these data are then tabulated. Alternatively, traces of people's social behavior can be gleaned from computer records. For example, a system called Netscan (referenced in Xiong, R., Smith, M. A., and Drucker, S. dated October 1998, entitled Visualizations of Collaborative Information for End-Users, Microsoft Technical Report No. MST-TR-98-52, also online at: research.microsoft.com/~sdrucker/papers/collabvizchi99.doc) automatically scans Usenet archives and associates authors with the messages they post. The resulting graph is 2-mode, or bipartite, since its nodes can be divided into two sets (authors and messages) and a node in one set connects only to nodes in the other set. These graphs are visualized to help Usenet users trace through connections between authors and their postings.
The field of graph visualization, a subfield of information visualization, seeks to provide techniques and systems to aid in the inspection, navigation, and analysis of graphs. This includes the question of how to lay out the graph so people can see the relationships between nodes, and how to provide interfaces that allow these relationships to be dynamically manipulated. A general goal of viewing and interacting with graphs is providing the ability to focus in on regions of interest, while providing sufficient context or background to aid in the interpretation of the foreground or focal information. A good survey of graph visualization techniques divides its review into 1) Graph layout methods: Deciding where to place the nodes and links; 2) Navigation and interaction: How the user moves around the graph and manipulates it; and 3) Clustering: Simplifying the graph by grouping or aggregating nodes. See Herman, I., Melancon, G., & Marshall, M., 2000, entitled Graph visualization and navigation in information visualization: a survey in IEEE Transactions on Visualization and Computer Graphics 6(1), 24-43.
An important operation for simplifying graphs, provided by many systems, is filtering. Filtering a graph means removing nodes according to set criteria. For example, dynamic controls can be provided that select which nodes should be retained. See Becker, R. A., Eick, S. G., & Wilks, A. R., 1995, entitled Visualizing network data. IEEE Transactions on Visualization and Computer Graphics. 1(1). 16-28.
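Filtering can be sketched as follows, assuming the graph is stored as a map from each node to the set of its neighbors. The function filter_graph, the sample graph, and the degree threshold are hypothetical choices for illustration; the threshold stands in for whatever value a dynamic control would supply.

```python
def filter_graph(adjacency, keep):
    """Return a new graph containing only the nodes for which keep(node) is true.

    Edges that touch a removed node are dropped along with it.
    """
    kept = {node for node in adjacency if keep(node)}
    return {node: adjacency[node] & kept for node in kept}

# Hypothetical sample graph: node "d" has no connections.
graph = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "d": set()}

# Example criterion: retain nodes with at least min_degree connections.
min_degree = 1
filtered = filter_graph(graph, lambda node: len(graph[node]) >= min_degree)
```

The original graph is left unchanged, so the criterion can be adjusted and reapplied without reloading the data.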
Other powerful simplification operators use hierarchies, either implicit in the graph (intrinsic or structural) or defined elsewhere (extrinsic). These hierarchies can simplify graphs directly (e.g., only presenting the remaining hierarchy) or by providing assistance in analyzing the graph. A strict hierarchy or tree is defined as a directed graph where every node other than the root has exactly one parent, i.e., exactly one node that points to it. A more flexible hierarchy is a directed acyclic graph, where nodes may have more than one parent but no cycles or loops exist in the graph. One system that explored the use of hierarchies for graph visualization extensively provides facilities for (1) aggregating the graph into its bi-connected components, (2) viewing a spanning tree of the graph via TreeMaps (a space-filling version of a tree), and (3) extracting a subset of the hierarchies to show a focal node and its nearby relatives in order to provide a sense of context for the node that explains how it fits into the overall graph. See Rivlin, E., Botafogo, R. & Shneiderman, B. Navigating in hyperspace: Designing a structure-based toolbox. Communications of the ACM, 37:87-96, 1994.
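The distinction between the two kinds of hierarchy can be checked mechanically. The sketch below is illustrative only and is not drawn from the cited system: it assumes the graph is given as a map from every node to a list of its children, uses Kahn's topological-sort algorithm to test for cycles, and treats a strict hierarchy as an acyclic graph with a single root in which every other node has exactly one parent. The function names and sample graphs are hypothetical.

```python
from collections import deque

def parent_counts(children):
    """Count how many parents point to each node."""
    counts = {node: 0 for node in children}
    for targets in children.values():
        for target in targets:
            counts[target] += 1
    return counts

def is_acyclic(children):
    """Kahn's algorithm: the graph is acyclic if every node can be removed in topological order."""
    remaining = parent_counts(children)
    queue = deque(node for node, count in remaining.items() if count == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for target in children[node]:
            remaining[target] -= 1
            if remaining[target] == 0:
                queue.append(target)
    return removed == len(children)

def is_strict_hierarchy(children):
    """A tree: acyclic, exactly one root, and every other node has exactly one parent."""
    counts = list(parent_counts(children).values())
    return is_acyclic(children) and counts.count(0) == 1 and all(c <= 1 for c in counts)

# Hypothetical sample graphs; every node appears as a key.
tree = {"r": ["a", "b"], "a": ["c"], "b": [], "c": []}
dag = {"r": ["a", "b"], "a": ["c"], "b": ["c"], "c": []}  # "c" has two parents
cyclic = {"a": ["b"], "b": ["a"]}
```

Here tree satisfies both tests, dag is acyclic but not a strict hierarchy, and cyclic fails both.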
A common graph visualization problem is how to label nodes in the graph. This problem is especially important when the nodes in a graph represent lengthy text objects such as word processing documents or web pages. Solving this problem is similar to finding a brief summary for a document. There are many well-known algorithms for extracting salient text units from a document collection. One approach assumes that text units with a uniform distribution over the collection of documents are not salient and should be filtered out. Another approach is to see whether the frequency of a text unit in the text is high relative to its frequency in a corpus of background text. See Moens, M. F., 2000, entitled Automatic Indexing and Abstracting of Document Texts. P. 89-97. Kluwer Academic Publishers: Boston, Mass. In this technique each term, made up of one or more consecutive words, is assigned a tf*idf weight, which stands for term frequency times inverse document frequency.
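One common variant of the tf*idf weight can be sketched as follows. The exact formula is an assumption here, since several variants exist: this sketch uses the raw count of a term in a document multiplied by log(N / df), where N is the number of documents and df is the number of documents containing the term. For brevity each term is a single word, and the function name and sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """documents: a list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)      # term frequency within this document
        weights.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical sample collection.
docs = [["graph", "layout", "graph"], ["graph", "filter"], ["graph", "tree"]]
weights = tf_idf(docs)

# Candidate label for the first document: its highest-weighted term.
label = max(weights[0], key=weights[0].get)
```

Under this variant, a term that occurs in every document, such as "graph" in the sample, receives a weight of zero, which matches the idea that uniformly distributed text units are not salient.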
These references are herein incorporated by reference in their entirety.