1. Field of the Invention
The present invention is related to the field of displaying generalized graph structures. Specifically, the present invention is related to generating a tree structure representation for display of a generalized graph structure, and the present invention is related to displaying tree structure representations of a generalized graph structure. The present invention addresses the problem of laying out large directed graphs, such as World Wide Web sites, so that the important relationships are exposed.
2. Discussion of the Related Art
The World-Wide Web ("web") is perhaps the most important information access mechanism to be introduced to the general public in the 20th Century. As larger numbers of organizations rely on the Internet to distribute information to potential consumers and investors, they also realize its potential for distributing and organizing large volumes of data for later retrieval by employees and business partners. A company's web site is rapidly becoming one of its most important business investments.
As an information repository, a web site generally receives a high amount of usage. Web site usage patterns that are derived by monitoring how the company's employees use its web site enhance the company's understanding of its business activities. For example, monitoring what product literature the sales force is downloading may be a way to forecast sales. In short, traditional market analysis can be applied to this information resource.
Analysts are interested not just in how the web pages are used, but also in the context in which they are placed, such as the linkage structure and the web page content. A web site is a dynamic structure, because its topology (as evidenced by its linkage structure), the contents of its pages, and its usage change continually. Analysts want to be able to analyze the evolving web site.
Because of analysts' increasing desire to discover and understand users' access patterns and the relationships between web page contents, and to efficiently structure a web site's topology, a need exists for a set of visualization tools which aid in the process of web site analysis.
Displaying large and complex generalized graph structures is a non-trivial task. Conventional approaches of generating a tree structure representation of the generalized graph structure include depth first search and breadth first search which attempt to solve the problem by forming hierarchies based upon the topology of the generalized graph structure.
Web site administrators and designers have a need to understand the relationship between the web site's usage parameters and its link topology, and vice versa. Since web sites are dynamically changing over time, maintainers need to understand how changes to the topology affect usage. Although some conventional web site display methods encode usage information in the visualization, no conventional methods reference usage information in generating the structure to be displayed from the generalized graph structure. Moreover, no conventional system modifies the positioning of nodes within a displayed structure based upon the nodes' usage.
A conventional technique for understanding a complex generalized graph structure is to display a representation of the links and nodes which constitute the generalized graph structure. One view of the World Wide Web is that of a generalized graph structure, with web pages representing nodes and hyperlinks representing the links between the nodes. Because of the complexity of the generalized graph structure as evidenced by the large number of links between nodes, some links of the generalized graph structure are usually not presented in the representation so as to enable a viewer or user to effectively process the representation cognitively. An object of the present invention is to ensure that the method for generating the representation for display includes the more important links in the representation. Another object of the present invention is to ensure that the method for displaying the representation positions nodes or links according to their importance. This allows a viewer or user to understand the importance of a node or link based upon its position in the displayed representation. According to the present invention, the representation of the generalized graph structure used for display is a tree structure.
According to an aspect of the present invention, usage parameters are referenced in generating the tree structure from the generalized graph structure. This aspect is applicable to several types of usage parameters, and is applicable to several methods for generating the tree structure. For example, frequency, recency, spacing of accesses, and path information are each types of usage parameters that can be referenced according to the present invention.
An example of generating the tree structure from the generalized graph structure by referencing usage parameters is a breadth first traversal method of the graph that references usage parameters associated with each node or link. The usage parameters which are associated with each node are referenced in order to determine the visitation order in the graph traversal method. The visitation order is determined by visiting the most highly used nodes first, so that the child nodes are visited in order of decreasing usage parameter. Thus, popular web pages are favored over less popular ones. A child node will be claimed by a more popular web page rather than by its less popular sibling which also has a hyperlink to the child node. Alternatively, the visitation order is determined by visiting first the nodes having the most highly used link. As another example of generating the tree structure from the generalized graph structure, a depth first traversal of the graph references the usage parameters.
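The usage-ordered breadth first traversal described above can be illustrated with a minimal sketch. The function name usage_bfs_tree, the adjacency-dictionary representation of the graph, and the use of a single numeric usage value per node are assumptions made for illustration only; they are not taken from the specification, and other usage parameters or data structures could be substituted.

```python
def usage_bfs_tree(graph, usage, root):
    """Build a spanning tree by breadth first traversal, visiting the
    most-used nodes first. Returns a {child: parent} dictionary.

    graph: hypothetical dict mapping a page to the pages it links to
    usage: hypothetical dict mapping a page to a numeric usage value
    """
    def by_usage(nodes):
        # Highest usage first; missing usage values count as zero.
        return sorted(nodes, key=lambda n: usage.get(n, 0), reverse=True)

    parent = {root: None}
    frontier = [root]
    while frontier:
        next_frontier = []
        # Within a level, the more popular page claims shared children
        # before its less popular siblings get the chance.
        for node in by_usage(frontier):
            for child in by_usage(graph.get(node, ())):
                if child not in parent:
                    parent[child] = node
                    next_frontier.append(child)
        frontier = next_frontier
    return parent
```

In this sketch, if two sibling pages both link to the same child, the child is attached to whichever sibling has the higher usage value, whereas an ordinary breadth first traversal would attach it to whichever sibling happens to be listed first.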
According to another aspect of the present invention, the method of displaying the tree structure references the usage parameters to determine the positioning of the nodes in the layout of the tree structure. In a preferred embodiment of this aspect, the root node is positioned in the center of the layout.
In an example of the preferred embodiment of this aspect, sibling nodes are spread out on links which emanate radially about their parent. Thus, this example includes a squashed cone tree layout. In this squashed cone tree example, the highest-used sibling nodes can be placed farthest apart from each other so as to achieve optimal separation so that they have the most growth space. The lowest-used nodes are then placed in the remaining space between the high-usage nodes. This method continues to place nodes the farthest apart from each other based upon usage values, around their parent.
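One possible way to realize the farthest-apart placement of siblings is sketched below. The sketch assumes that the space around the parent is divided into equal angular slots, one per sibling, and that each sibling, taken in order of decreasing usage, is given the free slot farthest from all occupied slots. The slot scheme, the greedy rule, and the name spread_siblings are illustrative assumptions, not details stated in the specification.

```python
import math

def spread_siblings(usages):
    """Assign each sibling an angle (radians) around its parent.

    usages: hypothetical list of numeric usage values, one per sibling.
    Returns {sibling_index: angle}.
    """
    n = len(usages)
    order = sorted(range(n), key=lambda i: usages[i], reverse=True)
    taken = []   # slots already occupied, in order of assignment
    angles = {}

    def gap(slot):
        # Circular distance from this slot to the nearest occupied slot.
        if not taken:
            return n
        return min(min((slot - t) % n, (t - slot) % n) for t in taken)

    for sibling in order:
        free = [s for s in range(n) if s not in taken]
        best = max(free, key=gap)   # farthest from every placed sibling
        taken.append(best)
        angles[sibling] = 2 * math.pi * best / n
    return angles
```

With four siblings, the two most-used siblings end up on opposite sides of the parent under this rule, and the two least-used siblings fill the slots between them.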
In another example of the preferred embodiment of this aspect, sibling nodes are positioned at the same radius from the root node. Thus, this example includes a disk tree layout. In this disk tree example, each leaf node in the hierarchy may be assigned the same amount of angular space. The layout angle of each node may be a function of the ranking of the node's usage parameter relative to its siblings. Therefore, sibling nodes are laid out in order of monotonically related layout angle and usage parameter, so that the highest used nodes are positioned near each other and the lowest used nodes are positioned near each other.
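The disk tree example can be sketched as follows, under stated assumptions: the radius of a node is taken to be its depth in the tree, each leaf receives an equal share of the full circle, and siblings are laid out in order of decreasing usage so that layout angle increases as usage decreases. The name disk_tree_layout, the dictionary-based tree representation, and the choice of placing each node at the middle of its angular wedge are illustrative assumptions only.

```python
import math

def disk_tree_layout(children, usage, root):
    """Return {node: (radius, angle)} for a disk tree layout sketch.

    children: hypothetical dict mapping a node to its list of child nodes
    usage:    hypothetical dict mapping a node to a numeric usage value
    """
    def leaf_count(node):
        kids = children.get(node, [])
        return 1 if not kids else sum(leaf_count(k) for k in kids)

    total_leaves = leaf_count(root)
    positions = {}

    def place(node, depth, start):
        # Each node's wedge is proportional to the leaves beneath it,
        # so every leaf gets the same amount of angular space.
        width = 2 * math.pi * leaf_count(node) / total_leaves
        positions[node] = (depth, start + width / 2)
        # Siblings are ordered by decreasing usage, so angle and usage
        # rank are monotonically related.
        kids = sorted(children.get(node, []),
                      key=lambda k: usage.get(k, 0), reverse=True)
        for kid in kids:
            place(kid, depth + 1, start)
            start += 2 * math.pi * leaf_count(kid) / total_leaves

    place(root, 0, 0.0)
    return positions
```

The root receives radius 0 and therefore sits at the center of the layout; its angle is not meaningful. Within each group of siblings, high-usage nodes occupy adjacent wedges at one end of the parent's wedge and low-usage nodes occupy adjacent wedges at the other end.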
Moreover, derived usage parameters such as need probability, cocitation clustering, or functions of both node and link usages can alternatively be used according to the present invention.
These and other features and advantages of the present invention are apparent from the Figures as fully described in the Detailed Description of the Invention.