The Web today has become an enormous source of information, and users have access to a steadily increasing number of Web pages, generally linked in a non-intuitive manner. The Web is popularly referred to as “cyberspace.” However, the extent to which it constitutes a readily navigable space in the everyday sense of the word is questionable. Consequently, commonly reported problems in Web navigation include not knowing where you are, not knowing how to get back to previously visited information, and not knowing which sites have already been visited. The problem of users' disorientation in the Web, which emerges from the high complexity of the Web environment, is often referred to as the “lost in cyberspace” problem.
Various approaches have been proposed to categorize Web data and present it sensibly and efficiently to users. For example, the structure of the Web can be modeled as a graph wherein the nodes are HTML pages and a hyperlink from one page to another is represented as a directed edge. An alternative is a tree hierarchy. An advantage of trees is that they have much simpler structures than graphs, which makes them easier to display in an aesthetically pleasing manner.
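The graph and tree models described above can be illustrated with a minimal sketch. The page names, the function names build_link_graph and bfs_tree, and the choice of a breadth-first traversal to reduce the graph to a tree are all assumptions made for illustration; none of the systems discussed here is stated to work this way.

```python
from collections import deque


def build_link_graph(links):
    """Build adjacency lists for a directed graph from (source, target) hyperlink pairs."""
    graph = {}
    for source, target in links:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])  # make sure pages with no outgoing links are nodes too
    return graph


def bfs_tree(graph, root):
    """Reduce the graph to a tree by keeping only the first-discovered link to each page.

    Returns a dict mapping each reachable page to its parent (None for the root).
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for linked in graph.get(page, []):
            if linked not in parent:  # skip links to pages already placed in the tree
                parent[linked] = page
                queue.append(linked)
    return parent


# Hypothetical four-link site: two links are dropped when the graph becomes a tree.
links = [
    ("index.html", "about.html"),
    ("index.html", "products.html"),
    ("products.html", "about.html"),
    ("about.html", "index.html"),
]
graph = build_link_graph(links)
tree = bfs_tree(graph, "index.html")
```

In this sketch the graph keeps every hyperlink, including the cycle back to index.html, while the tree keeps exactly one incoming link per page, which is why a tree is simpler to draw.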
A map or visualization of a Web site or other information repository reduces the user's cognitive load when trying to navigate a virtual space. That is, it reduces the burden on long-term and working memory by summarizing the information about the structure and organization that would otherwise have to be remembered. Therefore, extensive effort has been put into developing methods to visually represent Web data. Pad++, Hy+, Navigational View Builder, HyperSpace, Natto, Ptolomaeus, MAPA, Disk Trees, Dome Trees, VISVIP, BrowsingGraph/BrowsingIcons, XML3D, HotSauce, MemoSpace, Grokker, and WebTracer are some of the methods specifically designed to represent Web data in order to improve navigation through the Web, reduce disorientation problems within the Web, and increase the ease and speed of exploring and retrieving pages of interest. Other methods such as Space Tree, Treemaps and Hyperbolic Tree, which were initially designed to visualize hierarchical data, have also been adapted to map Web data. All of the foregoing are described in detail by their authors in documents submitted with an accompanying information disclosure statement.
However, very few of these methods have been adopted and are currently being used as viable solutions to the “lost in cyberspace” problem. Reasons may include requiring a large amount of resources from the host computer (Natto, MemoSpace, HyperSpace, HotSauce, MAPA, Navigational View Builder, WebTracer), unaesthetic drawings (Ptolomaeus, Disk Trees, Dome Trees, BrowsingGraph/BrowsingIcons), inefficient use of screen space (Pad++, Space Tree, Hyperbolic Tree and XML3D), and being counterintuitive to how humans perceive relational information (Treemaps).
The following sets forth in more detail the deficiencies of some of the foregoing and other Web mapping applications.
Pad++ lacks the ability to show which Web pages have already been visited and Web pages that will arise in the future. In addition, Pad++ does not make efficient usage of the screen space.
Hy+ does not make efficient usage of the screen space. Another drawback is that when a user clicks the “Back” or “Forward” button in the Web browser, the edge in the visualization representing this action is omitted. Omitting this edge leaves the “where have I been?” question unanswered.
Navigational View Builder uses a database-oriented hypermedia system, which over time becomes out-of-date. Also, it does not make efficient use of space.
HyperSpace uses an adapted browser and separate program to extract links from visited pages. Other drawbacks of HyperSpace are that the links and sphere nodes are heavily occluded, browsing history is not tracked, and the system is not synchronized with a Web browser.
Natto limits the number of nodes that may comfortably occupy the flat plane (occlusion issue), and the range of pages is fixed.
Ptolomaeus shows only the Web pages that appear in the visualization after the Web crawler completes the Web page retrieval process. Another drawback of Ptolomaeus is its inefficient use of space.
MAPA uses labels and cards to represent the WWW, and the information quickly becomes occluded. Also, MAPA is not dually synchronized with a Web browser. In addition, all the mapped information is stored in a database and not captured in real time.
Disk Trees uses many overlaying linking edges that occlude information. Another drawback of Disk Trees is that it is a bottom-up algorithm. That is, the whole tree needs to be processed before displaying it to the user.
Dome Trees is similar to Disk Trees in that it is a bottom-up algorithm.
VISVIP makes poor use of space and it has no clear way of labeling the boxes.
BrowsingGraph/BrowsingIcons uses a Web browser that is not completely integrated within the system. The algorithm used to draw the graph, which represents how the Web pages are related, is not space-efficient. That is, a considerable amount of white space in the drawing area goes unused.
XML3D contains node/label occlusion and the distant features within the three-dimensional space are distorted. Furthermore, it contains long connecting edges between nodes. Long connecting edges in a graph are more difficult to follow than shorter edges.
Among the drawbacks of HotSauce are that pages are difficult to find and that a user, once immersed in the space and surrounded by blocks, easily becomes disoriented. Another drawback of HotSauce is the frequent occlusion of labels.
MemoSpace does not make efficient usage of the screen space and labels denoting a Web page's address are large in size and occlusive.
Grokker, developed by Groxis Inc., is a Web-based tool used to visualize Web data. Grokker allows a user to enter federated searches and organizes the results in two ways: outline view and map view. The map view uses a radial layout algorithm. Unlike the present invention, Grokker organizes Web data based on content relationships. The present invention creates a hierarchy of Web pages based on their location in the WWW. Another difference between Grokker and the present invention is that Grokker visualizes a broad range of pages stemming from the user's query. In contrast, the present method visualizes a particular area in the WWW starting from a user-specified Web page.
WebTracer uses a system in which Web crawling and visualization are separate and not integrated, synchronized processes. WebTracer possesses other drawbacks: (1) the user can click on an atom (Web page) and the Web page appears in the computer's default Web browser; (2) it does not make efficient usage of the screen space; and (3) it contains many edge intersections, which make it harder to understand the Web pages' relationships. Indeed, the Web visualization produced by WebTracer appears as a disorienting three-dimensional “starburst”. While a user can manipulate the image to view it from any desired direction, the image itself is static and unanimated. A comparison between the present solution and WebTracer was performed by the inventors. The instant solution and WebTracer were used on the same computer, using the same Internet speed, and starting from the same Web page. The advantageous results of the present solution compared to WebTracer were as follows:
Computer memory (RAM): 27% more efficient.
Computer processing (CPU): 50% more efficient.
Web crawling speed: 63% faster.