A web site is a collection of web pages that are interconnected with one another by hyperlinks. The web pages of a web site are accessible over the Internet using a web browsing computer program on a client device communicating with a server device that hosts the web site. A hyperlink is a link from one web page to another web page, which may be on the same web site or on another web site. Clicking on a hyperlink within the web browsing computer program on the client device causes the client device to acquire, or load, and display the web page to which the hyperlink links, or refers.
Web sites can contain tens, hundreds, thousands, or more different web pages. As web sites have become more complex, the designers of the web sites usually want to view structural diagrams of the web sites in order to see the underlying relationships among the web pages of the web sites. A structural diagram of a web site thus shows the relationships among the web pages of the web site, and is useful for understanding the structure of the web site.
One way to create a structural diagram of a web site is to search the web pages of the web site for all of the hyperlinks contained within the web pages that refer to other web pages on the same web site. The hyperlinks of the web site are referred to as the physical links of the web site. A structural diagram of a web site that is built on the basis of the hyperlinks that link the web pages of the web site together can capture the physical structure of the web site. However, many times the resulting structural diagram will reflect weak interrelations among the web pages of a web site that can obscure the actual semantic structure of the web site.
For example, FIG. 1 shows an example structural diagram 100 of a web site that can be created in accordance with the prior art by searching the web pages of the web site for all the hyperlinks contained within the web pages that refer to other web pages of the same web site. Each of the nodes 102, 104, 106, 108, 110, 112, and 114 of the diagram 100 includes a name of a web page and the file name of the web page. For example, the node 102 has the name “home” and the file name “index.html.” The node 102 represents a web page that links to the web pages represented by the nodes 104 and 106. The node 104 represents a web page that links to the web pages represented by the nodes 108, 110, 112, and 114.
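The hyperlink-based approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the prior art implementation itself: the function name `physical_links`, the `LinkCollector` class, the host `example.com`, the directory paths, and the in-memory HTML strings are all hypothetical stand-ins for the web pages of FIG. 1. Only the Python standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def physical_links(pages, base_url):
    """Map each page path to the sorted same-site page paths it links to.

    `pages` maps a path such as "/home/index.html" to that page's HTML.
    Links to other sites, to unknown pages, and to the page itself are skipped.
    """
    site = urlparse(base_url).netloc
    graph = {}
    for path, html in pages.items():
        parser = LinkCollector()
        parser.feed(html)
        targets = set()
        for href in parser.hrefs:
            # Resolve relative hrefs against the URL of the linking page.
            target = urlparse(urljoin(base_url + path, href))
            if target.netloc == site and target.path in pages and target.path != path:
                targets.add(target.path)
        graph[path] = sorted(targets)
    return graph


# Hypothetical page contents mirroring the links drawn in FIG. 1.
PAGES = {
    "/home/index.html": (
        '<a href="../software/index.html">software</a>'
        '<a href="../hardware/index.html">hardware</a>'
    ),
    "/software/index.html": (
        '<a href="software01.html">software01</a>'
        '<a href="software02.html">software02</a>'
        '<a href="../hardware/hardware01.html">hardware01</a>'
        '<a href="../hardware/hardware02.html">hardware02</a>'
    ),
    "/hardware/index.html": "",
    "/software/software01.html": "",
    "/software/software02.html": "",
    "/hardware/hardware01.html": "",
    "/hardware/hardware02.html": "",
}

graph = physical_links(PAGES, "http://example.com")
```

In this sketch every same-site hyperlink is treated identically, so the entry for the software page lists the two hardware-related pages alongside the two software-related pages, just as the node 104 points to the nodes 108, 110, 112, and 114 in the diagram 100.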
The disadvantage to using only hyperlinks in creating the structural diagram 100 of the web site is evident from the inclusion of the nodes 112 and 114 as being pointed to by the node 104. The node 104 has the name “software,” where the web page represented by the node 104 links to two other software-related web pages represented by the nodes 108 and 110 having the names “software01” and “software02,” respectively. However, the web page represented by the node 104 also links to two hardware-related web pages represented by the nodes 112 and 114 having the names “hardware01” and “hardware02,” respectively.
These hardware-related web pages may be linked from the software-related web page represented by the node 104 due to a navigation bar or other collection of hyperlinks present on the software-related web page represented by the node 104. By comparison, the software-related web pages represented by the nodes 108 and 110 may be linked within the primary portion of the web page represented by the node 104, and not only linked within the navigation bar. As such, the hardware-related web pages represented by the nodes 112 and 114 have a weak semantic relationship to the software-related web page represented by the node 104. Inclusion of the nodes 112 and 114 within the structural diagram 100, as being pointed to by the node 104, obscures the actual semantic structure of the web site.
Another prior art approach to creating the structural diagram of a web site is to use the directory structure of the directories within which the web pages of the web site are physically stored on a server device. The directory structure of the directories within which the web pages of a web site are physically stored yields links among the web pages that are referred to herein as the semantic links of the web site. The physical links among the web pages of a web site are represented by the hyperlinks of the web pages, since these hyperlinks physically point to the web pages of the web site, whereas the semantic links among the web pages of a web site are represented by the directory structure of the directories within which the web pages are physically stored. This is because it is presumed that there is an underlying organization to the directory structure, in that the designer of the web site has purposefully placed given web pages in given directories. However, these links are semantic, and not physical, because there may not be actual physical hyperlinks among the web pages within given directories. Rather, the links are semantic because they represent an intended underlying organization to the web pages of the web site due to their being stored in different directories.
For example, FIG. 2 shows an example directory structure 200 of directories 202, 204, 206, and 208 within which the physical files of the web pages of the web site having the structural diagram 100 of FIG. 1 are physically stored on a server device. Directories may also be referred to as folders herein. The root directory 202 includes a hardware directory 204, a home directory 206, and a software directory 208. The hardware directory 204 includes the files 210 that represent web pages; the home directory 206 includes the file 212 that represents a web page; and the software directory 208 includes the files 214 that represent web pages.
FIG. 3 shows another example structural diagram 300 that can be created in accordance with the prior art based on the directory structure 200 of FIG. 2. The user may have initially indicated that files having file names of “index.html” represent the base, root, parent, or primary web page as compared to the other web pages represented by files stored in a given directory. Thus, the node 104 represents the software-related web page with the file name “index.html,” such that the software-related web pages represented by files stored in the same directory 208, and having the file names “software01.html” and “software02.html,” have corresponding nodes 108 and 110 to which the node 104 links within the diagram 300. Similarly, the node 106 represents the hardware-related web page with the file name “index.html.” The hardware-related web pages represented by files stored in the same directory 204, and having the file names “hardware01.html” and “hardware02.html,” have corresponding nodes 112 and 114 to which the node 106 links within the diagram 300.
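The directory-based approach can likewise be sketched in a few lines. The following is an illustrative assumption about how such a tool might work, not a description of any particular prior art product: the function name `semantic_links` and the relative directory paths are hypothetical, and the file names are those given above for FIG. 2. Within each directory, the file whose name matches the user-indicated index name is treated as the parent of every other file in that directory.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def semantic_links(paths, index_name="index.html"):
    """Map each file to the files it is treated as the parent of.

    Files are grouped by directory. If a directory contains a file named
    `index_name`, that file points to all other files in the directory.
    Every other file maps to an empty list.
    """
    by_dir = defaultdict(list)
    for p in paths:
        file_path = PurePosixPath(p)
        by_dir[file_path.parent].append(file_path)

    graph = {}
    for directory, files in by_dir.items():
        for f in files:
            graph.setdefault(str(f), [])
        index = directory / index_name
        if index in files:
            graph[str(index)] = sorted(str(f) for f in files if f != index)
    return graph


# Hypothetical relative paths for the files of FIG. 2.
FILES = [
    "home/index.html",
    "software/index.html",
    "software/software01.html",
    "software/software02.html",
    "hardware/index.html",
    "hardware/hardware01.html",
    "hardware/hardware02.html",
]

diagram = semantic_links(FILES)
```

Because the sketch only ever relates files that share a directory, a page stored alone in its directory, such as the page in the home directory 206, receives no links at all.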
The disadvantage to using only the directory structure of the directories within which the files of the web pages of a web site are physically stored to create the structural diagram 300 is evident from the node 102, representing the web page having the name “home” and having the file name “index.html” being isolated within the diagram 300. Because the home directory 206 of FIG. 2 is within the same hierarchical level as the hardware folder 204 and the software folder 208, using only the directory structure to create the diagram 300 does not yield the node 102 pointing to the nodes 104 and 106, as does the diagram 100 of FIG. 1. Thus, whereas this prior art approach to web site structural diagram creation properly has the node 104 pointing to the nodes 108 and 110, and the node 106 pointing to the nodes 112 and 114, it does not have the node 102 pointing to the nodes 104 and 106. Therefore, the designer of the web site still does not have an accurate portrayal of the structure of the web site when viewing the diagram 300.
Furthermore, the prior art approach that utilizes the directory structure of the directories within which the files of the web pages of a web site are physically stored to create a structural diagram for the web site may have other limitations that impede the creation of an accurate structural diagram. For example, in FIG. 2, the file within the hardware directory 204 that has the file name “index.html” may have instead been named “hardware.html”. In such an instance, if the prior art approach to creating the structural diagram is looking for a file having a file name “index.html” to use as the base, root, parent, or primary web page within the hardware directory 204, the resulting structural diagram will not be created correctly. That is, rather than the node 106 pointing to the nodes 112 and 114, as in the diagram 300 of FIG. 3, the nodes 106, 112, and 114 may be sibling nodes to one another, such that the node 106 does not point to the nodes 112 and 114.
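The file-naming limitation just described can be illustrated with a small sketch. The helper `directory_parent` is hypothetical and assumes, for illustration only, that the parent page of a directory is identified purely by matching a fixed file name; the directory paths are likewise assumed.

```python
from pathlib import PurePosixPath


def directory_parent(files, index_name="index.html"):
    """Return (parent, children) for the files of one directory.

    If no file is named `index_name`, no parent can be identified, so the
    result is (None, all file names), meaning every file is a sibling.
    """
    names = sorted(PurePosixPath(f).name for f in files)
    if index_name in names:
        return index_name, [n for n in names if n != index_name]
    return None, names


# Hardware directory as described for FIG. 2: a parent page is found.
as_drawn = directory_parent([
    "hardware/index.html",
    "hardware/hardware01.html",
    "hardware/hardware02.html",
])

# Same directory with the primary page renamed: no parent is found.
renamed = directory_parent([
    "hardware/hardware.html",
    "hardware/hardware01.html",
    "hardware/hardware02.html",
])
```

In the first call the file named “index.html” is returned as the parent of the other two files. In the second call no parent is returned, and all three files are reported at the same level, which corresponds to the nodes 106, 112, and 114 becoming sibling nodes.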
There are thus disadvantages to using only the physical links among the web pages of a web site to create a structural diagram for the web site, where the physical links are the hyperlinks among the web pages of the web site. There are also disadvantages to using only the semantic links among the web pages of a web site to create a structural diagram for the web site, where the semantic links can be represented by the directory structure of the directories within which the web pages of the web site are physically stored. For these and other reasons, then, there is a need for the present invention.