The Internet is a computer communications network built on worldwide data and telephone networks. Computers connected to the Internet can exchange information with any other connected computer. FIG. 1 is a simplified illustration of the "Internet Backbone." The backbone is founded on various sets of major telephone conduits and switches that exist across the world. For example, the triangle in the center of FIG. 1 may represent the three major telephone conduits that exist between Washington, Los Angeles, and New York. These communications conduits are designed to move large volumes of data traffic at high speed.
Each of the major conduits terminates at a router. The routers are large, fast switches that sort the large volumes of data for local routing, much as large post offices sort mail for local delivery. Each router is connected to additional, local routing devices. Some of the local routing devices, called "points of presence" ("POPs"), provide local Internet access. For example, an Internet termination router that exists in Washington may have point of presence routers connected in Baltimore, Alexandria, etc. A router can connect as many point of presence routers as the capacity of the switching systems and the Internet permit.
In addition to point of presence routers, commercial Internet exchanges (CIX in FIG. 1) and global Internet exchanges (GIX in FIG. 1) also connect to the routers. These exchanges transfer data between Internet service providers, both nationally and internationally. When data originates on one U.S. Internet service provider with a destination on another U.S. long distance provider, the data first routes to the commercial Internet exchange where it makes the transfer between providers. A similar situation occurs when data originates in one country bound for another country. The data first passes through the global Internet exchange where it is transferred from one provider to another. Additional local point of presence routers could connect off of the point of presence routers shown in FIG. 1. However, point of presence routers (POP1, POP2, POP3, etc.) usually provide a direct local connection point for various types of computers to connect to the Internet.
Personal computers of individual residential users typically connect to a local point of presence router through a local Internet carrier. As shown at POP2 in FIG. 1, a Local InterNet Carrier (LINC) obtains a direct line to the POP2, and then provides a modem or other connection via which a home computer user dials up for connection to the Internet. When the home computer connects to the modem of the local Internet carrier, the LINC switches the home computer to the POP2, which in turn connects it to the Internet.
Another method of connecting computers to the Internet is by direct connection through a local area network (LAN) to the point of presence. This example is shown as LAN#1 and LAN#2 connections to, respectively, POP1 and POP2. Specifically, LAN#1 connects to the point of presence through a leased data line. The computers (PCs in FIG. 1) connected to the LAN receive data from and transmit data to the point of presence through the LAN.
Also attached to the LAN are a variety of different servers, three of which are shown in FIG. 1. The File Server connects to the LAN and contains the common data files used by the PCs, LAN, and other Servers. An HTTP server is a particular type of server that processes incoming and outgoing data written according to a certain Internet communication protocol, called HyperText Transfer Protocol (HTTP). An electronic mail server processes e-mail data written to or received from the Internet.
As shown in FIG. 1, the Internet interconnects every computer on the Internet with every other computer on the Internet. An Internet site, such as LAN#1 for example, typically includes certain data files (called "web pages," which are a part of the World Wide Web or simply the "Web") in its file server. The Internet site HTTP server makes those pages available to other computers on the Internet. An HTTP Server that makes web pages available on the Internet usually includes a so-called "home page," which is the starting point for outside users to navigate through the underlying Web pages serviced by the HTTP Server. These Web pages are written in a special web language called HyperText Markup Language (HTML). When a user, such as the user of the "Home PC" (emanating from POP2 in FIG. 1), wants to view an Internet site's home page such as LAN#1's home page, the user can do so by requesting that data from LAN#1 over the Internet. In response, LAN#1 retrieves the web page data from its File Server and instructs its HTTP Server to transmit the data, addressed to the Home PC, onto the Internet. The data travels over the local leased link to POP1, through the Internet via necessary routers, through POP2, through the local Internet carrier, and into the modem of the Home PC. The request for the data from the Home PC to LAN#1, of course, travels along the opposite path.
To ensure that data is sent to and received by the appropriate receiver on the Internet, every "device" (e.g., workstation, PC, HTTP Server, File Server, etc.) communicating on the Internet is assigned a unique address called an Internet Protocol (IP) address. Elements of the IP address identify where in the network a device is connected. Other parts of the IP address identify the specific device. The IP address can be analogized to a phone number that can be called by another phone number to make a connection through a series of telephone switches. The phone number has an element (three digits) that identifies the state of the resident (i.e., the area code), and an additional seven digits, three of which identify the local exchange of the resident, and four of which specifically identify the home of the resident. The IP address is presently a thirty-two bit binary address, readily processed by computers, but cumbersome for use by human users. As a result, IP addresses are assigned mnemonics to make them more "user friendly." The mnemonic consists of two parts: a host name and a domain name. It is this mnemonic representation of the IP address which is commonly used by Internet users to access Web sites. Conventionally within the World Wide Web, the mnemonic "WWW" is used to represent the host name. The remaining portion of the mnemonic represents the domain or network where the host resides. For example, WWW.UCLA.EDU identifies a host named "WWW" in the domain (network) "UCLA.EDU."
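The two address forms described above can be illustrated with a minimal Python sketch. It is offered only as an illustration under stated assumptions: the function names ip_to_bits and split_mnemonic are hypothetical, and the sample address 192.0.2.1 is a documentation address chosen for the example, not an address given in this description.

```python
import ipaddress
from typing import Tuple


def ip_to_bits(dotted: str) -> str:
    """Return the thirty-two bit binary form of a dotted-decimal IPv4 address."""
    return format(int(ipaddress.IPv4Address(dotted)), "032b")


def split_mnemonic(name: str) -> Tuple[str, str]:
    """Split a mnemonic such as WWW.UCLA.EDU into (host name, domain name).

    Assumption: the host name is everything before the first dot.
    """
    host, _, domain = name.partition(".")
    return host, domain


if __name__ == "__main__":
    print(ip_to_bits("192.0.2.1"))         # the form processed by computers
    print(split_mnemonic("WWW.UCLA.EDU"))  # the form used by human users
```

The binary string shows why the raw address is cumbersome for a human user, while the mnemonic separates cleanly into the host and the domain in which the host resides.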
FIG. 2 shows an address line written in the standard protocol used by Internet components to address each other and usually is used in the context of addressing a specific web page. The protocol is referred to as a "Uniform Resource Locator" (URL) and this terminology appears as the opening argument in the address of FIG. 2. In FIG. 2, the Uniform Resource Locator indicates that the request is for "http" formatted data (i.e., a web page as opposed to, for example, an e-mail message). The home page for the data resides on the "WWW" HTTP server on the "ucla.edu" LAN (or domain). The name of the file (to be found most likely in the file server supported by the ucla.edu LAN) is "homepage.html." If the ucla.edu LAN is LAN#1 of FIG. 1 and a PC user at LAN#2 wants to view the "homepage.html" file, the user sends the address shown in FIG. 2 to LAN#1 through the Internet channels shown in FIG. 1. Upon receipt of the address, LAN#1 returns to the user the "homepage.html" file over a reverse path through the Internet.
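The parts of such an address can be separated mechanically, as the following Python sketch suggests. FIG. 2 is not reproduced in this text, so the address string used here is an assumption assembled from the components named above ("http", "WWW", "ucla.edu", "homepage.html"); the function name describe_url is likewise hypothetical.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Break a URL into the parts discussed in the text."""
    parts = urlsplit(url)
    # Assumption: the host name is the first label, the rest is the domain.
    host, _, domain = parts.hostname.partition(".")
    return {
        "scheme": parts.scheme,          # kind of data requested, e.g. http
        "host": host,                    # the HTTP server, e.g. www
        "domain": domain,                # the LAN or domain, e.g. ucla.edu
        "file": parts.path.lstrip("/"),  # the file to be returned
    }


if __name__ == "__main__":
    # Assumed form of the FIG. 2 address.
    print(describe_url("http://www.ucla.edu/homepage.html"))
```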
Once a user has received an "HTML" formatted file corresponding to a web page, the text of the displayed file may prompt the user to request additional information contained in different web page files. The prompts are referred to as "hypertext" and usually show up on a home page (or other web page) in a different color than normal text, thus distinguishing them as hypertext links. Hyperlinks may include any kind of hypertext or other hypermedia link from one HTML page to another HTML page in the current web site or in some external web site. Hypertext Markup Language is the computer language used to "compose" and represent information on a web page. As an example, a user requesting a local zoo home page may use several different hypertext links to files containing information on various animals at the zoo, a map of the zoo, operating times, etc. By clicking a mouse pointer on the hypertext, the user is automatically "transported" from a current web page to a new web page linked to that hypertext.
When the user clicks on a hypertext link, the user's data processor records the position of the computer pointer when the click occurred. The processor then uses a look-up table of x-y coordinates versus URLs to identify a new URL address assigned to the position of the computer pointer. The URL address may be serviced by the same domain or a different one, depending on the information contained in the look-up table. When the hypertext is selected, the browser requests a connection to the HTTP server hosting the file, and it also requests from the HTTP server the file identified by the URL. Once the HTTP server accepts the connection requested by the browser, the HTTP server proceeds to transmit back to the browser the requested file. Once the browser receives the requested file, it delivers or presents the content of the file to the requesting user.
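The look-up of a URL from the pointer position may be sketched as follows. This is only one possible arrangement, assuming that each hypertext link occupies a rectangular screen region; the Region type, the url_at function, the coordinates, and the zoo.example.org addresses are all hypothetical and serve only to illustrate the look-up table of x-y coordinates versus URLs.

```python
from typing import NamedTuple, Optional, Sequence


class Region(NamedTuple):
    """One row of the look-up table: a screen rectangle and its URL."""
    left: int
    top: int
    right: int
    bottom: int
    url: str


def url_at(table: Sequence[Region], x: int, y: int) -> Optional[str]:
    """Return the URL assigned to pointer position (x, y), or None."""
    for region in table:
        if region.left <= x < region.right and region.top <= y < region.bottom:
            return region.url
    return None  # the click did not land on any hypertext


# Hypothetical table for a zoo home page with two hypertext links.
ZOO_LINKS = [
    Region(10, 100, 200, 120, "http://zoo.example.org/animals.html"),
    Region(10, 130, 200, 150, "http://zoo.example.org/map.html"),
]

if __name__ == "__main__":
    print(url_at(ZOO_LINKS, 50, 110))
    print(url_at(ZOO_LINKS, 300, 300))
```

Once a URL has been identified in this manner, the browser would request the corresponding file from the HTTP server as described above.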
The Internet provides a vast wealth of information. But the challenge is how to find a specific item of information hidden in that vast wealth. Anyone who has "surfed" the Internet knows that informational treasures can be found following some unusual routes leading to the discovery of the information. One of the most popular forms of surfing the Internet is the World Wide Web. In a sense, the Web is a client/server application that helps the user access various HTML pages available at various Internet sites. Its function is to display documents and to make links between items of information available. The user then chooses which links to follow as the user pursues a course through various Web pages. An Internet web site or simply web site refers to an entity connected to the Internet which supports Web communications and/or web files. A typical web site will include an HTTP server and one or more HTML pages (sometimes referred to as web pages).
A Web browsing session is similar in some respects to rummaging around in a flea market or a badly organized library. There is no doubt that you will discover much more than you realized, but there is some doubt whether you will find what was originally sought. On the one hand then, Web browsing is an enjoyable activity; on the other hand, Web browsing can be frustrating because it is difficult to easily target and go directly to a particular informational resource.
Hypertext links in a document allow a reader to jump from one object to another object within the document and to objects outside of the document. As a result, reading becomes a series of jumps to non-sequential points in the text rather than line by line of text. Hyperlinks between documents create an informational space with no formal pathways. A user browses starting from one HTML web page and simply explores from there. Consequently, no two paths through the web are likely to be the same. But the ability to know what informational resources are available and go directly to the specific information needed is lacking.
In fact, when browsing the Web, it is easy to become lost in the maze of hyperlinks. A hyperlink jump may take the reader up or down any number of levels or just as easily to another web server anywhere in the world. When entering a new web page, the user finds himself at a location chosen by the author of the previous web page. For example, consider a user viewing the "WOMBAT" home page. The user points to an article about wombats on an on-line magazine. After reading the article, the user then returns to the original WOMBAT home page without realizing that the magazine also included an article on wallabies, a subject in which the user also happens to be interested. Alternatively, the link may be to the magazine's home page rather than directly to the Wombat article. The user must then sift through a series of listings until he finds the issue having the Wombat article.
FIG. 3 shows a very simple example of how a web site is configured including a home page and a plurality of Hypertext Markup Language (HTML) pages each of which may contain one or more hyperlinks. FIG. 3 shows only a few hyperlinks between various ones of the HTML pages. As a user clicks on one hyperlink in the home page, he is transported to one of the three HTML pages in the first branch or tier. Page 1 may have for example a hyperlink which returns the user to the home page or a hyperlink which forwards the user to Page 4 or 5. Page 4 has a return link to the home page while Page 5 has a return link to Page 2. Even with this extremely simple example showing only one or two hyperlinks per page for just a few pages, it is apparent that a user can quickly get lost in the maze of HTML pages accessible through the HTTP server of that web site. This maze is complicated and enlarged when a hyperlink takes the user to another HTTP server at another web site which could theoretically be anywhere else in the world.
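The hyperlink structure just described can be written down as a small table of links, as in the Python sketch below. The table is reconstructed from the text alone; links leaving Page 2 and Page 3 are not described and are left empty as an assumption. The function name clicks_from is hypothetical. The sketch counts the fewest clicks needed to reach each page from the home page, information a user browsing page by page does not have in advance.

```python
from collections import deque

# Links of the example site as described in the text (not taken from
# the drawing itself). Pages 2 and 3 are assumed to have no links.
LINKS = {
    "Home": ["Page 1", "Page 2", "Page 3"],
    "Page 1": ["Home", "Page 4", "Page 5"],
    "Page 2": [],
    "Page 3": [],
    "Page 4": ["Home"],
    "Page 5": ["Page 2"],
}


def clicks_from(start: str, links: dict) -> dict:
    """Breadth-first search: fewest hyperlink clicks from start to each page."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in distance:
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance


if __name__ == "__main__":
    for page, clicks in clicks_from("Home", LINKS).items():
        print(page, clicks)
```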
What is needed is a quick way to identify the various types of information available at an Internet web site having web pages without actually having to go to that site or browse through its HTML pages. It would also be desirable to traverse through various web pages at one or more sites without having to tediously "click" multiple times the browser "BACK" and/or "FORWARD" buttons. Instead, it would be very helpful to have a graphical representation or map of one (or more) Internet web sites that reveals the structure and content of the Internet web sites so the user knows what information is provided on each page and what web pages are linked to other web pages.
It is an object of the present invention to provide a web user with a 2-dimensional, 3-dimensional, "virtual reality," or other graphical representation of the structure of one or more web sites to permit a user to navigate through the web site efficiently using the structure.
It is a further object of the present invention to permit a user to represent objects in a web site and links to/from those objects in a hierarchical tree structure for display.
It is a further object of the present invention to provide the user with a map which provides the user with an easy to read, graphical image of where the user is in a web site and where he may go, e.g., at which HTML page the user is currently located and the links to other HTML pages supported by the HTTP server at that site.
It is a further object of the present invention to provide a Web user a map which through graphical symbols and/or text provides information about the content, size, estimated time to download the page, date last updated, whether the page has changed since the user last visited the site, and other information which would make it easier for the user to grasp the nature of the Web site and thus make decisions as to where to go on the Web site.
It is a further object of the present invention to provide a web site mapping mechanism which parses the various objects contained in a web site and organizes those objects and links between those objects into an organized fashion (e.g., a hierarchical structure).
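By way of illustration only, the parsing of objects from a single HTML page might be sketched in Python as follows. The sketch uses the standard html.parser module and is not presented as the parsing mechanism of the invention; the class name ObjectCollector, the function collect_objects, and the sample page are hypothetical, and only hyperlinks and in-line pictures are collected.

```python
from html.parser import HTMLParser


class ObjectCollector(HTMLParser):
    """Collect hyperlinks and in-line pictures found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []   # targets of <a href="...">
        self.images = []  # sources of <img src="...">

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "img" and attributes.get("src"):
            self.images.append(attributes["src"])


def collect_objects(html_text: str) -> dict:
    """Return the links and images of a page, in document order."""
    collector = ObjectCollector()
    collector.feed(html_text)
    collector.close()
    return {"links": collector.links, "images": collector.images}


# Hypothetical page used only for this example.
SAMPLE_PAGE = (
    '<html><body><img src="logo.gif">'
    '<a href="animals.html">Animals</a>'
    '<a href="map.html">Map</a></body></html>'
)

if __name__ == "__main__":
    print(collect_objects(SAMPLE_PAGE))
```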
The present invention solves these problems and meets these and other objects with an Internet navigational mapping system. In essence, the navigational map gives a user in a condensed graphic image the structure and clues to the content of one or more web sites that allows the user to navigate through the web site using that structure. At a basic level, the map represents how various objects at a web site are linked to other objects. Any object found at a Web site such as web pages, graphics, audio clips, animation, etc. may be mapped.
Therefore, the Web navigational map in accordance with the present invention is much like a road map. Instead of having to drive from Washington, D.C. to New York over various interstates, highways, and secondary roads following only signs along those roadways which indicate that you are heading towards New York, the map allows the user to view the map (without even getting in his car) to determine beforehand the most efficient and direct road route to New York as well as to check out various other options in which the user may be interested. For example, in viewing the map the driver may decide that it is important first to stop off in Philadelphia before travelling on to New York. If the driver had simply followed the road signs on the interstates, there may not have been signs on the roads actually being driven by this driver pointing to Philadelphia. Accordingly, the present invention allows the user to visualize paths through one or more web sites to various destinations without having to actually follow/explore those paths to know that they exist and where they lead.
The Web navigational mapping system has two central components: a map maker and a map viewer. The map maker generates a navigational map of objects and links present at a web site. The map maker arranges the various hyperlinks between objects in the web site in an easy to read, hierarchical fashion, and in the preferred example embodiment, this hierarchy resembles a tree. Each branch of the tree provides textual and graphical information which describes the content of the web object the branch represents. The text describes the substantive content of the object. One or more icons relate to various characteristics of the object including the type of object (e.g., an HTML file, an in-line picture, an external link), the size of the file, etc. Both the text and icons help the user know what information is available at the web site and visualize where the user is and where he may go in the site.
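One way a map maker could arrange hyperlinks into a tree is sketched below in Python. The sketch is an assumption-laden illustration, not the map maker of the preferred embodiment: it places each object under the first page found to link to it in a breadth-first traversal, and drops links back to objects already placed. The function name build_tree and the zoo site pages are hypothetical.

```python
from collections import deque

# Hypothetical link table for a small zoo web site.
SITE_LINKS = {
    "home.html": ["animals.html", "map.html"],
    "animals.html": ["wombat.html", "home.html"],
    "wombat.html": ["map.html"],
}


def build_tree(links: dict, root: str) -> dict:
    """Arrange a link table into a tree of object -> child objects.

    Assumption: each object appears once, under the first page that
    reaches it; return links to objects already placed are omitted.
    """
    tree = {root: []}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in tree:
                tree[target] = []
                tree[page].append(target)
                queue.append(target)
    return tree


if __name__ == "__main__":
    print(build_tree(SITE_LINKS, "home.html"))
```

Placing each object only once keeps the result a tree even though the underlying hyperlinks form cycles, such as the return link from animals.html to home.html in the example.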
The web site navigational map is stored in a map database and can be loaded locally or downloaded by a network browser such as NETSCAPE™. The map viewer retrieves the web site map from the database and displays the map. The user then has a clear picture of what is in the site, where objects are located, and how objects are interlinked. The user also has a graphical structure that shows the user where he is in the site and where he may go. The user may jump to any site object directly simply by selecting the corresponding map entry. Using this displayed hierarchical representation of the web site, the user knows in advance the content of a web page. In other words, the user does not have to actually travel through that web site unless it contains something the user wants to see.
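The retrieval and display steps performed by a map viewer may be illustrated with the following Python sketch. The storage format is not specified in this description; JSON text is assumed here purely for the example, and the names STORED_MAP, load_map, and render_map are hypothetical. The display is reduced to indented text lines, with a marker showing where the user currently is in the site.

```python
import json

# Assumed stored form of a site map: object -> list of child objects.
STORED_MAP = (
    '{"home.html": ["animals.html", "map.html"],'
    ' "animals.html": ["wombat.html"],'
    ' "map.html": [], "wombat.html": []}'
)


def load_map(stored_text: str) -> dict:
    """Retrieve the tree from its stored (assumed JSON) form."""
    return json.loads(stored_text)


def render_map(tree: dict, root: str, current: str = None, depth: int = 0) -> list:
    """Return the map as indented lines, marking the user's current page."""
    marker = " <- you are here" if root == current else ""
    lines = ["  " * depth + root + marker]
    for child in tree.get(root, []):
        lines.extend(render_map(tree, child, current, depth + 1))
    return lines


if __name__ == "__main__":
    site_map = load_map(STORED_MAP)
    for line in render_map(site_map, "home.html", current="wombat.html"):
        print(line)
```

In a graphical viewer each displayed line would be a selectable map entry, so that selecting an entry takes the user directly to the corresponding object.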