The invention relates to mapping the structure of a collection of resources in a computer system.
The rapid growth in the popularity and accessibility of the Internet has led to an explosion in the amount and types of information available by computer. Seemingly every business and interest group maintains a World Wide Web site to provide information and/or entertainment for Internet users. Each Web site typically is a collection of computer resources, such as executable programs, HTML pages, JPEG and GIF image files, audio and video files, gateways, and service applications (e.g., electronic mail, file transfer protocol (FTP), news, gopher, telnet, and WAIS), maintained by one or more server computers.
The organization of information in a Web site is determined by the hyperlink structure of the resources contained in the site. For example, a site typically includes a top-level resource, or "home page", containing hyperlinks to other resources, which in turn typically contain hyperlinks to other resources. Each hyperlink typically consists of the Uniform Resource Locator (URL) of the corresponding resource and is presented to the user as a graphical object (e.g., a piece of text or an image). To access a particular resource in the site, a user must enter the site through the home page and navigate through a maze of hyperlinks, selecting the corresponding graphical objects until the desired resource is found. Typically, this navigation requires the assistance of an Internet service provider (ISP) or on-line service provider (OSP), the cost of which is determined by the amount of time the user accesses the service.
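The hyperlink structure described above can be pictured as a directed graph in which each resource is a node and each hyperlink is an edge. The following is a minimal illustrative sketch only, not a method taken from this document: the URLs, the `SITE` mapping, and the function `clicks_from_home` are all hypothetical names chosen for the example. It computes how many hyperlinks a user would have to follow from the home page to reach each resource.

```python
from collections import deque

# Hypothetical site (assumed for illustration): each resource URL maps to
# the URLs of the resources it links to.
SITE = {
    "/index.html": ["/products.html", "/about.html"],
    "/products.html": ["/products/widget.html", "/index.html"],
    "/about.html": ["/images/logo.gif"],
    "/products/widget.html": ["/images/widget.jpg"],
    "/images/logo.gif": [],
    "/images/widget.jpg": [],
}


def clicks_from_home(site, home):
    """Breadth-first search from the home page.

    Returns a dict mapping each reachable resource to the minimum number
    of hyperlinks a user must follow to reach it from `home`.
    """
    depth = {home: 0}
    queue = deque([home])
    while queue:
        url = queue.popleft()
        for target in site.get(url, []):
            if target not in depth:  # first visit is the shortest path
                depth[target] = depth[url] + 1
                queue.append(target)
    return depth


depths = clicks_from_home(SITE, "/index.html")
```

In this assumed site, the image `/images/widget.jpg` lies three hyperlinks away from the home page, which illustrates why reaching a deeply nested resource by manual navigation can be time-consuming.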
The administrator of a Web site also must navigate through the site to verify its integrity. For example, the site administrator generally can find a broken link (e.g., a link to a resource that no longer exists or has been moved) only by attempting, unsuccessfully, to access the resource through the link or by receiving notice from another user that the link is invalid. Likewise, a site administrator typically learns of links to objectionable material by discovering the links personally or by hearing from a concerned user.
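To make the notion of a broken link concrete, the following sketch checks a link map for hyperlinks whose targets are not known resources. It is an assumption-laden illustration, not the technique of this document: the `site` mapping, its URLs, and the function `find_broken_links` are hypothetical, and the check is performed against an in-memory table rather than against a live server.

```python
def find_broken_links(site):
    """Return (source, target) pairs whose target is not a known resource.

    `site` is assumed to map each existing resource URL to the list of
    URLs it links to; a target absent from the mapping is treated as
    missing or moved.
    """
    broken = []
    for source, targets in site.items():
        for target in targets:
            if target not in site:
                broken.append((source, target))
    return broken


# Hypothetical site in which two hyperlinks point at resources that
# no longer exist.
site = {
    "/index.html": ["/news.html", "/old-page.html"],
    "/news.html": ["/index.html", "/archive/1996.html"],
}

broken = find_broken_links(site)
```

Here `broken` contains the two links whose targets are absent from the table, each paired with the page on which the link appears, so that an administrator would know which page to repair.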