This invention relates to publishing a hypermedia document.
A hypermedia document is a digital document that may have one or more references, or “links,” to other documents or to other locations within the same document. A hypermedia document alternatively may have no links but rather may be a standalone document formed of one or more different media types (text, images, sounds, etc.). In either case, a hypermedia document can be made accessible over a network such as the World Wide Web (web) by “publishing” it to a web server. Publishing generally refers to the process of manipulating one or several hypermedia documents into an appropriate form and placing them at an appropriate location within a network so that they can be accessed by other users. Hypermedia documents generally are part of a collection of cross-referenced documents accessible over a network. Using a web browser, a viewer can use a document's links to move from one document to another, or to view the content of a linked document referenced in another.
Examples of different types of hypermedia documents include document types typically associated with the web, such as HTML (Hypertext Markup Language) and VRML (Virtual Reality Modeling Language) documents, as well as document types such as QuarkXPress documents, which have applications independent of the web. FIG. 1 shows a portion of an HTML document. Such documents can be created using various types of content development environments, including word-processing applications and authoring tools, which allow an author to create an HTML document without having to understand the complexities of HTML.
A published HTML document typically is accessible at a unique uniform resource locator (URL) address on a network server. When a viewer accesses the URL address of the document over the web, the HTML document is displayed as a web page. The HTML document shown in FIG. 1 is stored at the URL address http://www.sgi.com/ss.home.page.html. As shown in FIG. 2, when a user points a browser at the document's URL address 21, the document is displayed as a web page 20 for Silicon Graphics, Inc. of Mountain View, Calif.
A hypermedia document typically includes both “linked” content (content that does not reside within the document under consideration but which is accessible indirectly from that document via visually indicated links) and content that is displayed directly within the hypermedia document rather than simply being linked to it. A document's content can include several different types of media, including text, images, sound, video, 3D virtual worlds, applets (self-contained executable programs written in computer languages such as Java or JavaScript), or virtually any other media type, provided a corresponding plug-in (an extension mechanism for handling non-standard data types) is available to be installed on a user's browser.
Each link within a hypermedia document corresponds to a URL address associated with the linked content. In some cases, a linked document is displayed automatically within the hypermedia document. For example, the Silicon Graphics logo 22 in the upper portion of FIG. 2 is a linked image document that is displayed automatically when the HTML document of FIG. 1 is displayed. In other cases, a link may be displayed as a text string with distinct formatting, and the linked document is not accessed until a viewer clicks, or otherwise selects, the displayed link. For example, “company info & jobs” 23 (in the upper right portion of FIG. 2) is a displayed link, and when selected by a viewer, the browser accesses and displays the linked document shown in FIG. 3.
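By way of illustration only, the two kinds of links described above might be collected from an HTML document as in the following minimal sketch. It assumes that automatically displayed content appears as the `src` attribute of `img` elements and that viewer-selected links appear as the `href` attribute of `a` elements; the names `LinkCollector` and `collect_links` are hypothetical and are not taken from the figures.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects link URLs from HTML, split by how the link is presented."""

    def __init__(self):
        super().__init__()
        self.automatic = []   # content displayed automatically (assumed: <img src>)
        self.selectable = []  # links followed only when selected (assumed: <a href>)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            self.automatic.append(attributes["src"])
        elif tag == "a" and attributes.get("href"):
            self.selectable.append(attributes["href"])


def collect_links(html_text):
    """Return (automatic, selectable) lists of the URLs found in html_text."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.automatic, collector.selectable
```

Other elements that carry URLs (for example, embedded sounds or applets) are not handled in this sketch and would need additional cases.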
A hypermedia document may reference various types of documents at “remote” or “local” URL addresses. A URL is regarded as “local” if it resides on the same server as the document that references it, and “remote” if the URL is on a different server than the referencing document. Whether local or remote, each linked document may include nested links to other documents, either local, remote or a combination thereof. In a nested group of documents, a top level hypermedia document includes a link to another document, which in turn includes a link to yet another document, and so on, to virtually any level of nesting. The hypermedia document and its “directly” linked documents (i.e., URLs that can be accessed by a single, direct jump from the referencing document), as well as its “nested” linked documents (i.e., URLs that can be accessed indirectly by two or more jumps from the referencing document) can be represented as nodes in a directed graph. A link between documents can be represented as a parent-child relationship between nodes. In such a representation, the initial hypermedia document (such as the document corresponding to FIGS. 1 and 2) is the top level document and its linked documents are sublevel documents.
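The local/remote distinction and the directed-graph representation described above can be sketched as follows. This is an illustrative example under stated assumptions: “same server” is approximated by comparing the host portion of the resolved URLs, the graph is given as a dictionary mapping each document to the documents it links to, and the function names are hypothetical.

```python
from collections import deque
from urllib.parse import urljoin, urlsplit


def classify_link(referencing_url, link):
    """Resolve link against the referencing document; label it local or remote."""
    absolute = urljoin(referencing_url, link)
    # Assumption: two URLs are on the same server when their host parts match.
    same_server = (urlsplit(absolute).netloc.lower()
                   == urlsplit(referencing_url).netloc.lower())
    return absolute, ("local" if same_server else "remote")


def jumps_from_top(graph, top):
    """Breadth-first search over parent-child edges.

    Returns the minimum number of jumps from the top level document to each
    reachable document: 1 means directly linked, 2 or more means nested.
    """
    jumps = {top: 0}
    queue = deque([top])
    while queue:
        parent = queue.popleft()
        for child in graph.get(parent, []):
            if child not in jumps:  # also guards against cycles in the graph
                jumps[child] = jumps[parent] + 1
                queue.append(child)
    return jumps
```

For instance, with the graph `{"top": ["a", "b"], "a": ["c"]}`, documents `a` and `b` are directly linked to `top` and document `c` is a nested linked document two jumps away.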
Considerable effort may be required in publishing a hypermedia document. For example, publishing generally requires an author to verify that the URLs of the directly linked documents represent valid addresses. Otherwise, a user clicking a link with an invalid address would be presented with an error message such as “URL not found.” To further increase the reliability of the hypermedia document, authors frequently also attempt to verify the addresses of the nested links. However, because nested links are at least two jumps removed from the referencing document, the mere act of identifying nested linked documents can be a painstaking process. Depending on the number of nested links that are identified, attempting to identify and verify all of the nested links can become a complex and expensive process and, potentially, an administratively unmanageable undertaking.
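To make the cost of this verification concrete, the following sketch walks the links of a top level document and reports invalid addresses up to a chosen number of jumps. It is not a description of any particular publishing tool. The caller supplies a hypothetical `get_links` function that returns the list of URLs linked from a document, or `None` when the address is invalid, so that no particular network library is assumed.

```python
from collections import deque


def find_invalid_links(top_url, get_links, max_jumps=2):
    """Return (referencing_url, invalid_url) pairs found within max_jumps jumps.

    get_links(url) is a caller-supplied (hypothetical) function returning the
    list of URLs linked from url, or None if url cannot be accessed.
    """
    top_links = get_links(top_url)
    if top_links is None:
        return [(None, top_url)]  # the top level document itself is invalid

    invalid = []
    seen = {top_url}
    queue = deque([(top_url, top_links, 0)])
    while queue:
        parent, links, jumps = queue.popleft()
        if jumps == max_jumps:
            continue  # links beyond the jump limit are left unverified
        for link in links:
            if link in seen:
                continue  # each address is checked once, under its first parent
            seen.add(link)
            child_links = get_links(link)
            if child_links is None:
                invalid.append((parent, link))
            else:
                queue.append((link, child_links, jumps + 1))
    return invalid
```

With `max_jumps=1` only the directly linked documents are verified. Each increase in `max_jumps` adds another level of nested links, which is where the number of addresses to check can grow quickly.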
Further complicating the publishing process is the fact that commercially available browsers typically support only a limited subset of the universe of file formats available for the different types of media. For example, a typical browser may support only JPEG and GIF for images, WAV for sounds, MPEG-1 for movies, and VRML for 3D worlds. Thus, to ensure that linked documents are in supported formats, an author frequently must determine which formats are acceptable, create appropriately converted versions of the content in those acceptable formats, and adjust URL addresses as needed to point to the proper location of top level and sublevel documents.
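The format check and URL adjustment described above might look like the following sketch. The file extensions chosen for the supported formats, and every entry in the conversion table, are assumptions made for illustration; the function only computes the adjusted address and does not perform the conversion itself.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Assumed extensions for the formats named above (JPEG, GIF, WAV, MPEG-1,
# VRML), plus HTML for the hypermedia documents themselves.
SUPPORTED = {".jpg", ".jpeg", ".gif", ".wav", ".mpg", ".mpeg", ".wrl",
             ".html", ".htm"}

# Hypothetical conversion targets for some unsupported formats.
CONVERT_TO = {".tif": ".jpg", ".tiff": ".jpg", ".aiff": ".wav", ".mov": ".mpg"}


def adjusted_url(url):
    """Return the URL a link should use after any needed format conversion.

    Returns url unchanged if its format is supported, a URL with the
    converted extension if a conversion is known, or None otherwise.
    """
    parts = urlsplit(url)
    stem, extension = posixpath.splitext(parts.path)
    extension = extension.lower()
    if extension in SUPPORTED:
        return url
    target = CONVERT_TO.get(extension)
    if target is None:
        return None  # no known conversion; the author must handle this case
    return urlunsplit(parts._replace(path=stem + target))
```

A URL with no file extension also yields `None` in this sketch, since its media type cannot be inferred from the address alone.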