The internet is a global network of interconnected networks and computers that enables users at computers coupled to the networks to exchange information including data, images, voice, recorded sound, and electronic mail. Computers connected to the internet may be classified as client computers (“clients”) and server computers (“servers”). Servers are repositories of information, and internet users access the information held at the servers using client computers.
A common method of accessing information on the internet is with a browser residing on a user's client computer. The browser enables a user at the client computer to communicate and exchange information with a server computer over the internet using widely available protocols, including the HyperText Transfer Protocol (HTTP).
Information at the server is stored in various ways, including in one or more files, database entries, or records composed in HyperText Markup Language (HTML), Extensible Markup Language (XML), Standard Generalized Markup Language (SGML) or other tag-based schemes. For the purposes of this application we use the terms source, file, database, object, stream, and node synonymously.
Each of these sources is composed by an author who desires to convey information including text and objects such as pictures, sound, video or any combination thereof to a user's browser in a structured manner. An HTML file, an example of which is shown in FIG. 2, is composed in text and is readable by a person. It includes a plurality of markup tags interspersed throughout which specify: a logical structure; a format for presentation of text and embedding objects; and hypertext links. The hypertext links are references to other HTML files which allow authors to structure their information into manageable-sized pieces called “web pages” that can be transmitted efficiently from server to client computers and presented without undue delay. In effect this structure of hyperlinked web pages permits the author to construct larger bodies of related information that can be “navigated” by a user of a client computer and effectively delivered on demand a piece at a time. Also, because the quantity and type of information on the internet is vast, the hypertext link capability is desirable in that it not only allows an author to contribute his own information to internet users via an HTML file, but also provides the capability to make relevant or related information provided by other authors readily accessible from within the HTML file.
Typically, when an internet user's browser is connected to a server, the browser displays a “web page” which is a rendered HTML file. The page provides the user with information and with the capability to link to other pages through hypertext links. Once a user determines which hypertext link to follow, the user issues a command to the browser to take the link.
This command causes the browser to request the information from the server and source specified by the hypertext link. In response to the browser's request, that server sends HTML to the user's client machine using HTTP. A server's response to a hyperlink, as perceived by the user of the client computer, may take from several seconds to several minutes to arrive. During this apparent delay, the user's browser receives the data, parses the tags encoding the material, requests additional information from servers based on the parsed tags, and renders information as a page on the user's computer based on the information and tags received.
The browser parses the source material in order to determine the structure of the document as defined by the tags included in the document. As the tags are being parsed, the browser determines what additional actions need to be taken based on the tags. In many cases information beyond that which is contained in the single source is required to render the page. For example, some tags specify that an image is to be embedded in the page at a particular place in the page. However, the image to be embedded may be located in a different file on the same server or in a file on a different server coupled to the internet. Other required objects may include additional material to control formatting of the screen into multiple areas (called frames in HTML) or data used to activate a programmatic action on the client computer (generally called scripts). Therefore, the browser must request that a copy of that material, referenced by the tags in the document, be sent in order to complete the rendering of the page.
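By way of illustration only, the tag parsing described above may be sketched in Python as follows. The sketch scans a document for tags that reference material stored outside the document (images, frames, and scripts), each of which would require a further request before the page could be fully rendered. The names ReferenceCollector, collect_references, EXTERNAL_REFS, and SAMPLE_PAGE, the sample file names, and the short table of tags are assumptions made for this example; they do not describe any particular browser.

```python
from html.parser import HTMLParser

# Tag/attribute pairs that point at material stored outside the document.
# This short table is an illustrative assumption, not an exhaustive list.
EXTERNAL_REFS = {("img", "src"), ("frame", "src"), ("script", "src")}


class ReferenceCollector(HTMLParser):
    """Records every externally referenced object found while parsing."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        for name, value in attrs:
            if (tag, name) in EXTERNAL_REFS and value:
                self.references.append((tag, value))


def collect_references(html_text):
    """Return (tag, location) pairs that need a further request."""
    parser = ReferenceCollector()
    parser.feed(html_text)
    parser.close()
    return parser.references


# Hypothetical page: one image on the same server, one script on another.
SAMPLE_PAGE = """<html><body>
<h1>Example</h1>
<img src="logo.gif">
<script src="http://other.example/menu.js"></script>
<a href="next.html">Next</a>
</body></html>"""

for tag, location in collect_references(SAMPLE_PAGE):
    print(tag, location)
```

The hypertext link to next.html is deliberately not collected, because a link is followed only when the user commands it, whereas the image and the script must be fetched before rendering can complete.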
As the browser parses the source material, it attempts to render the received data as a page. The rendering occurs in response to the browser determining the structure of the document based on the tags, corresponding text, and embedded objects. Because the tags are interspersed throughout the document stream, the browser may have to receive data from several sources before any of the information is actually rendered on the user's computer.
The aforementioned method of sending information from servers to clients over the internet, and parsing and rendering the information at the client, is slow and inefficient for numerous reasons. First, the tagged document material and other data that constitute hypertext material (like web pages) are frequently larger than they need to be to convey information, because the material is stored in readable (ASCII) text form and contains uncompressed binary objects. The large size of these sources causes longer transmission time for the hypertext material and globally results in wasted internet bandwidth. Additional delay is caused by processing at the browser, which must parse the tags in the hypertext material as it is received prior to rendering the page.
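The size penalty of readable text markup can be illustrated with a short Python sketch. It compares the byte count of a piece of repetitive tagged text with the byte count of the same text after general-purpose compression. The use of zlib is an assumption chosen for illustration, since no particular compression scheme is named here, and size_report and SAMPLE_MARKUP are hypothetical names.

```python
import zlib


def size_report(markup_text):
    """Return (raw_bytes, compressed_bytes) for ASCII markup text."""
    raw = markup_text.encode("ascii")
    # zlib stands in for any general-purpose compressor (an assumption).
    packed = zlib.compress(raw, 9)
    return len(raw), len(packed)


# Hypothetical markup: a table whose rows repeat the same tags many times.
SAMPLE_MARKUP = (
    "<table>" + "<tr><td>cell</td><td>cell</td></tr>" * 50 + "</table>"
)

raw_size, packed_size = size_report(SAMPLE_MARKUP)
print("readable text:", raw_size, "bytes")
print("compressed:   ", packed_size, "bytes")
```

Because tag names recur throughout a tagged document, the compressed form of such text is typically much smaller than the readable form, which is consistent with the observation that material stored as readable text is larger than it needs to be.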
There is a need to process hypertext information at the server so that the information contained within the material is stored in a preprocessed form that is smaller in size and more quickly transmitted and rendered by the user's browser. It is also desirable to include in the stored format those objects specified for embedding within the hypertext material so that the user's browser does not have to request additional data in order to render the information.