The present application relates generally to use of a computer with the Internet and, more particularly, to methods for speeding up the process of browsing Web content in a computer system having an Internet or other on-line browser.
With the ever-increasing popularity of the Internet, particularly the World Wide Web ("Web") portion of the Internet, more and more personal computers (PCs) provide Internet access to vast stores of information through Web "browsers" (e.g., Microsoft Internet Explorer or Netscape Navigator) or other "Internet applications." Browsers and other Internet applications include the ability to access a URL (Uniform Resource Locator) or "Web" site. The URL is used to specify the location of a file held on a remote machine.
Each URL itself is composed of several distinct components. For example, the URL http://host/file.html includes three distinct components. The first component, http, specifies the protocol (here, "HTTP" or HyperText Transfer Protocol) that is to be used to access the target file. Other access protocols can be specified by a URL. For example, the URL ftp://ftp.pgp.com/pub/docs/samples specifies access to files via "FTP" (File Transfer Protocol). This specifies a link for accessing the file directory pub/docs/samples on the machine ftp.pgp.com.
The second component, host, indicates the name of the remote machine; this can be expressed as either a symbolic name (e.g., pgp.com) or a numeric IP (Internet Protocol) address such as 123.200.1.1. The final component, file.html, provides the path name of the target file--that is, the file to which the hypertext link is to be made. The file is referenced relative to the base directory in which Web pages are held; the location of this directory is specified by the person who has set up the Web server (i.e., "Webmaster").
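By way of illustration only, the decomposition of a URL into the components described above can be sketched in Python using the standard library's urllib.parse module. The helper name url_components is hypothetical and is not part of the described system; the sample URLs are those given above, plus one assumed URL that uses the numeric address.

```python
from urllib.parse import urlsplit


def url_components(url):
    """Return the (protocol, host, path) components of a URL.

    Hypothetical helper for illustration; it relies only on the
    standard-library urlsplit function.
    """
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


# The two example URLs from the text above.
print(url_components("http://host/file.html"))
# ('http', 'host', '/file.html')
print(url_components("ftp://ftp.pgp.com/pub/docs/samples"))
# ('ftp', 'ftp.pgp.com', '/pub/docs/samples')

# Assumed example: the host given as a numeric IP address.
print(url_components("http://123.200.1.1/file.html"))
# ('http', '123.200.1.1', '/file.html')
```

Note that the path component is returned with its leading slash, since it is interpreted relative to the base directory configured on the Web server.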
The majority of content available on the Internet is represented in "HTML" documents which, in turn, are read or accessed by Web browsers. In particular, HTML or Hypertext Markup Language is the markup language used to create the documents for the World Wide Web. Although most browsers will display any document that is written in plain text, HTML documents afford several advantages. In particular, HTML documents include formatting, graphics, and "hypertext links" to other documents.
Markup languages are used to describe the structure of the document. HTML is used to mark various elements in a document, including headings, paragraphs, lists, tables, and the like. To achieve this, an HTML document includes formatting commands or "tags" embedded within the text of the document which serve as commands to a browser. Here, HTML tags mark the elements of a file for browsers. Elements can contain plain text, other elements, or both. The browser reading the document interprets these markup tags or commands to help format the document for subsequent display to a user. The browser thus displays the document with regard to features that the viewer selects either explicitly or implicitly. Factors affecting the layout and presentation include, for instance, the markup tags used, the physical page width available, and the fonts used to display the text.
The design of HTML tags is relatively simple. Individual HTML tags begin with a < ("less than") character and end with a > ("greater than") character, such as <title>, which serves to identify text which follows as the title of a document. HTML tags are not case-sensitive (with the exception of HTML escape sequences) and are often used in symmetric pairs, with the final tag indicated by the inclusion of a / (slash) character. For instance, the <title> tag represents a beginning tag which would be paired with a </title> ending tag. These paired commands would thus be applied to the text contained within the beginning and ending commands, such as <title> My Sample Title </title>. The <B> tag, on the other hand, informs browsers that the text which follows is to be in bold type. This bolding is turned off by the inverse markup tag </B>. In contrast to these paired or "container" tags, separator tags are used unpaired. For example, the command <br> is employed by itself to insert a line break. Browsers generally ignore extra spaces and new lines between words and markup tags when reading the document. In other words, "white space" characters, such as tabs, spaces, and new line characters, are generally ignored in HTML. Leaving a blank line in one's document, for instance, generally does not create a blank line when the document is displayed in a browser, unless one uses the "preformatted" HTML tag (<pre> and </pre>). Finally, not all tags are supported by all Web browsers. If a browser does not support a tag, it (usually) just ignores it.
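As a rough illustration of the tag behavior just described, the following Python sketch uses the standard library's html.parser module to record the beginning tags, ending tags, and text of a small fragment. The names TagLogger and log_tags are hypothetical, and the white-space collapsing shown is only an approximation of what a browser does outside preformatted text, not a description of any particular browser.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical helper: records tags and text in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag names in lower case, so <TITLE> and
        # <title> are treated alike.
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Approximation of browser behavior: collapse runs of spaces,
        # tabs, and new lines, and drop text that is only white space.
        text = " ".join(data.split())
        if text:
            self.events.append(("text", text))


def log_tags(markup):
    """Parse a fragment and return the recorded events."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# A paired tag in mixed case, a blank line, a paired <B> tag,
# and an unpaired separator tag.
sample = "<TITLE> My Sample   Title </title>\n\n<B>bold text</B><br>"
for event in log_tags(sample):
    print(event)
```

In this sketch the mixed-case title tags are reported as a matching pair, the blank line between the elements produces no event, and the separator tag br appears as a beginning tag with no corresponding ending tag.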
The attraction of the World Wide Web is of course the "rich" content which it stores, largely as a collection of these interconnected Web or HTML pages. With each passing day, the information content available on the Web is more and more graphical in nature (e.g., high use of bitmaps). Accompanying the explosive growth of the World Wide Web, for instance, is the ever-increasing use of advertising material on practically any content which a user can access. This is particularly problematic since advertising material is often graphically intensive, requiring substantial time and resources for downloading and processing. Apart from advertising, many Web sites employ graphics to such an extreme degree as to render it difficult or impractical to access the Web site in real-time unless one has a high-speed Internet connection (e.g., T1 line). All told, the total download times for Web pages are becoming increasingly greater.
At the same time, the underlying infrastructure of the Web has not improved to a sufficient degree to offset this increased resource demand. Although advertising on the Web serves as one example, there exists a more general problem of how a user of the Web can exert at least some control over the content which is downloaded into his or her browser. Accordingly, there is great interest in developing techniques which speed up the process of browsing Web content or "Web surfing," including decreasing the background noise (e.g., ancillary graphics) which is not desired by the user.