1. Technical Field
The present invention relates in general to improved information processing systems. In particular, the present invention relates to computer network applications that permit information processing systems to process, transmit and display data. Still more particularly, the present invention relates to network browser applications. Still more particularly, the present invention relates to an improved method and system for indexing and displaying network documents.
2. Description of the Related Art
The development of computerized distributed information resources, such as the "Internet," allows users to link with servers and networks, and thus retrieve vast amounts of electronic information heretofore unavailable in an electronic medium. Such electronic information increasingly is displacing more conventional means of information transmission, such as newspapers, magazines, and even television. The term "Internet" is an abbreviation for "Internetwork," and refers commonly to a collection of computer networks that utilize the TCP/IP suite of protocols, well-known in the art of computer networking. TCP/IP is an acronym for "Transmission Control Protocol/Internet Protocol," a suite of communication protocols developed under the sponsorship of the U.S. Department of Defense for communication between computers.
Electronic information transferred between computer networks (e.g., the Internet) can be presented to a user in hypertext, a metaphor for presenting information in a manner in which text, images, sounds, and actions become linked together in a complex nonsequential web of associations that permit the user to "browse" through related topics, regardless of the presented order of the topics. These links are often established by both the author of a hypertext document and by the user, depending on the intent of the hypertext document. For example, traveling among hypertext links to the word "iron" in an article displayed within a graphical user interface in a computer system might lead the user to the periodic table of the chemical elements (i.e., linked by the word "iron"), or to a reference to the use of iron in weapons in Europe in the Dark Ages. The term "hypertext" is utilized to describe documents, as presented by a computer, that express the nonlinear structure of ideas, as opposed to the linear format of books, film, and speech.
Networked systems typically follow a client/server architecture. A "client" is a member of a class or group that utilizes the services of another class or group to which it is not related. In the context of a computer network such as the Internet, a client is a process (i.e., roughly a program or task) that requests a service provided by another program. The client process utilizes the requested service without having to "know" any working details about the other program or the service itself. In networked systems, a client is usually a computer that accesses shared network resources provided by another computer (i.e., a server).
A "server" is typically a remote computer system accessible over a communications medium such as the Internet. The server scans and searches for raw (i.e., unprocessed) information sources (e.g., newswire feeds or newsgroups). Based upon requests issued by the user through the client process, the server presents filtered electronic information to the user as server responses to the client process. The client process may be active in a first computer system, and the server process may be active in a second computer system; the two processes communicate with one another over a communications medium that allows multiple clients to take advantage of the information-gathering capabilities of the server.
Client and server communicate with one another utilizing the functionality provided by the Hypertext Transfer Protocol (HTTP). The World Wide Web (WWW) or, simply, the "web," includes all servers adhering to this protocol, which are accessible to clients via a Uniform Resource Locator (URL). Internet services can be accessed by specifying Uniform Resource Locators that have two basic components: a protocol to be used and an object pathname. For example, the Uniform Resource Locator address, "http://www.uspto.gov" (i.e., the "home page" for the U.S. Patent and Trademark Office), specifies a hypertext transfer protocol ("http") and a pathname of the server ("www.uspto.gov"). The server name is associated with a unique numeric value (an IP address). Active within the client is a first process, known as a "browser," that establishes the connection with the server and presents information to the user. The server itself executes corresponding server software that presents information to the client in the form of HTTP responses. The HTTP responses correspond to "web pages" constructed using the Hypertext Markup Language (HTML), or to other server-generated data. Such "web pages" are also referred to as web or network documents.
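The two-component structure of a URL described above can be illustrated with a minimal sketch. The sketch assumes Python and its standard urllib.parse module; the function name describe_url and the dictionary keys are hypothetical and chosen only for illustration. Treating an empty pathname as "/" is likewise an assumption of this sketch.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the protocol and the server/path portions (illustrative only)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,    # e.g. "http"
        "server": parts.hostname,    # e.g. "www.uspto.gov"
        "path": parts.path or "/",   # assumption: an empty path is shown as "/"
    }


info = describe_url("http://www.uspto.gov")
print(info)
```

The sketch does not resolve the server name to its numeric address; that step is performed separately by the name-resolution services of the network.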
A "web page" (also referred to by some designers simply as a "page") is a data file written in a hypertext language that may have text, graphic images, and even multimedia objects such as sound recordings or moving video clips associated with that data file. The web page can be displayed as a viewable object within a computer system. A viewable object can contain one or more components such as spreadsheets, text, hotlinks, pictures, sound, and video objects. A web page can be constructed by loading one or more separate files into an active directory or file structure that is then displayed as a viewable object within a graphical user interface.
When a client workstation sends a request to a server for a web page, the server first transmits (at least partially) the main hypertext file associated with the web page, and then transmits, either sequentially or simultaneously, the other files associated with the web page. A given file may be transmitted as several separate pieces via the TCP/IP protocols. The constructed web page is then displayed as a viewable object on the workstation monitor. A web page may be "larger" than the physical size of the monitor screen, and devices such as graphical user interface scroll bars can be utilized by the viewing software (i.e., the browser) to view different portions of the web page.
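The relationship between a main hypertext file and the other files associated with a web page can be sketched as follows. This is an illustrative example only, assuming Python and its standard html.parser module; the class name ResourceCollector, the function name associated_files, the sample markup, and the file names logo.gif and chart.gif are hypothetical. Only a few tag types are examined, whereas an actual browser recognizes many more.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collects the files that a main hypertext file refers to via src attributes."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        # Assumption: only these tags are checked; real browsers handle more.
        if tag in ("img", "script", "audio", "video"):
            for name, value in attrs:
                if name == "src" and value:
                    self.resources.append(value)


def associated_files(main_html):
    """Return, in document order, the files referenced by the main hypertext file."""
    collector = ResourceCollector()
    collector.feed(main_html)
    return collector.resources


# Hypothetical main hypertext file with two associated image files.
page = ('<html><body><h1>Demo</h1><img src="logo.gif">'
        '<p>Text</p><img src="chart.gif"></body></html>')
print(associated_files(page))
```

Each file name found in this manner would then be requested from the server in its own right before the complete page is displayed.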
A problem associated with network documents, such as web pages or web documents, is the inability of a user to uniformly locate identical locations within multiple copies of the same document. Reviewing large documents "published" or displayed on the "web" is difficult because the display formats are dependent upon the type of browser being utilized, the display fonts available, and the window sizing on the display screen itself. Printing the document itself is not an alternative solution, because web documents printed as hardcopy text depend on the printer, the browser and the formatting. Thus, it is impossible to refer to content based on a web page number or line number associated with a particular network document. A common solution to such problems is to include numbered header and document sections. However, such numbering and sectioning is limited in functionality and extensibility because the sections can stretch over several screens or pages, thus making it difficult for a user to determine which specific section of the document the user is actually reading or interpreting. Based on the foregoing, it can be appreciated that a need exists for a method and system which would allow users to view the same copy of a network or web document in varying display formats and be assured that they are viewing the same position in the document. An automated method and system for implementing such a feature would also be desirable, given the complexity associated with constructing and displaying network documents.
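The dependence of line numbers on the display format can be demonstrated with a short sketch. The example assumes Python and its standard textwrap module as a stand-in for a browser's line-wrapping behavior; the function name locate, the sample text, and the chosen widths are hypothetical. It is not a model of any particular browser.

```python
import textwrap


def locate(text, word, width):
    """Return the 1-based display line containing `word` when `text` is wrapped to `width` columns."""
    lines = textwrap.wrap(text, width=width)
    for number, line in enumerate(lines, start=1):
        if word in line:
            return number
    return None  # the word does not appear in the text


sample = ("A problem associated with network documents is the inability of a "
          "user to uniformly locate identical locations within multiple copies "
          "of the same document.")

# The same word falls on different display lines in a narrow and a wide window.
narrow = locate(sample, "copies", 20)
wide = locate(sample, "copies", 1000)
print(narrow, wide)
```

Because the same content falls on a different line in each window, a line number observed by one user cannot be relied upon by another user viewing the same document at a different width.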