1. Technical Field:
The present invention relates to information retrieval in data processing systems. In particular, the present invention relates to data processing systems which are linked to other data processing systems by an associated linking network. More particularly, the present invention relates to associated networks which utilize mark-up languages. Still more particularly, the present invention relates to a method and system for rendering hyper-text documents in a printable medium while retaining hyper-link information.
2. Description of the Related Art:
The development of computerized information resources, such as the "Internet," and the proliferation of "web" browsers allow users of data processing systems to link with other servers and networks, and thus retrieve vast amounts of electronic information heretofore unavailable in an electronic medium. Such electronic information is increasingly displacing more conventional means of information transmission, such as newspapers, magazines, and even television. In communications, a set of computer networks, which are possibly dissimilar from one another, is joined together by "gateways" that handle data transfer and the conversion of messages from the sending network to the protocols used by the receiving network, using packets where necessary. A gateway is a device used to connect dissimilar networks (i.e., networks utilizing different communication protocols) so that electronic information can be passed from one network to the other. Gateways transfer electronic information, converting such information to a form compatible with the protocols used by the second network for transport and delivery. The term "internet" is an abbreviation for "internetwork," and refers commonly to the collection of networks and gateways that utilize the TCP/IP suite of protocols, which are well-known in the art of computer networking. TCP/IP is an acronym for "Transmission Control Protocol/Internet Protocol," a suite of software protocols developed by the Department of Defense for communication between computers.
Electronic information transferred between data processing networks is usually presented in hyper-text, a metaphor for presenting information in a manner in which text, images, sounds, and actions become linked together in a complex non-sequential web of associations that permit the user to "browse" through related topics, regardless of the presented order of the topics. These links are often established by both the author of a hyper-text document and by the user, depending on the intent of the hyper-text document. For example, traveling among links to the word "iron" in an article displayed within a graphical user interface in a data processing system might lead the user to the periodic table of the chemical elements (i.e., linked by the word "iron"), or to a reference to the use of iron in weapons in Europe in the Dark Ages. The term "hyper-text" was coined in the 1960s to describe documents, as presented by a computer, that express the nonlinear structure of ideas, as opposed to the linear format of books, film, and speech.
The term "hyper-media," on the other hand, more recently introduced, is nearly synonymous with "hyper-text" but focuses on the nontextual components of hyper-text, such as animation, recorded sound, and video. Hyper-media is the integration of graphics, sound, video, or any combination thereof into a primarily associative system of information storage and retrieval. Hyper-media, as well as hyper-text, especially in an interactive format where choices are controlled by the user, is structured around the idea of offering a working and learning environment that parallels human thinking, that is, an environment that allows the user to make associations between topics rather than move sequentially from one to the next, as in an alphabetic list. Hyper-media topics, as well as hyper-text topics, are thus linked in a manner that allows the user to jump from one subject to other related subjects during a search for information. Hyper-link information, such as "World Wide Web" site addresses, is contained within hyper-media and hyper-text documents, and allows a user to go back to the "original" or referring Web site by the mere "click" (i.e., with a mouse or other pointing device) of the hyper-linked topic.
A typical networked system which utilizes hyper-text and hyper-media conventions follows a client/server architecture. The "client" is a member of a class or group that uses the services of another class or group to which it is not related. Thus, in computing, a client is a process (i.e., roughly a program or task) that requests a service provided by another program. The client process utilizes the requested service without having to "know" any working details about the other program or the service itself. Thus, in a client/server architecture, particularly a networked system, a client is usually a computer that accesses shared network resources provided by another computer (i.e., a server).
In such a client/server architecture, a request by a user for news can be sent by a client application program to a server. Such a server is typically a remote computer system accessible over the Internet or other communication medium. The server scans and searches raw (i.e., unprocessed) information sources (e.g., newswire feeds or newsgroups). Based upon such requests by the user, the server presents filtered electronic information as server responses to the client process. The client process may be active in a first computer system and the server process may be active in a second computer system, with the two processes communicating with one another over a communication medium, thus providing distributed functionality and allowing multiple clients to take advantage of the information-gathering capabilities of the server.
Client and server can communicate with one another utilizing the functionality provided by Hyper-Text Transfer Protocol (HTTP). The World Wide Web (WWW) or, simply, the web, includes all the servers adhering to this standard which are accessible to clients via Uniform Resource Locators (URLs). Such communication takes place over a communication medium; in particular, the client and server may be coupled to one another via Serial Line Internet Protocol (SLIP) or TCP/IP connections for high-capacity communication. Active within the client is a first process, known as a "browser," which establishes the connection with the server and presents information to the user. The server itself executes corresponding server software which presents information to the client in the form of HTTP responses. The HTTP responses correspond to web "pages" constructed in Hyper-Text Markup Language (HTML), or to other server-generated data.
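By way of illustration only, the request/response exchange described above can be sketched in Python as follows. The sketch assumes a minimal HTTP/1.0-style request, and the function names build_get_request and parse_status_line, together with the example.com address, are hypothetical names chosen for this illustration; no particular browser or server implementation is implied.

```python
from typing import Tuple
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Return the text of a minimal GET request for the given URL.

    Hypothetical helper for illustration; a real browser sends
    additional headers.
    """
    parts = urlsplit(url)
    path = parts.path or "/"          # an empty path means the site root
    if parts.query:
        path += "?" + parts.query
    # Request line, a Host header, then a blank line ending the headers.
    return f"GET {path} HTTP/1.0\r\nHost: {parts.netloc}\r\n\r\n"


def parse_status_line(response: str) -> Tuple[str, int, str]:
    """Split the first line of an HTTP response into its three fields."""
    status_line = response.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


# The client turns a URL into a request...
request = build_get_request("http://example.com/docs/index.html")

# ...and the server answers with a status line, headers, and an HTML page.
response = "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
status = parse_status_line(response)
```

In this sketch the URL supplies both the server name (carried in the Host header) and the path of the requested page, while the status line of the response indicates to the browser whether an HTML page follows.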
Browsers and other Internet data are typically displayed to a user via a graphical user interface. A graphical user interface is a type of display format that enables a user to choose commands, start programs, and see lists of files and other options by pointing to pictorial representations (icons) and lists of menu items on the screen. Choices can generally be activated either with a keyboard or a mouse.
Sometimes a user desires to print a hardcopy of a document provided in hyper-text format with hyper-link information (i.e., links to other documents and Web sites). Hyper-Text Markup Language (HTML) is typically utilized to create such documents. However, much of the document's usefulness is lost when it is printed as hardcopy. The difference between a screen-rendered hyper-text document and the same document printed as hardcopy is that the hyper-link information in the hardcopy no longer has a graphical user interface visual cue or an application function to "jump" to a linked web page or web site. In fact, a user cannot ascertain, based on the rendered hardcopy document printout, that a hyper-link existed in the screen-displayed document or to which site the document was linked. A need thus exists for allowing users to ascertain, based on any document rendered as hardcopy, such hyper-link information.
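The loss described above can be illustrated with a brief Python sketch. It is offered only to make the problem concrete and is not the method of the present invention; the class name PrintRenderer, the function render, and the sample markup and address are assumptions introduced for this illustration. The sketch uses the standard html.parser module to produce the plain text that would reach a printed page and, separately, to collect the link targets that such text omits.

```python
from html.parser import HTMLParser


class PrintRenderer(HTMLParser):
    """Collects visible text and, separately, the targets of anchor tags."""

    def __init__(self):
        super().__init__()
        self.text_parts = []   # what a hardcopy would show
        self.links = []        # hyper-link targets absent from the hardcopy

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        self.text_parts.append(data)


def render(html):
    """Return (printed_text, link_targets) for a fragment of HTML."""
    renderer = PrintRenderer()
    renderer.feed(html)
    renderer.close()
    return "".join(renderer.text_parts), renderer.links


# Hypothetical sample document containing one hyper-link.
SAMPLE = ('<p>Early weapons were forged from '
          '<a href="http://example.com/iron.html">iron</a>.</p>')

printed, links = render(SAMPLE)
print(printed)  # the word "iron" appears with no sign that it was a link
print(links)    # the address survives only in this separate list
```

The printed text reads as an ordinary sentence, and nothing in it indicates that the word "iron" was linked or to which site; that information is available only from the separately collected list.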