Electronically published documents are increasingly being made available using a general markup language. A markup language indicates the structure of a document but excludes the streams of graphic display instructions that are typically found in formatted documents. Documents written in a markup language are therefore more portable among machines that use different graphic display commands. A commonly used markup language is the Standard Generalized Markup Language (SGML), an ISO standard.
Client-server computer systems for electronically publishing documents have also become increasingly available. Such a system typically includes one computer system (the server) on which documents are stored so that other computer systems (the clients) can access the information. The server and client communicate via messages conforming to a communication protocol sent over a communication channel such as a computer network. The server responds to messages from clients and processes requests to transmit a requested document.
An example of a client-server computer system for retrieval of electronically published documents that use a markup language is the World Wide Web (WWW or “web”) on the Internet. The WWW is a “web” of interconnected documents that are located in various sites on the Internet. Documents that are published on the WWW are written in the Hypertext Markup Language (HTML) or Extensible Markup Language (XML). HTML documents stored as such are generally static, that is, the contents do not change over time unless the publisher modifies the document.
HTML is a markup language used for writing hypertext documents. HTML documents are SGML documents that conform to a particular Document Type Definition (DTD). An HTML document includes a hierarchical set of markup elements, where most elements have a start tag, followed by content, followed by an end tag. The content is a combination of text and nested markup elements. Tags are enclosed in angle brackets (‘<’ and ‘>’) and indicate how the document is structured and how to display the document, as well as destinations and labels for hypertext links. There are tags for markup elements such as titles, headers, text attributes such as bold and italic, lists, paragraph boundaries, links to other documents or other parts of the same document, in-line graphic images, and many other features. While there are differences between HTML, SGML, and XML languages, in this document, the term “HTML” will be used in a generic sense to refer to all markup-language-based documents, whether or not they are specifically written in HTML.
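The hierarchical arrangement of markup elements described above can be illustrated with a short sketch. It uses the `html.parser` module of the Python standard library to record each start tag with its nesting depth and to collect the destinations of hypertext links. The class name `OutlineParser`, the function `outline_of`, and the sample document are hypothetical and are introduced here only for illustration.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Hypothetical helper: records each start tag with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []  # (depth, tag) pairs in document order
        self.links = []    # href destinations of hypertext links

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag))
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")
        self.depth += 1

    def handle_endtag(self, tag):
        # Assumes every element in the sample has an explicit end tag.
        self.depth -= 1


# Illustrative document: a title, a header, bold text, and one link.
SAMPLE = (
    "<html><head><title>Example</title></head>"
    "<body><h1>Heading</h1>"
    '<p>Some <b>bold</b> text and a <a href="other.html">link</a>.</p>'
    "</body></html>"
)


def outline_of(document):
    """Return the (depth, tag) outline and the link destinations."""
    parser = OutlineParser()
    parser.feed(document)
    parser.close()
    return parser.outline, parser.links
```

For the sample document, the outline places the `b` and `a` elements one level below the `p` element that contains them, which reflects the nesting of start tag, content, and end tag.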
An Internet site which electronically publishes documents on the WWW is called a “Web site” and runs a “Web server,” which is a computer program that allows a computer on the network to make documents available via the WWW. The documents are often hypertext documents in the HTML or XML language, but may be other types of documents.
A user (typically using a machine other than the machine used by the Web server) accesses documents published on the WWW by using a client program called a “Web browser.” The Web browser allows the user to retrieve and display documents from Web servers. Two of the most popular Web browser programs are the Navigator browser from Netscape Communications Corp. of Mountain View, Calif., and the Internet Explorer browser from Microsoft Corp. of Redmond, Wash.
The Web server and the Web browser communicate using the Hypertext Transfer Protocol (HTTP) message protocol and the underlying TCP/IP data transport protocol of the Internet. HTTP is described in Hypertext Transfer Protocol-HTTP/1.0 by T. Berners-Lee, R. T. Fielding, and H. Frystyk Nielsen, Internet Draft Document, Dec. 19, 1994, and is currently in the standardization process. In HTTP, the Web browser establishes a connection to a Web server and sends an HTTP request message to the server. In response to an HTTP request message, the Web server checks for authorization, performs any requested action, and returns an HTTP response message containing either an HTML document resulting from the requested action or an error message. For instance, to retrieve a static document, a Web browser sends an HTTP request message to the indicated Web server, requesting a document by its URL. The Web server then retrieves the document and returns it in an HTTP response message to the Web browser. If the document has hypertext links, then the user may select a link to request that a new document be retrieved and displayed. As another example, if a user fills in a form requesting a database search, the Web browser sends an HTTP request message to the Web server that includes the name of the database to be searched, the search parameters, and the URL of the search script. The Web server calls a program or script, passing in the search parameters. The program examines the parameters and attempts to answer the query, perhaps by sending a query to a database interface. When the program receives the results of the query, it constructs an HTML document that is returned to the Web server, which then sends it to the Web browser in an HTTP response message.
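The two exchanges described above, retrieval of a static document and a form-driven search, can be sketched in memory without a network connection. The sketch below is illustrative only: the `DOCUMENTS` table, the `/search` path, the `db` and `q` parameter names, and the functions `search_script`, `build_response`, and `handle_request` are all assumptions made for this example. The authorization check is omitted, and the database query is replaced by a placeholder.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit

# Hypothetical static documents held by the server, keyed by URL path.
DOCUMENTS = {
    "/index.html": "<html><body><a href='/search?db=books&q=markup'>Search</a></body></html>",
}


def search_script(params):
    """Stand-in for a search program: builds an HTML document from the parameters."""
    db = params.get("db", [""])[0]
    query = params.get("q", [""])[0]
    # A real program would query a database here; this only echoes the input.
    return "<html><body>Results for %s in %s</body></html>" % (escape(query), escape(db))


def build_response(status, reason, body):
    """Format a minimal HTTP/1.0 response message carrying an HTML body."""
    length = len(body.encode("utf-8"))
    return ("HTTP/1.0 %d %s\r\nContent-Type: text/html\r\n"
            "Content-Length: %d\r\n\r\n%s" % (status, reason, length, body))


def handle_request(request_message):
    """Parse the request line and return a response message."""
    request_line = request_message.split("\r\n", 1)[0]
    parts = request_line.split(" ")
    if len(parts) != 3 or parts[0] != "GET":
        return build_response(400, "Bad Request", "<html><body>Bad request</body></html>")
    url = urlsplit(parts[1])
    if url.path in DOCUMENTS:            # static document
        return build_response(200, "OK", DOCUMENTS[url.path])
    if url.path == "/search":            # dynamically generated document
        return build_response(200, "OK", search_script(parse_qs(url.query)))
    return build_response(404, "Not Found", "<html><body>Not found</body></html>")
```

In this sketch, a request message such as `GET /search?db=books&q=markup HTTP/1.0` is routed to the search stand-in, while a request for `/index.html` is answered from the static table.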
The terms “applet” and “servlet” are established terms in the Java programming language art and will be used herein, since the terms have meaning to those skilled in this art. “Applet” refers to an independent software module that runs within a Java-enabled web browser. “Servlet” refers to a software module that resides on a Java-enabled web server. It is to be understood that the use of the terms “applet” and “servlet” herein is not intended to limit the invention in any way. For clarification, the phrase “configuration applet” is used herein to refer to a software module used to configure preferences for an end user software application such as a word processor, a database manager, etc. Since software applications are also “applets” in the Java environment, the phrase “user applet” or just “applet” is used herein to refer to an end user application. It should be understood, however, that within the context of this application, “applet” and “servlet” are used in a generic sense, and are not limited to Java-based systems or environments.
Common Gateway Interface (CGI) programs are an important part of the HTTP server function. CGI is a World Wide Web standard by which a Web server invokes external programs, extending what can be done with HTML documents alone. CGI processing typically involves the combination of a live Web server and external programming scripts or executables. In particular, CGI programs are typically used to return dynamic information and to respond to HTTP browser input in HTML forms.
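A CGI program of the kind described above receives form input through environment variables such as `QUERY_STRING` and writes a header block followed by a document to its standard output. The following is a minimal sketch under that convention. The function names `run_cgi` and `demo` and the `name` form field are hypothetical. The environment and the output stream are passed in as arguments so that the example can be run without a Web server. A deployed script would instead read `os.environ` and write to `sys.stdout`.

```python
import io
from html import escape
from urllib.parse import parse_qs


def run_cgi(environ, out):
    """Hypothetical CGI-style handler: form input in, HTML document out."""
    # The Web server places the submitted form data in QUERY_STRING.
    fields = parse_qs(environ.get("QUERY_STRING", ""))
    name = fields.get("name", ["anonymous"])[0]
    # Header block, blank line, then the dynamically generated document.
    out.write("Content-Type: text/html\r\n\r\n")
    out.write("<html><body><p>Hello, %s.</p></body></html>\n" % escape(name))


def demo():
    """Run the handler against an illustrative environment and return its output."""
    buffer = io.StringIO()
    run_cgi({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Ada+%3Cadmin%3E"}, buffer)
    return buffer.getvalue()
```

The call to `escape` keeps markup characters in the submitted value from being interpreted as tags in the returned document.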
While markup-language-based documents were originally developed for easy and universal access to documents over the web, many standalone computer applications and other documents are now written to interact with the user in HTML, so that any user can access them using a standard web browser. As such, it has become increasingly important to provide the user with a robust and intelligent interaction via the user's web browser. To do so, the server system, whether remote or embodied in a local application, must be able to manage the user's navigation and use of the HTML documents in context.
The content and behavior of a web page within a complex application may depend upon a variety of contextual factors, including the privileges of the user, the software products installed, the sequence of pages shown previously in the browser window (i.e., the page flow), and the data entered on these pages. A page flow dependency may extend across multiple pages in the page flow. If a user is working in multiple browser windows, each window has its own independent page flow, and therefore its own context. In addition, web browsers provide mechanisms such as “Back” and “Forward” buttons and menus of recently visited pages that allow a user to display a different page without interacting with the web server providing those pages. The page from which a new request is made is an important part of the page flow, and any mechanism for tracking the page flow context must function correctly even when these browser features are used.
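The difficulty described above can be made concrete with a small model. The sketch below is only an illustration of the problem, not a description of any particular system: the class `PageFlowTracker`, its methods, and the page names are hypothetical. It assumes that every page sent to the browser is assigned an identifier, and that each new request reports the identifier of the page from which it was made.

```python
import itertools


class PageFlowTracker:
    """Hypothetical model: each rendered page remembers the page it came from."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pages = {}  # page_id -> (name, originating page_id, entered data)

    def render(self, name, from_page=None, data=None):
        """Record a page sent to the browser and return its identifier."""
        page_id = next(self._ids)
        self._pages[page_id] = (name, from_page, dict(data or {}))
        return page_id

    def flow(self, page_id):
        """Return the sequence of page names that led to the given page."""
        names = []
        while page_id is not None:
            name, parent, _ = self._pages[page_id]
            names.append(name)
            page_id = parent
        return list(reversed(names))


tracker = PageFlowTracker()
home = tracker.render("home")
listing = tracker.render("list", from_page=home)
detail = tracker.render("detail", from_page=listing)
# The user presses "Back" to the list page without contacting the server,
# then makes a new request from that page.
edit = tracker.render("edit", from_page=listing)
# A second browser window starts its own, independent page flow.
other_window = tracker.render("home")
```

Because the request for the edit page names the list page as its origin, `tracker.flow(edit)` yields home, list, edit, even though the server most recently sent the detail page. A mechanism that assumed the last page sent was the current page would compute the wrong flow.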
Since a web server may concurrently serve a large number of users, scalability is also a concern. The amount of contextual data stored for each user must not grow too large. Since a user can in principle return at any time to a page previously displayed within the session, it is generally not safe to dispose of any contextual data until the session is ended. Contextual data could be written to disk, but this increases the complexity of the system and could affect performance.
Although servlets provide storage locations with a variety of scopes (page, request, session, and application), none of these is satisfactory by itself for page flow contextual data. The lifetime of this contextual data is longer than a single request. Storing it in the session threatens scalability, since a page flow context must be stored for each page that has been sent to the user. Even when this is tolerable, it is still necessary to identify which page flow context to use for any given request.
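The trade-off between request scope and session scope can be sketched as follows. The `Session` class and the functions `serve_page` and `simulate` are hypothetical stand-ins written for this illustration and do not correspond to any actual servlet API. Request-scoped data is modeled as a dictionary that is discarded once the response has been produced, and session-scoped data as a dictionary that persists until the session ends.

```python
class Session:
    """Hypothetical stand-in for a session-scoped store."""

    def __init__(self):
        self.attributes = {}


def serve_page(session, page_token, page_flow_context):
    """Serve one page, keeping its context in both scopes."""
    # Request scope: available only while this request is being handled.
    request_scope = {"page_flow_context": page_flow_context}
    # Session scope: retained for every page sent, since the user may return to any of them.
    session.attributes[page_token] = page_flow_context
    return request_scope


def simulate(pages_sent):
    """Send a number of pages within one session and return the session."""
    session = Session()
    for n in range(pages_sent):
        serve_page(session, "page-%d" % n, {"step": n})
    return session
```

After `simulate(100)`, the session holds one hundred contexts, one per page sent, and a later request can select the correct one only if it carries the token of the page from which it was made.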
There is, therefore, a need in the art for a system and method for the unified management of contextual information for a user interaction in an HTML interface.