The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
The World Wide Web includes a network of servers on the Internet, each of which hosts one or more HTML (Hypertext Markup Language) pages, also called “web pages.” The HTML pages associated with a server provide information and hypertext links to other documents on that and other servers. Each link includes a Uniform Resource Locator (URL) that is used to determine the location of web pages and other resources on the network.
Web servers communicate with clients by using the Hypertext Transfer Protocol (HTTP). The web servers listen for requests from clients for their HTML pages and respond by sending the requested pages to the clients. Servers may respond to browser requests by sending a static web page or by performing dynamic operations. For example, a server may respond to a request by issuing a query to a database, dynamically constructing a web page containing the results of the query, and transmitting the dynamically constructed HTML page to the requesting browser.
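By way of illustration only, the dynamic case described above may be sketched as follows. The table name `products`, its columns, the sample rows, and the function `build_results_page` are hypothetical and are not drawn from any particular server; an in-memory SQLite database stands in for the database.

```python
import sqlite3
from html import escape

def build_results_page(conn, category):
    """Query the database and build an HTML page from the results."""
    rows = conn.execute(
        "SELECT name, price FROM products WHERE category = ? ORDER BY name",
        (category,),
    ).fetchall()
    # Each result row becomes one list item in the generated page.
    items = "".join(
        f"<li>{escape(name)}: {price:.2f}</li>" for name, price in rows
    )
    return f"<html><body><h1>{escape(category)}</h1><ul>{items}</ul></body></html>"

# Hypothetical sample data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL, category TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("Pen", 1.5, "office"), ("Stapler", 7.25, "office"), ("Mug", 4.0, "kitchen")],
)

# The page is constructed at request time from the query results.
page = build_results_page(conn, "office")
```

In an actual server, the resulting page would be transmitted to the requesting browser in the body of an HTTP response.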
Users of the World Wide Web use a client program, such as a browser, to request, decode and display information from web servers. When the user of a browser selects a link on an HTML page, the browser that is displaying the page sends a request over the Internet to the server associated with the URL specified in the link. In response to the request, the server transmits the requested information to the browser that issued the request. The browser receives the information, presents the received information to the user, and awaits the next user action to initiate a new request.
To tailor a web user's experience to the user's interests, providers of web sites use one or more servers to provide an array of web pages that the user may navigate in an order determined by those interests.
HTTP, the protocol used to send web pages to clients, is a stateless protocol by design. That is, each message sent using HTTP is independent of other messages sent before or after in time. This is an advantage in many circumstances. For example, it simplifies the programming of the browser, which need not keep and manage collections of prior pages. It also reduces the computational demands on the device that hosts the browser. For example, the device need not have a large amount of storage, and the device need not consume time on its processor associating a received page with prior pages.
However, there are some disadvantages to a stateless protocol. For example, a completely stateless protocol does not support a complex transaction that obtains information from a user and generates data over several different web pages. The information generated by the server and sent to the client in response to one request is not necessarily conveyed to the server in a second request from the same user using the same browser. To circumvent the stateless nature of HTTP, web browsers have been extended in some approaches to include some information received from a web server in subsequent requests to the same or other servers.
In one approach, called “URL encoding,” the URL field in a link is extended to include characters not used to specify a particular web page. The server provides the data in the URL field, including the formal URL that does specify a web page and any extra characters. The extended URL with the extra characters is included in any user request directed to the web page identified by the formal URL within the extended URL field. The extra characters may be used to record state information that will be supplied to the server that responds to requests for the web page.
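By way of illustration only, one possible form of URL encoding may be sketched as follows. The sketch assumes that the extra characters are carried as a query string after a `?` separator, which is only one of several possible conventions; the function names, the example URL, and the state keys are hypothetical.

```python
from urllib.parse import urlencode, parse_qs

def extend_url(formal_url, state):
    """Append state information to the formal URL as extra characters."""
    return formal_url + "?" + urlencode(state)

def recover_state(extended_url):
    """Split an extended URL into the formal URL and the state it carries."""
    formal_url, _, extra = extended_url.partition("?")
    # parse_qs returns a list of values per key; keep the first of each.
    state = {key: values[0] for key, values in parse_qs(extra).items()}
    return formal_url, state

# The server places the extended URL in a link on the page it sends.
link = extend_url("http://shop.example/checkout", {"cart": "42", "step": "2"})

# When the user selects the link, the server recovers the state.
formal_url, state = recover_state(link)
```

The server that responds to the request for the formal URL thus receives the state without having retained it between requests.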
A problem with this approach is that the URL field is limited to a particular number of characters, called hereinafter the URL field character limit. For example, in some implementations of this approach, the URL field character limit is 1024 characters. The URL field character limit may prevent a substantial amount of state information involved in more complex transactions from being included in the extended URL field.
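By way of illustration only, the effect of the limit may be sketched as follows, using the example value of 1024 characters given above. The function name, the example URL, and the state contents are hypothetical.

```python
from urllib.parse import urlencode

URL_FIELD_CHARACTER_LIMIT = 1024  # example value from the text above

def fits_in_url_field(extended_url, limit=URL_FIELD_CHARACTER_LIMIT):
    """Return True if the extended URL fits within the URL field."""
    return len(extended_url) <= limit

formal_url = "http://shop.example/checkout"

# A small amount of state fits easily.
small = formal_url + "?" + urlencode({"cart": "42", "step": "2"})

# State for a more complex transaction: 100 hypothetical form fields.
large_state = {f"item{i}": "x" * 10 for i in range(100)}
large = formal_url + "?" + urlencode(large_state)

small_fits = fits_in_url_field(small)
large_fits = fits_in_url_field(large)
```

Under these assumptions the small extended URL fits within the field and the large one does not, so the larger state could not be conveyed in full by this approach.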
In another approach, the browser writes a file, often called a “cookie,” in client storage in response to receiving a web page that indicates a cookie should be written. The browser associates the cookie with the server that sent the web page. The web page sent by the server indicates the information to be placed in the cookie. The next time the browser sends a request to the same server, the contents of the cookie are included.
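By way of illustration only, the cookie exchange may be sketched as follows. The sketch assumes that the indication to write a cookie arrives as a `Set-Cookie` response header and that the contents are returned in a `Cookie` request header, as in common HTTP practice. The class `ToyBrowser`, the server names, and the cookie contents are hypothetical, and the model deliberately keeps a single cookie per server.

```python
class ToyBrowser:
    """Hypothetical browser model that keeps one cookie per server."""

    def __init__(self):
        self.cookies = {}  # server name -> cookie text

    def receive(self, server, response_headers):
        # A "Set-Cookie" response header tells the browser to write a cookie,
        # which the browser associates with the sending server.
        if "Set-Cookie" in response_headers:
            self.cookies[server] = response_headers["Set-Cookie"]

    def request_headers(self, server):
        # The stored cookie, if any, is included in the next request
        # to the same server.
        if server in self.cookies:
            return {"Cookie": self.cookies[server]}
        return {}

browser = ToyBrowser()
browser.receive("shop.example", {"Set-Cookie": "cart=42"})

first = browser.request_headers("shop.example")   # cookie is included
other = browser.request_headers("other.example")  # no cookie for this server
```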
A problem with this approach is that each cookie written by the browser for a particular server overwrites any previous cookie for that server. Thus, the web server cannot cause the browser to add information to an existing cookie while retaining other information already in the cookie. Instead, the server must accumulate the information in the cookie and send back the combined information for the new cookie. This causes the same information to be passed back and forth over the network, consuming valuable network bandwidth for redundant information. Furthermore, cookies do not differentiate between browser windows.
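By way of illustration only, the redundancy may be sketched as follows under the model described above, in which each new cookie replaces the previous one. The function names, the `key=value; key=value` serialization, and the three-step transaction are hypothetical.

```python
def serialize(cookie):
    """Render cookie contents as 'key=value; key=value' text."""
    return "; ".join(f"{key}={value}" for key, value in cookie.items())

def characters_sent(steps):
    """Count characters sent when the whole cookie is resent at each step,
    and the characters that would suffice if only new items were sent."""
    accumulated = {}
    resent = 0
    fresh = 0
    for new_items in steps:
        # The server must merge old and new information itself...
        accumulated.update(new_items)
        # ...and send the combined cookie, because the new cookie
        # overwrites the previous one.
        resent += len(serialize(accumulated))
        fresh += len(serialize(new_items))
    return resent, fresh

# A hypothetical transaction spread over three web pages.
steps = [{"name": "Ann"}, {"city": "Oslo"}, {"card": "visa"}]
resent, fresh = characters_sent(steps)
```

In this sketch the number of characters resent exceeds the number of characters of genuinely new information, and the gap widens with each additional page in the transaction.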
In conventional approaches using URL encoding and cookies, a session object is formed on a host on which the server is executing. The session object stores and manages the state information returned by the browser. The session object on the server host is associated with the browser, as identified by a network address for the host on which the browser is executing.
A problem arises when the user goes back to a previously visited web page and branches. Depending on the application supported by the server, the state information that applies at that previously visited page may be different from the state information that applies if the user proceeds forward to the same web page. For example, a user who makes two purchases using two different credit cards visits the credit card input page twice. If the user selects the back button on the browser and returns to the first credit card input page, the user would like to see the credit card information for the first credit card. That information might have been overwritten in the URL field or cookie by the credit card information that was input for the second purchase. The user would then have to re-enter the information for the first credit card, which is tedious and unnecessary.
Even if the session object retains information input for both credit cards, the information for the second credit card is not applicable at the back-visited page. Both visits are recorded in the same session object because both come from the same browser. The server often has no basis in the session object to distinguish the two visits. Both visits are in response to requests for the same web page from the same browser.
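By way of illustration only, the back-navigation problem may be sketched as follows. The class `SessionStore`, the network address, the page path, and the card placeholders are hypothetical; the sketch assumes a session object keyed only by the network address of the browser's host.

```python
class SessionStore:
    """Hypothetical server-side store with one session object per address."""

    def __init__(self):
        self._sessions = {}  # client address -> {(page, field): value}

    def record(self, client_address, page, field, value):
        session = self._sessions.setdefault(client_address, {})
        # Nothing identifies which visit to the page supplied the value,
        # so a later visit replaces the value from an earlier visit.
        session[(page, field)] = value

    def lookup(self, client_address, page, field):
        return self._sessions.get(client_address, {}).get((page, field))

store = SessionStore()
address = "192.0.2.10"

# First purchase, then second purchase, both through the same page.
store.record(address, "/card-input", "card", "card-A")
store.record(address, "/card-input", "card", "card-B")

# The user navigates back to the first credit card input page.
shown_on_back_visit = store.lookup(address, "/card-input", "card")
```

In this sketch the back-visited page is populated with the second card rather than the first, because both visits are requests for the same page from the same browser.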
A problem also arises when a user executes several instances of a browser on the same device. Each browser instance opens a different display area (called a “window”) on the user's display device. Each browser instance has the same network address and therefore appears to the server as if it belongs to the same session object as other browser instances on the same host. If the same web page is opened by the two instances, the two windows are indistinguishable on the server side. The cookie written for the second visit is likely to overwrite the cookie written for the first visit.
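By way of illustration only, the multiple-window problem may be sketched as follows. The `Request` type, the function names, the address, and the form data are hypothetical. The `window` field is known only on the client side in this model and is deliberately not used by the server.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    client_address: str  # network address of the host running the browser
    window: str          # client-side only; the server does not receive it
    page: str

def session_key(request):
    """The server identifies the session by network address alone."""
    return request.client_address

sessions = {}  # session key -> {page: form data}

def handle(request, form_data):
    sessions.setdefault(session_key(request), {})[request.page] = form_data

# Two browser instances on the same device open the same web page.
window_1 = Request("192.0.2.10", "window-1", "/card-input")
window_2 = Request("192.0.2.10", "window-2", "/card-input")

handle(window_1, {"card": "card-A"})
handle(window_2, {"card": "card-B"})

same_session = session_key(window_1) == session_key(window_2)
```

In this sketch both windows resolve to a single session object, and the data recorded for the second window replaces the data recorded for the first.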
Based on the foregoing, there is a clear need for techniques that manage state information at a server and that are robust in the presence of long navigational histories, backwards navigation and branching, and navigation in multiple browser windows during a communications session with a client.