The World Wide Web (WWW) comprises an expansive network of interconnected computers upon which businesses, governments, groups, and individuals throughout the world maintain inter-linked computer files known as web pages. Users navigate these pages by means of computer software programs commonly known as Internet browsers. Due to the vast number of WWW sites, many web pages contain redundant information or closely resemble one another in function or title. The vastness of the unstructured WWW causes users to rely primarily on Internet search engines to retrieve information or to locate businesses. These search engines use various means to determine the relevance of the retrieved information to a user-defined search.
The authors of web pages provide information, known as metadata, within the body of the hypertext markup language (HTML) document that defines the web pages. A computer software product known as a web crawler systematically accesses web pages by sequentially following hypertext links from page to page. The crawler indexes the pages for use by the search engines using information about a web page as provided by its address or Uniform Resource Locator (URL), metadata, and other criteria found within the page. The crawler is run periodically to update previously stored data and to append information about newly created web pages. The information compiled by the crawler is stored in a metadata repository or database. The search engines search this repository to identify matches for the user-defined search rather than attempt to find matches in real time.
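The crawl-and-index cycle described above can be illustrated with a minimal sketch. It is not the implementation of any particular search engine. The in-memory PAGES dictionary stands in for real network fetches, and the names LinkAndMetaParser, crawl, and search are hypothetical, introduced only for illustration.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML source (assumption: replaces HTTP fetches).
PAGES = {
    "http://a.example/": (
        '<html><head><title>Home</title>'
        '<meta name="keywords" content="shoes, boots"></head>'
        '<body><a href="http://a.example/sale">Sale</a></body></html>'
    ),
    "http://a.example/sale": (
        '<html><head><title>Sale</title></head>'
        '<body><a href="http://a.example/">Home</a>'
        '<a href="http://a.example/missing">Gone</a></body></html>'
    ),
}


class LinkAndMetaParser(HTMLParser):
    """Collects hyperlinks, meta tag values, and the title of one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(start_url, pages):
    """Follow links page to page and build a metadata repository."""
    repository = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # unreachable page: nothing to index
            continue
        parser = LinkAndMetaParser()
        parser.feed(html)
        repository[url] = {"title": parser.title.strip(), "meta": parser.meta}
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return repository


def search(repository, term):
    """Match a search term against the stored repository, not the live pages."""
    term = term.lower()
    return sorted(
        url
        for url, record in repository.items()
        if term in record["title"].lower()
        or any(term in value.lower() for value in record["meta"].values())
    )
```

In this sketch, calling crawl("http://a.example/", PAGES) once and then calling search on the returned repository mirrors the separation between periodic crawling and query-time lookup.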
A typical search engine has an interface with a search window where the user enters an alphanumeric search expression or keywords. The search engine sifts through available web sites for the user's search terms, and returns the search results in the form of HTML pages. Each search result includes a list of individual entries that have been identified by the search engine as satisfying the user's search expression. Each entry or “hit” includes a hyperlink that points to a Uniform Resource Locator (URL) location or web page.
In addition to the hyperlink, certain search result pages include a short summary or abstract that describes the content of the URL location. Typically, search engines generate this abstract from the file at the URL, and only provide acceptable results for URLs that point to HTML format documents. For URLs that point to HTML documents or web pages, a typical abstract includes a combination of values selected from HTML tags. These values may include text from the web page's “title” tag, from what are referred to as “annotations” or “meta tag values” such as “description,” “keywords,” etc., from “heading” tag values (e.g., H1, H2 tags), or from some combination of the content of these tags.
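As an illustration of how such an abstract might be assembled, the following sketch combines the title, the “description” meta tag value, and H1/H2 heading text. The selection rule (prefer the description, fall back to headings), the " | " separator, the max_len limit, and the names AbstractParser and build_abstract are all assumptions made for this example and do not describe any particular search engine.

```python
from html.parser import HTMLParser


class AbstractParser(HTMLParser):
    """Collects title, heading, and meta tag values from an HTML document."""

    TEXT_TAGS = {"title", "h1", "h2"}

    def __init__(self):
        super().__init__()
        self.values = {"title": [], "h1": [], "h2": []}
        self.meta = {}
        self._current = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.TEXT_TAGS:
            self._current = tag
            self.values[tag].append("")
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.values[self._current][-1] += data


def build_abstract(html, max_len=160):
    """Assumed rule: title, then description if present, else H1/H2 headings."""
    parser = AbstractParser()
    parser.feed(html)
    parts = [t.strip() for t in parser.values["title"] if t.strip()]
    description = parser.meta.get("description", "").strip()
    if description:
        parts.append(description)
    else:
        headings = parser.values["h1"] + parser.values["h2"]
        parts.extend(h.strip() for h in headings if h.strip())
    return " | ".join(parts)[:max_len]
```

A document that lacks all of these tags yields an empty abstract under this rule, which is consistent with the observation that acceptable results are generally limited to HTML format documents.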
With its links to computers and computer networks throughout the world, the Internet offers nearly limitless access to information. Much of the information is public and is available to all users. Other information is private and access must be limited. However, the same interconnectivity that makes information so readily available places a special burden on those systems involved in the exchange or storage of private information. This security aspect is of particular importance in the face of widespread hacking, i.e., using computers to gain unauthorized access to other computer systems and to actively steal, destroy or otherwise corrupt electronic information. Tight security is also warranted in the case of multi-user facilities where a single computer terminal is accessible to many clients.
As an example, businesses that ply their trade on the Internet (ebusinesses) must rely on client-server interchanges over the Internet rather than more traditional face-to-face or voice interactions. This interchange between the client and the server, occurring between client log-on and log-off, may be viewed as a business transaction, with intrinsic benefits and risks to both the ebusiness and the client. To minimize the risk and maximize the benefits associated with the transaction, the information exchanged between the client and ebusiness server must remain secure. In particular, the ebusiness must implement a secure user log-on and log-off facility for the exchange of this non-public information.
In an ebusiness transaction, users seeking to access private information typically begin their transaction by first logging into a standard log-in facility. At this point they can access the secure information by providing a password or other information to the ebusiness server that identifies them as having legitimate access to given information. Ideally, the client would exchange information with the ebusiness and then log off expressly, ending the secure connection. In reality, there may be periods when the client is completely inactive but remains connected, perhaps while distracted. There may be other times when the client chooses to access another web site that is not secure. In the case of a multi-user facility such as a kiosk, the client may inadvertently leave without ending their session by logging off. In each of these cases the results are the same:
1. The client remains connected to the site even if not actively using it.
2. The client becomes prone to the theft or corruption of electronic information.
3. The ebusiness expends valuable resources maintaining a secure connection that is either under-utilized or un-utilized.
4. If the user goes to another site and then shortly thereafter returns back to the secure site, the user might not be able to reconnect before the previous session has expired or timed out.
Periods of inactivity are unavoidable but represent a real threat to the security of the transaction. The difficulty in solving the problem lies in determining how and when a non-uniform and largely unpredictable secure session should be terminated.
The problem is further complicated by the structure of the Internet and the World Wide Web. The Web's Hypertext Transfer Protocol (HTTP) is stateless, meaning that all requests for information are equivalent. No information about the client is retained from previous sessions or even within the current session. This leaves the servers with no intrinsic information about clients or the information they have requested.
Ebusinesses have attempted to mitigate and even solve this problem primarily through the use of cookies. Other implementations include embedding user information in a hidden location, or using CORBA/IIOP and Java RMI. Cookies are information placed on the hard disk of the client by the server to identify the user and store pertinent information about them. Typically, cookies are given a finite lifetime. In the context of secure Internet transactions, these cookies are used to log off the client after a specified period of inactivity, and represent a second primary type of log-off, with the first being the user-initiated express log-off.
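The two types of log-off described above, express log-off and log-off after a fixed period of inactivity, can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular ebusiness system: the class SessionStore, the cookie name "session", the 900-second default, and the server-side table keyed by the cookie's token are all hypothetical choices made for this example.

```python
import secrets
import time
from http.cookies import SimpleCookie


class SessionStore:
    """Tracks the last activity per session token (assumed server-side table)."""

    def __init__(self, idle_timeout=900.0, clock=time.monotonic):
        self.idle_timeout = idle_timeout  # fixed inactivity limit, in seconds
        self._clock = clock               # injectable so the sketch can be tested
        self._sessions = {}               # token -> (user, time of last activity)

    def log_on(self, user):
        token = secrets.token_hex(16)
        self._sessions[token] = (user, self._clock())
        return token

    def cookie_header(self, token):
        """Build a Set-Cookie value whose lifetime equals the inactivity limit."""
        cookie = SimpleCookie()
        cookie["session"] = token
        cookie["session"]["max-age"] = int(self.idle_timeout)
        cookie["session"]["secure"] = True
        cookie["session"]["httponly"] = True
        return cookie["session"].OutputString()

    def touch(self, token):
        """Return the user if the session is still live; otherwise log it off."""
        entry = self._sessions.get(token)
        if entry is None:
            return None
        user, last_seen = entry
        now = self._clock()
        if now - last_seen > self.idle_timeout:
            del self._sessions[token]  # second type of log-off: inactivity
            return None
        self._sessions[token] = (user, now)  # activity restarts the timer
        return user

    def log_off(self, token):
        """First type of log-off: user-initiated and express."""
        self._sessions.pop(token, None)
```

Note that the inactivity limit in this sketch is a single value applied to every user, regardless of what the user is doing or where the user has navigated.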
The cookie solution for terminating sessions has several shortcomings. The fixed length of time for inactivity may result in log-offs before the client is ready to end the session. In the event that the user fails to expressly log off, the continued session may result in unauthorized access by other users. This may also extend to hackers who may take advantage of the lengthy connection times to gain access to private information.
There is currently no adequate mechanism by which inactive, secure Internet sessions can be terminated in an optimal way. The use of cookies is self-limiting and inflexible, treating all users in the same manner. There is currently no means of detecting a situation where a user may be endangering a secure transaction or private information by selecting a non-secure website while logged into a secure site. The need for such a mechanism has heretofore remained unsatisfied.