1. Field of the Invention
This invention relates to web browser and server technology, and especially to web browsing technologies which prioritize and key browsing capabilities to a user's personal interests.
2. Description of the Related Art
The Internet and the World Wide Web have become critical, integral parts of commercial operations, personal lives, and the education process. At the heart of the Internet are web browser technology and Internet server technology.
An Internet server contains “content” such as documents, image or graphics files, forms, audio clips, etc., all of which are available to systems and browsers which have Internet connectivity.
Web browser or “client” computers may request documents from web addresses, to which appropriate web servers respond by transmitting one or more web documents, image or graphics files, forms, audio clips, etc. The most common protocol for transmission of web documents and contents from servers to browsers is the Hypertext Transfer Protocol (“HTTP”).
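The request half of this exchange can be sketched in a few lines: a browser sends the server a short text message naming the document it wants. The host and path below are made-up examples; this builds only the request text and does not contact any server.

```python
def build_http_get(host, path="/"):
    """Build the text of a minimal HTTP/1.0 GET request for one document."""
    return (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "\r\n"  # blank line marks the end of the request headers
    )

request = build_http_get("www.example.com", "/index.html")
print(request.splitlines()[0])  # the request line sent to the server
```

The server's response to such a request is the document itself, preceded by its own status line and headers.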
FIG. 1 shows the fundamental client-server arrangement of Internet and intranet communications. A client browser computer (1) is provided with Internet access (2) to the World Wide Web (3) through common means such as a dial-up telephone line and modem, cable modem, or local area network (“LAN”). The web browser computer (1) is also provided with appropriate web browsing software, such as Netscape's Navigator or Microsoft's Internet Explorer. A web server computer (5) is likewise provided with Internet access (4) to the World Wide Web (3) using similar means, or higher-bandwidth means such as T1 and T3 data lines, and a web server suite of software. Alternatively, clients and servers may be interconnected via an intranet (6), such as a corporate LAN. These arrangements are well known within the art.
The most common type of Internet content or document is the Hypertext Markup Language (“HTML”) document, but other formats are equally well known in the art, such as Adobe Portable Document Format (“PDF”). HTML, PDF and other web documents provide “hyperlinks” within the document, which allow a user to select another document or web site to view. Hyperlinks are specially marked text or areas in the document which, when selected by the user, command the browser software to retrieve or fetch the indicated document.
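In an HTML document, a hyperlink is simply an anchor tag whose href attribute names the target document. The following sketch shows how browser software might collect the hyperlinks present on a page; the page content is an invented example.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href targets of all anchor (<a>) tags in an HTML page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

page = '<html><body><a href="/help.html">Help</a> <a href="/news.html">News</a></body></html>'
collector = LinkCollector()
collector.feed(page)
print(collector.links)  # ['/help.html', '/news.html']
```

A browser maintains such a list so that, when the user selects a marked region of the page, it knows which document address to fetch next.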
Ordinarily, when the user selects a plain hyperlink, the current page being displayed in the web browser's graphical user interface (“GUI”) window disappears and the newly received page is displayed. If the parent page is an index, for example the IBM web site www.patents.ibm.com, and the user wishes to visit each descending link (e.g. read the document with tips on how to use the site), then the parent or index page disappears and the new page is displayed (such as the help page).
As the computing capacity of web browser computers and the communications bandwidth to those computers increase dramatically, one challenge for organizations which provide Internet web sites and content is to deliver and filter such content in anticipation of these greater processing and throughput speeds.
This is particularly true in the realm of web-based applications, and in the development of better and more efficient ways to move user-pertinent information to the desktop or client.
However, today's web browsers are, in general, unintelligent software packages. As they currently exist, they require the user to manually search for any articles or documents of interest to him or her, and are often cumbersome in that they frequently require the download of many documents before one of germane interest is found.
Search engines provide some level of “intelligence” to the browsing experience, wherein a user may point his unintelligent web browser to a search engine address, enter some keywords for a search, and then review each of the returned documents one at a time by selecting hyperlinks in the search results, or by re-pointing the web browser manually to provided web addresses. However, search engines do not really search the entire Internet; rather, they search their own indices of Internet content, which have been built by the search engine operators, usually through a process of reviewing manual submissions from other web site operators. Thus, it is common for a user to use several search engines while looking for information on a particular subject, because each search engine will return different results based on its own index content.
To partially address this problem, two other technologies have been developed and are well-known in the art. The first technology is known as a “metasearch engine”, which is a search engine of search engines. A metasearch engine does not keep its own index, but rather submits a query to multiple search engines simultaneously, and returns to the user the highest ranked returns from each of the search engines. While this is more useful than manually visiting each of the queried search engines in turn, the results are typically less satisfying than would be expected. Commonly, the top few returns on a list of ranked matches to the search keywords are not the most interesting, and so more often than not, a user visits the sites listed towards the middle or end of the return list. The metasearch engine may, for example, return only the top five listings from each of four search engines, thereby filtering out the more likely interesting information.
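The merging behavior just described can be sketched as follows: only each engine's top few returns survive the merge, so the mid-list results a user often finds most relevant are discarded. The engine names and result identifiers below are invented for illustration.

```python
def metasearch(results_by_engine, top_n=5):
    """Merge ranked result lists from several search engines, keeping only
    each engine's top_n entries and removing duplicates (first seen wins)."""
    merged, seen = [], set()
    for engine, results in results_by_engine.items():
        for url in results[:top_n]:  # anything past top_n is filtered out
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

results = {
    "engine_a": ["a1", "a2", "a3", "shared", "a5", "a6_discarded"],
    "engine_b": ["shared", "b2", "b3", "b4", "b5", "b6_discarded"],
}
print(metasearch(results))
```

Note that the mid-list entries "a6_discarded" and "b6_discarded" never reach the user, even though such results are frequently the ones of genuine interest.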
The second attempt at solving this problem is the web “crawler” engine. These servers periodically contact other servers to “re-index” previously indexed web site content, which tends to keep their indices more up-to-date and incorporates any newly available information on a web site. However, since thousands of new web sites are brought on-line each day, it is practically impossible for a crawler to visit every new site. So, even web crawlers may not provide full coverage of Internet content.
Other attempts, including the creation of a “community of intelligent agents”, the use of server-based interactive sorting and filtering, a client-side “intelligent assistant” triggered by encountering special tags within a web document, and automatic “bookmark” functions, have been proposed in various U.S. patents. The related application provides a discussion of these technologies and methods. In general, all of these proposed technologies and methods require some amount of server-side and client-side cooperation, making it difficult to deploy these technologies on a wide scale.
Several years ago, client-side technology was introduced to download all web pages within one hyperlink of the web page currently loaded by the browser. By gathering all the directly linked documents from the currently visited page, whichever one the user next selected would be immediately available from a cache in local memory, thereby eliminating the wait for the newly selected page to be transmitted from the server to the web browser. By the time the user finished reading the next page (now the current page) and selected a subsequent document, the subsequent document had already been cached so that it, too, could be displayed without transmission delay. However, this process has shortcomings when visiting a “link-rich” web page. For example, a web page of a popular news site may have over 60 directly linked documents from the news service's home page. Thus, the communications network serving the web browser computer may pose a bottleneck or time-limiting factor for loading all 60 directly linked documents while the user reads the home page, and before the user selects a hyperlink on the home page. As such, only a few of these directly linked pages might be successfully downloaded in the time that it takes the reader to peruse the home page and to make a decision on the next document to view. Unfortunately, the pages that were successfully downloaded during the review of the home page may be of no interest to the user, as the downloading function has no means for sorting or determining which pages may or may not be of interest.
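This prefetching behavior, and its bottleneck on link-rich pages, can be sketched as follows: the cache fills in page order only as far as the connection allows before the user's next selection, and nothing orders the links by likely interest. The fetch function and the numeric budget below are hypothetical stand-ins for the network and its bandwidth.

```python
def prefetch(links, fetch, budget):
    """Download linked pages into a local cache, in page order, until the
    budget (pages fetchable before the user's next click) is exhausted.
    Note: no notion of which links actually interest the user."""
    cache = {}
    for url in links:
        if len(cache) >= budget:
            break  # bandwidth bottleneck: the remaining links never arrive
        cache[url] = fetch(url)
    return cache

links = [f"/story{i}.html" for i in range(60)]  # a link-rich home page
cache = prefetch(links, fetch=lambda url: f"<contents of {url}>", budget=4)
print(sorted(cache))  # only the first few links were cached
```

The page the user actually selects next may well be /story59.html, which under this scheme is never in the cache; this is precisely the shortcoming the predictive method described below addresses.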
The related application disclosed a system and method for configuring a web browser system to include a list of interest terms for a user. This method provided a list of the user's most sought-after keywords, the list being available to other software programs on the same client web browser computer.
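Such an interest list makes a simple predictive ordering possible: hyperlinks whose anchor text matches more of the user's interest terms can be retrieved and cached first. A minimal sketch under that assumption follows; the interest terms and link texts are invented examples, not part of the related application's disclosure.

```python
def rank_links_by_interest(links, interest_terms):
    """Order (url, anchor_text) pairs by how many of the user's interest
    terms appear in the anchor text, most matches first."""
    terms = {t.lower() for t in interest_terms}

    def score(link):
        url, text = link
        return len(set(text.lower().split()) & terms)

    return sorted(links, key=score, reverse=True)

links = [
    ("/weather.html", "Local weather report"),
    ("/patents.html", "New software patents granted"),
    ("/sports.html", "Sports scores"),
]
ranked = rank_links_by_interest(links, ["patents", "software"])
print(ranked[0][0])  # the patents story ranks first
```

A prefetching function working from such a ranked list would spend its limited bandwidth on the documents most likely to be selected, rather than simply caching links in page order.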
Therefore, there is a need in the art for a web browsing method and system which predictively retrieves information from computer network servers and distributed databases, such as the World Wide Web, based upon a user's list of interest terms or keywords. Further, there is a need in the art for this new system and method to be compatible with widely-used web browsing platforms, such as personal computers, web-enabled telephones, Internet appliances, personal digital assistants, and pocket PCs, with minimal or no server-side support or cooperating technology. Additionally, there is a need in the art for a system and method to highlight predictively cached information, or links to such information, on a user's display such that the user may easily and quickly view the predictively cached information.