The invention relates generally to computers and the world wide web, and deals more specifically with a server which identifies new web pages of interest to a user.
The world wide web (WWW) comprises a multitude of computer servers, respective databases and a network by which clients can communicate with the servers and request and load the data. Each of the clients includes a "web browser" which is an interface to the user and to the WWW. A server may directly manage its own database and access other, remote databases on behalf of a client user.
The server presents the data to the client as "web pages" and each web page has a "URL" address which comprises an access method/protocol such as http as a prefix, a server name and the requested data as a suffix. The server name typically includes a "domain name" which is the name of a company, educational institution or other organization that owns the server. The request indicates a web page associated with the server. There are different ways that a client can access a web page. If the client knows the URL, the client can directly request the web page from the server. However, if the client only knows the server name, the client can address the server name and in response, the server will present the "home page" for the server. The home page (and other web pages) typically includes tags or "hot links" which reference the associated web pages. When the user selects a hot link, the web browser requests the respective web page from the corresponding server.
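By way of illustration only, the three parts of a URL described above (access method/protocol, server name, and requested data) can be separated with a standard URL parser. The following is a minimal sketch using Python's standard library; the function name `describe_url` and the `example.com` address are hypothetical and not part of the described system.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the three parts named in the text (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,          # access method, e.g. "http"
        "server_name": parts.hostname,     # includes the domain name
        # An empty path means only the server was addressed, so the
        # server would respond with its home page.
        "requested_data": parts.path or "/",
    }


info = describe_url("http://www.example.com/recreation/golf.html")
home = describe_url("http://www.example.com")
```

When only the server name is given, the requested-data part is empty, which corresponds to the case in which the server presents its home page.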
Today, there is a vast amount of data on the WWW. Many servers and thousands of web pages are added daily. This presents a problem in identifying the web pages of interest to a user. For those web pages for which the user does not know at least the server name, there are different techniques to "surf the net", i.e. identify the server and/or web page. Different types of search engines are currently available for this purpose. A Yahoo (tm) server includes a catalog search engine, whereas Alta Vista (tm) and Web Crawler (tm) servers include key word search engines.
Key word search engines function by searching various databases for key words (with logical connectors) specified by the user. For example, if a user is interested in golf, the client can enter the word "golf" as a search word. The search engine then identifies those web pages which include the word "golf", and the web browser typically presents a list of the identified web pages to the client. This list includes a title of each web page and a hot link to access the respective web page. If the client selects the hot link, the server sends the respective web page to the client. While key word searches are easy to initiate, they tend to yield a large number of web pages which are not of interest to the client.
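A key word search with logical connectors can be sketched, purely as an illustration, with an inverted index that maps each word to the set of pages containing it. The page addresses, page text, and the names `build_index` and `search` are all hypothetical; this is not presented as how any particular search engine is implemented.

```python
import re
from collections import defaultdict

# Hypothetical pages; the URLs and text are invented for illustration.
PAGES = {
    "http://example.com/golf-clubs": "Golf clubs and golf balls for sale",
    "http://example.com/vw-golf": "Volkswagen Golf car review",
    "http://example.com/tennis": "Tennis rackets and balls",
}


def build_index(pages):
    """Map each lower-cased word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, words, connector="AND"):
    """Combine per-word matches with a logical AND or OR connector."""
    matches = [index.get(word.lower(), set()) for word in words]
    if not matches:
        return set()
    if connector == "AND":
        return set.intersection(*matches)
    return set.union(*matches)


index = build_index(PAGES)
golf_pages = search(index, ["golf"])
golf_ball_pages = search(index, ["golf", "balls"], connector="AND")
```

In this sketch the single word "golf" also matches a page about a car, which illustrates why key word searches tend to return pages that are not of interest; adding a second word with an AND connector narrows the result.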
Catalog servers initially define a hierarchy of subject categories or topics as illustrated in FIG. 1. Most of the categories also include hot links to web pages relating to the subject of the respective category. The client then "navigates" down through the hierarchy to identify a category of interest and then can select a hot link to obtain a web page containing actual data within the same subject. The catalog server obtains its web pages from other databases or human editors, and periodically adds them to the appropriate category in the hierarchy.
One such type of catalog server is provided by Yahoo Corporation. The Yahoo (tm) server presents a list of several broad categories 42 such as recreation, arts, business, science, education etc. which are at the top of the hierarchy and encompass all the data within the Yahoo database. The client can select one of these broad categories of interest such as recreation. In response, Yahoo presents several subcategories of recreation such as aviation, animals, amusements, games and sports. Then, the client selects one of the subcategories such as games, and in response, Yahoo presents several sub-subcategories such as air hockey, billiards etc. Each category (top, sub, sub-sub, etc.) in the hierarchy includes hot links for associated web pages. Naturally, as the hierarchy is traversed further and further downward, the associated web pages become more and more specific. The disadvantage of a hierarchical search is the time and skill required to navigate through the hierarchy, but this is compensated by the relevance of the ultimate web page selections.
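The catalog hierarchy and the navigation described above can be sketched as a nested data structure in which each category holds its own hot links and its subcategories. The category names follow the example in the text; the hot link addresses and the helper names `navigate` and `subcategories` are hypothetical and are not taken from any actual catalog server.

```python
# Each category holds hot links plus child categories. The URLs are invented.
CATALOG = {
    "links": [],
    "children": {
        "recreation": {
            "links": ["http://example.com/recreation-overview"],
            "children": {
                "games": {
                    "links": ["http://example.com/games-index"],
                    "children": {
                        "air hockey": {"links": ["http://example.com/air-hockey"], "children": {}},
                        "billiards": {"links": ["http://example.com/billiards"], "children": {}},
                    },
                },
                "sports": {"links": ["http://example.com/sports"], "children": {}},
            },
        },
        "arts": {"links": ["http://example.com/arts"], "children": {}},
    },
}


def navigate(catalog, path):
    """Walk down the hierarchy one selected category at a time."""
    node = catalog
    for name in path:
        node = node["children"][name]  # raises KeyError for an unknown category
    return node


def subcategories(node):
    """List the choices the server would present at this level."""
    return sorted(node["children"])


games = navigate(CATALOG, ["recreation", "games"])
billiards = navigate(CATALOG, ["recreation", "games", "billiards"])
```

Each step of `navigate` corresponds to one client selection, so reaching a specific category such as billiards requires one selection per level of the hierarchy.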
Various program tools are available today to facilitate the process of identifying new data of interest to a user. For example, an IBM Web Browser Intelligence tool is executed on the client computer and keeps track of which URLs/web pages the user previously accessed. The user can request to see any recent changes to these URLs/web pages. In response, the tool monitors these particular URLs for changes and notifies the user.
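The text does not state how such a tool detects changes. One possible approach, shown here only as a sketch under that assumption, is to store a digest of each page when the user accesses it and later compare it with a digest of the current content. The class name `ChangeTracker`, its methods, and the sample pages are hypothetical; page content is supplied directly rather than fetched over a network.

```python
import hashlib


def fingerprint(content):
    """Return a SHA-256 digest of the page content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


class ChangeTracker:
    """Remembers previously accessed URLs and reports which have changed."""

    def __init__(self):
        self._seen = {}  # URL -> digest recorded at the last visit

    def record_visit(self, url, content):
        self._seen[url] = fingerprint(content)

    def changed_urls(self, current_pages):
        """Compare stored digests with current content (a dict of URL -> text)."""
        return sorted(
            url
            for url, digest in self._seen.items()
            if url in current_pages and fingerprint(current_pages[url]) != digest
        )


tracker = ChangeTracker()
tracker.record_visit("http://example.com/golf", "Golf scores, week 1")
tracker.record_visit("http://example.com/news", "Club news")

changed = tracker.changed_urls({
    "http://example.com/golf": "Golf scores, week 2",  # content was updated
    "http://example.com/news": "Club news",            # unchanged
})
```

Note that a tracker of this kind reports changes only to pages the user has already accessed, which is consistent with the limitation noted in the text.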
While the foregoing tools are effective in identifying some new data of interest to the user, further improvements are desirable to identify other new data of interest to the user.
Accordingly, a general object of the present invention is to provide a computer system which identifies additional and more relevant data of interest to the user than the foregoing tools.