This invention relates to an information filtering device which selects the pieces of information that conform to a search condition from a plurality of distributed pieces of information and provides users with the selected pieces of information.
The present invention relates to an information filtering method of selecting the pieces of information that conform to a search condition from a plurality of distributed pieces of information and providing users with the selected pieces of information.
Furthermore, the present invention relates to an information recording medium in which a program capable of effecting such information filtering has been stored.
In recent years, use of the Internet has spread remarkably. Users can access information stored on computers scattered all over the world, provided that those computers are connected to the Internet.
With WWW (World Wide Web), use of HTTP enables users to easily access pieces of information dispersed all over the world with the help of GUI (Graphical User Interface)-based browsers.
In WWW, a software program called httpd is used on a particular computer. At the request of another computer, the program transfers hypertext files written in HTML (HyperText Markup Language) stored in the database of the computer.
Computers connected to the Internet can read the specified file by giving the address of the hypertext file to the httpd running on a computer that has the hypertext file to be transferred.
Because an HTML description writes the address of each linked file into the hypertext file as link information, browsers complying with the HTTP protocol can follow these links and display the hypertext files managed by each httpd.
When browsers are provided with the function of outputting various types of data, including sound, still pictures, and moving pictures, they can display hypertexts that include multimedia data items.
The mechanism of WWW enables users to access pieces of information scattered over the Internet easily, which has prompted many individuals and companies to open their hypertext files (or Web pages) to the public.
WWW has no supervisor of Internet databases. Users create and modify their Web pages whenever they like. Since the number of Web pages is very large (the number of Web pages open to the public all over the world at the beginning of 1996 was estimated at 40,000,000), it is difficult for users to find where their desired Web pages are (or to determine what URL addresses to specify to acquire the necessary Web pages).
To cope with this problem, systems for searching accessible Web pages on a content basis have recently been developed, and Web page searching services have become available.
With Web search servers, users can search for Web pages including a specific keyword by specifying the keyword. The user searches for the necessary Web pages using the Web search servers.
Although use of such Web search servers enables users to make an on-line search for the necessary information easily, this is limited only to the case where the user has specified the necessary information for search.
Specifically, if the user has not issued a search instruction, the user will not learn that newly created information of interest exists, no matter how important that information is.
Therefore, a system that notifies the relevant user of the existence of interesting information is needed. In conventional database systems, such a function is called SDI (Selective Dissemination of Information).
With SDI, users register in the system beforehand, as personal profiles, the keywords used to select pieces of interesting information.
When a new data item has been registered, the system compares the data item with the keywords (the profile). If the data item matches the keywords, the system informs the user who registered the profile that the desired information has been newly produced.
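The SDI matching step described above can be sketched as follows. This is only a minimal illustration, assuming that a profile is a set of keywords per user and that a data item matches when at least one keyword occurs in its text; the names `Profile` and `matching_users` and the sample data are hypothetical and are not taken from any particular SDI system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """A user's registered keywords (hypothetical structure)."""
    user: str
    keywords: frozenset


def matching_users(new_item_text, profiles):
    """Return the users whose profile has at least one keyword in the new item.

    Assumption: matching is a case-insensitive substring test.
    """
    text = new_item_text.lower()
    return [
        p.user
        for p in profiles
        if any(k.lower() in text for k in p.keywords)
    ]


# Illustrative profiles registered beforehand.
profiles = [
    Profile("alice", frozenset({"railroad", "locomotive"})),
    Profile("bob", frozenset({"baseball"})),
]

# A newly registered data item is compared against every profile.
notified = matching_users("New steam locomotive exhibit opens", profiles)
print(notified)  # ['alice']
```

The sketch presupposes that the system already knows which data item is new, which is exactly the condition that holds for a conventional, centrally supervised database.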
With such a conventional database, since individual data items exist in a local environment or are supervised by a specific database supervisor, it is easy to distinguish the newly produced data item from the existing data items.
With WWW, however, each user can register his or her own Web pages and there is no supervisor who controls the entire WWW. It is therefore very difficult to distinguish a new data item from the existing data items. To solve this problem, various information filtering devices have recently been proposed.
In the case of superdistributed document databases for which no standardized management rules for document registration, update, and deletion have been determined, the following three types of interesting data items can be considered.
Each user who publicizes information opens pieces of information to the public in the form of hypertext documents. Let's call such a set of documents a site.
(1) A site that always offers information that users are very interested in. As soon as any change or update has been made, users want to receive notice of the change or update. For example, a site run by a railroad lovers' association falls into this category. Let's call such a site a watch site.
(2) A site where Web pages are frequently updated. The information that users are interested in is not always registered. If pieces of information that users have interest in are present, they want to receive notice of their existence. A site that carries newspaper articles or magazine articles falls into this category. Let's call such a site a news site.
(3) A site whose address users don't know, in contrast to the above two types of sites, whose addresses they know. Users want to receive notice from such a site only when a data item that they are interested in has been registered. Let's call such a site an unspecified site.
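The notification behavior implied by the three site types above can be sketched as follows. This is a simplified reading of the descriptions, not a definitive specification; the names `SiteType` and `should_notify` and the two boolean inputs are assumptions introduced for illustration.

```python
from enum import Enum


class SiteType(Enum):
    WATCH = "watch"              # address known; every change is of interest
    NEWS = "news"                # address known; only matching items are of interest
    UNSPECIFIED = "unspecified"  # address unknown; must be discovered by search


def should_notify(site_type, page_changed, matches_condition):
    """Decide whether a user should be notified about a page.

    page_changed:      the page was added or updated since the last check
    matches_condition: the page content satisfies the user's search condition
    """
    if not page_changed:
        return False
    if site_type is SiteType.WATCH:
        # Any change or update on a watch site triggers a notice.
        return True
    # News and unspecified sites notify only on matching content. They differ
    # in how pages are found (known address versus search), which this
    # sketch does not model.
    return matches_condition


print(should_notify(SiteType.WATCH, True, False))  # True
print(should_notify(SiteType.NEWS, True, False))   # False
```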
Even with the recently proposed information filtering devices, it is impossible to select appropriate pieces of information efficiently by switching suitably between those three types of sites and to provide users with the selected pieces of information.
When a user has interest in a plurality of topics, the same site or sites and the same search condition are applied equally to all of the topics. Therefore, it can hardly be said that efficient information filtering is being done.
To overcome this shortcoming, when a user has interest in a plurality of topics, it is necessary to set, topic by topic, not only the search condition under which pieces of information are selected but also which sites have information on the topic.
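A per-topic setting of this kind could be represented, for example, as shown below. This is a hedged sketch only: the class `TopicProfile`, its field names, the helper `sites_for`, and the `example.org` addresses are all hypothetical, and the search condition is assumed to be a simple list of keywords.

```python
from dataclasses import dataclass, field


@dataclass
class TopicProfile:
    """Search condition and target sites for one topic (hypothetical layout)."""
    topic: str
    condition: list                                   # keywords for this topic only
    watch_sites: list = field(default_factory=list)   # notify on any change
    news_sites: list = field(default_factory=list)    # notify on matching items
    search_unspecified: bool = False                  # also look for unknown sites


def sites_for(profile):
    """List the known sites to visit for one topic, tagged with their type."""
    return ([(url, "watch") for url in profile.watch_sites]
            + [(url, "news") for url in profile.news_sites])


# Each topic carries its own condition and its own sites.
profiles = [
    TopicProfile("railroads", ["steam", "locomotive"],
                 watch_sites=["http://rail.example.org/"],
                 search_unspecified=True),
    TopicProfile("baseball", ["baseball"],
                 news_sites=["http://news.example.org/sports/"]),
]

for p in profiles:
    print(p.topic, p.condition, sites_for(p))
```

Keeping the condition and the site lists together in one record per topic means that adding a topic does not widen the search applied to the other topics.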