The development of the Internet has created an unprecedented opportunity to collect and disseminate information. For example, news articles from hundreds of sources, including major news organizations such as Reuters and United Press International as well as large numbers of small, local newspapers, are accessible through the Internet. In fact, virtually anyone can disseminate virtually any information content through the Internet. A disadvantage of this situation is that a user desiring to research a particular topic of interest must navigate through possibly thousands of information items (e.g. web pages, news articles, downloadable documents, etc.) in order to locate the few items that actually contain information relevant to their purposes. The difficulty of separating relevant from irrelevant information has traditionally been the primary impediment to the use of the Internet for serious research.
Numerous search tools have been developed to facilitate the identification of relevant information items on the Internet. These search tools use various search strategies, such as keywords, Boolean operators, and syntactic analysis. Most of these strategies calculate some form of “relevancy score”, which attempts to rate the “goodness of match” between an information item and the search criteria provided by the user.
When used by a skilled researcher, the known Internet search tools can identify and retrieve information items that are highly relevant to the topic of interest. In this respect, the term “skilled researcher” refers to a person skilled in the use of the search tool(s) in question. This imposes a limitation, in that successful use of the most sophisticated search engines, which are capable of generating the best search results, requires a skill level beyond that of most users. In many cases, the user will be an expert in a field related to the information they are searching for, rather than in the techniques needed to find that information. In order to overcome this limitation, various commercial search services (for example, Factiva™ and Dialog™) provide research consultants, who assist a user in developing the criteria needed to produce the desired search results. However, these research consultants can dramatically increase the cost of using the search service, which is undesirable.
Another limitation of known search tools is that they tend to produce the best results when the information of interest to the user can be narrowly defined. This enables highly targeted searches to be designed, and assists in identifying relevant information items. However, in some cases, a user may not be able to provide a narrow indication of what they are looking for. For example, a public health official may be interested in published news articles which refer to any infectious disease, or a class of diseases. Such a broadly defined field of interest will almost inevitably yield a great many news articles, most of which will be of no particular interest to the user.
Furthermore, once an information item of interest is found, the user may be particularly interested in other information items that are relevant to the first information item. Normally, this cannot be accommodated by the search tool without revising the search criteria, which will often be undesirable.
A still further limitation of known search tools is that they do not adequately handle time-sensitive information. For example, a news article referring to a patient being admitted to a hospital with unusual symptoms may provide public health officials with an “early warning” of an outbreak of an infectious disease. In such a case, timely identification and dissemination of that article to interested public health experts is critical. Furthermore, timely delivery of closely related articles (e.g. those referring to the same location or to similar symptoms) can also be critical to identifying and/or tracking the outbreak. While known search tools can identify information items that were published (or otherwise made accessible through the Internet) within a selected time range, they do not adequately address the rapid dissemination of relevant information items to interested users.
Thus, a method and system capable of rapidly aggregating time-sensitive information from multiple heterogeneous sources, assessing the relevance of the aggregated information, and then distributing the information to interested users, all with minimal time delay, remains highly desirable.