Internet-based news aggregators are well known in the art. An example of a contemporary system is that provided by Google at its News site, shown generally in FIG. 7. Google News automatically gathers stories from an assortment of news sources worldwide, and automatically arranges them into a variety of categories/topics as shown in FIG. 7. Typically speaking, systems such as this are designed to present what they deem the most relevant stories within the interface shown in FIG. 7, using automated algorithms which measure human interest/relevance of individual news stories. This is done primarily by evaluating a number of factors, including the quality of the source of the news, page views, search queries and personal preferences, as explained in US Publication Nos. 20050165743 and 20050060312, both of which are incorporated by reference herein.
Google News also automatically updates the topics and news stories on a periodic basis. One limitation of such a system, however, is that the Google News algorithm makes no (apparent) attempt to sort the stories in actual chronological order within the main news page. For this reason, as seen in FIG. 7, the main story highlighted for the Georgia-VA Tech football game is entitled “Preview” and is dated some 18 hours ago. In fact, the story beneath such highlighted entry is more recent and gives the actual outcome of the contest: Georgia has already won the game. Accordingly the Google News aggregator, while compiling relevant content, tends to accumulate a lot of stale content which is not very timely but which nonetheless is prominently displayed because of the way the algorithm computes importance.
At the same time it should be noted that by selecting the entry in FIG. 7 one can see a more comprehensive listing of stories from the news aggregator, including a chronological sort of the same. However, even this aspect of the aggregator has limitations, because while the news stories are identified by their release time (e.g., 1 hour ago), this parameter is not in fact helpful for identifying the actual temporal quality of the content of certain stories. This is because many news agencies/sources release stories which merely duplicate content from earlier stories, with little or no newly added content relevant to the story. These repeated stories can bear a recent time stamp and thus be pushed (incorrectly) to the top by the Google temporalizer to suggest that they are very recent.
An example of the duplication of content can be seen in FIG. 8, in which the top four stories, as sorted by the Google chronologizer, actually contain identical content even though they were time stamped with different recency values. This figure also shows that these four stories actually duplicate content dealing with the governor of California which was first extracted some hours earlier from the Salon news source. FIG. 8 also depicts the problem noted above, namely, that the story shown with the dashed arrow (from Monsters & Critics) actually has newer content not found in the identified most recent articles. This can be confirmed by examining FIGS. 8A and 8B; the M&C article clearly evidences additional recent content relating to the governor's hospitalization.
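The duplication problem described above can be illustrated with a minimal sketch. It is offered only as an illustration of why release time alone is misleading, and is not a description of how Google News or any other aggregator operates. The story identifiers, times, and text below are hypothetical placeholders, and exact-match hashing is an assumed simplification that catches only verbatim repeats.

```python
import hashlib
import re
from datetime import datetime

def fingerprint(text):
    """Hash whitespace- and case-normalized text so verbatim repeats collide."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def first_seen_times(stories):
    """Map each story id to the earliest release time of any identical-content story."""
    earliest = {}
    for _story_id, released, text in stories:
        key = fingerprint(text)
        if key not in earliest or released < earliest[key]:
            earliest[key] = released
    return {story_id: earliest[fingerprint(text)] for story_id, _released, text in stories}

# Hypothetical data: (story id, release time, content).
stories = [
    ("original", datetime(2007, 1, 1, 8, 0), "Governor story text."),
    ("repeat", datetime(2007, 1, 1, 13, 0), "Governor  story  text."),
    ("update", datetime(2007, 1, 1, 11, 0), "Governor story text. New detail."),
]

first_seen = first_seen_times(stories)

# Sorting by release time puts the verbatim repeat on top.
by_release = [s[0] for s in sorted(stories, key=lambda s: s[1], reverse=True)]

# Sorting by when the content first appeared puts the story with new detail on top.
by_first_seen = [s[0] for s in sorted(stories, key=lambda s: first_seen[s[0]], reverse=True)]
```

In this sketch the repeat carries the latest time stamp and so heads the release-time ordering, while the story that actually adds new detail heads the ordering based on when each piece of content first appeared.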
The effect is particularly pronounced during the time in which certain events (or their reporting), such as sporting events, elections, natural catastrophes and accidents, are taking place. That is, the updating of scores is something that tends to lag significantly behind other stories. This makes it hard to review the news at a glance and immediately identify the current state of such events.
The situation is exacerbated by overseas news bureaus which pick up US news stories and then repeat them verbatim at a later time. For instance, a sporting event may start at 5 p.m. PST in the US and end at 8 p.m. The news then is disseminated overseas, and then reported on by several foreign sources during their respective days. So as a practical matter, at 11:00 a.m. PST the next day, the foreign news source stories describing the kick-off of the game (not the result) are just being published fresh in their respective domains. From the perspective of the Google-type algorithms, which only appear to examine explicit time references, the foreign stories describing the beginning of the game appear more recent than stories describing the result. The result is an aggregation of content that is mismatched in time.
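The scenario above can be worked through with a small sketch. The calendar date, the 8:15 p.m. release time of the US story, and the `event_time` field are assumptions made purely for illustration; how a system would obtain the time of the event a story describes is not addressed here.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-8 offset standing in for PST.
PST = timezone(timedelta(hours=-8), "PST")

# Hypothetical stories. "published" is the release time stamp;
# "event_time" is the moment of the game that the story describes.
stories = [
    {
        "source": "us_result",
        "published": datetime(2007, 1, 1, 20, 15, tzinfo=PST),
        "event_time": datetime(2007, 1, 1, 20, 0, tzinfo=PST),  # game ends
    },
    {
        "source": "foreign_kickoff",
        # 19:00 UTC is 11:00 a.m. PST on the following day.
        "published": datetime(2007, 1, 2, 19, 0, tzinfo=timezone.utc),
        "event_time": datetime(2007, 1, 1, 17, 0, tzinfo=PST),  # kick-off
    },
]

def newest_first(items, key):
    """Return source names ordered from most recent to least recent by the given field."""
    return [s["source"] for s in sorted(items, key=lambda s: s[key], reverse=True)]

by_published = newest_first(stories, "published")
by_event = newest_first(stories, "event_time")
```

Ordered by release time stamp, the foreign kick-off story ranks first even though the state of the game it describes is three hours older than the state described by the result story.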
Moreover, the same lack of temporal relevance also exists with search engines purporting to render relevant results to users. While such systems typically include some mechanism for selecting “recent” content, there is no mechanism available to ensure that such content is indeed fresh and not simply a repeat of older, stale material. A similar situation can be found in the blogosphere as well, where it is not easy to determine the actual temporal relevance of material.
An example of this problem is seen in FIGS. 9 and 9A of the prior art. Here a query for “Sharks Hockey Score” made late in the evening on December 28 reveals nothing useful concerning the game which has just been completed against the Sharks' opponent of the evening, Phoenix. No matter how the stories are sorted, by relevance or date, there is no information about the game which the subscriber can glean, even though the game had concluded and at least one news source had reported the final score.
To get such information one must leave the news aggregator and visit another site, a fact, of course, which is undesirable from the perspective of trying to maintain the user's attention on the news aggregator. The problem is exacerbated with smaller computing devices and cellphones as well, where display space is limited.
Accordingly there is clearly a long-felt need for a temporal-based document sorter which is capable of addressing these deficiencies in the prior art.