A multi-source feed reader aggregates content from multiple syndicated feeds. Typically, multi-source feed readers are web based, such that syndicated information (news, blogs, and/or similar items) from multiple feeds is aggregated at a single web site for convenient access. With the popularity of multi-source feed readers such as Google Reader, NetVibes, BlogLines and even Microsoft Outlook, it is apparent that blogs have become an important source of information for many people, especially those who are more technically inclined.
A problem with getting one's news and other information from a multi-source feed reader is that many users subscribe to more than one feed on a single topic (e.g., both Gizmodo and Engadget on consumer electronics devices, or both Mashable and TechCrunch on Web 2.0 issues). This would not be a problem but for the fact that most blogs and many other web information providers derive their content from the same sources: other blogs, online news sites and mainstream media (newspapers, television, etc.). Typically, a single party breaks a story, and then every other online information provider repeats the story in slightly modified language. This results in multi-source feed readers providing the same information to users multiple times, in slightly different forms.
Note that the problem is not that a multi-source feed reader provides an identical article or blog entry multiple times. Conventional duplicate elimination functionality can eliminate exact duplicates of individual items (i.e., multiple copies of the same article by the same author). The problem is that the multi-source feed reader provides multiple articles, blog entries and the like to a user, each of which is from a different source and is not a word-for-word copy, but each of which is based (directly or indirectly) on a common underlying original source, and thus contains essentially redundant information in somewhat different language. As such, the multiple items are cumulative and redundant to each other. Although such an article may at first appear to a user to be new information, upon reading it the user realizes that it is essentially the same as the other articles to which it is cumulative, although its format and exact wording vary.
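The distinction drawn above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that conventional duplicate elimination works by hashing the normalized text of each item; the function name `exact_dedupe` and the sample items are hypothetical and are not taken from any particular feed reader. The sketch removes the verbatim copy but retains the reworded item, even though the reworded item conveys the same underlying information.

```python
import hashlib


def exact_dedupe(items):
    """Keep the first occurrence of each item whose normalized text is identical.

    Hypothetical stand-in for conventional exact-duplicate elimination.
    """
    seen = set()
    kept = []
    for item in items:
        # Normalize case and whitespace so trivially different copies still match.
        normalized = " ".join(item.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(item)
    return kept


# Hypothetical feed items: an original, a verbatim copy, and a reworded version.
items = [
    "Acme announces a new phone with a larger screen.",
    "Acme announces a new phone with a larger screen.",
    "Acme has unveiled a new handset featuring a bigger display.",
]

kept = exact_dedupe(items)
for item in kept:
    print(item)
```

Only the verbatim copy is dropped; the reworded item has a different hash and so is still presented to the user as if it were new.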
It would be desirable to address these issues.