Since its beginnings in the early 1990s, the World Wide Web has become very popular and now comprises several billion Web pages containing varied content such as text, images, videos, and links (also referred to as hyperlinks) to other Web pages. The World Wide Web is used daily by billions of Web surfers.
Getting online nowadays is quite simple and requires neither particular skills nor particular proceedings before a national or international office, which in fact does not exist. Surprisingly, no attempt has been made to classify the Web in order to group Web sites within families (based upon predetermined criteria), although anyone would benefit from such a classification. It is therefore becoming more and more difficult for Web surfers to retrieve substantive, reliable, and up-to-date information. Web browsers are of help, of course, but with the increasing number of Web pages, numerous semantic search requests return raw content which is mostly unclassified, often redundant, inexplicit, and, in the end, simply unworkable.
In the early 2000s, however, a solution called syndication was provided to help surfers get the right information at the right moment. In syndication, a section of a Web site is made available for other Web sites to use. More specifically, in Web syndication, content (commonly referred to as a Web feed) is put on a Web site in a particular format, often XML-based (XML stands for eXtensible Markup Language), such as RSS (Really Simple Syndication) or Atom. The content is associated with a feed link to which another user (client) can subscribe in order to retrieve the corresponding content by means of a particular application called a feed aggregator, also referred to as a feed reader or a news reader, running locally on the client's terminal or server.
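By way of illustration only, the following minimal sketch shows how a feed aggregator might read the items of an RSS 2.0 feed. The feed content, the example.com addresses, and the function name `parse_rss_items` are hypothetical and are introduced here solely for the example; an actual aggregator would first download the document from the feed link.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document, as it might be served at a feed link.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed used for illustration</description>
    <item>
      <title>First article</title>
      <link>http://example.com/articles/1</link>
      <guid>http://example.com/articles/1</guid>
    </item>
    <item>
      <title>Second article</title>
      <link>http://example.com/articles/2</link>
      <guid>http://example.com/articles/2</guid>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return one dict per <item> element found in the feed's <channel>."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    if channel is None:
        return []
    items = []
    for item in channel.findall("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "guid": item.findtext("guid", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_RSS):
        print(entry["title"], entry["link"])
```

An Atom feed would be handled similarly, except that its elements are namespaced and its entries are carried in `entry` elements rather than `item` elements.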
Once subscribed to a feed, a feed aggregator may be configured to check for and retrieve updated content at predetermined intervals (which may be user-defined). Modern Web browsers often include built-in aggregators, and Web portals such as iGoogle™ and My Yahoo™ offer similar functionality. U.S. patent applications No. US 2008/0034058 (assigned to Marchex, Inc.) and No. US 2008/0046543 (assigned to RealNetworks) both illustrate methods for obtaining Web feeds.
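The periodic checking described above may be sketched as follows. This is a simplified illustration under stated assumptions, not the method of any particular aggregator or of the cited applications: the `Subscription` class, the `poll` function, and the `fetch` callable (which stands in for downloading and parsing the feed) are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Subscription:
    """State kept by a hypothetical aggregator for one subscribed feed."""
    feed_url: str
    interval_seconds: float          # predetermined, possibly user-defined
    next_check: float = 0.0          # earliest time of the next check
    seen_ids: Set[str] = field(default_factory=set)


def poll(sub: Subscription,
         fetch: Callable[[str], List[Dict[str, str]]],
         now: float) -> List[Dict[str, str]]:
    """Return items not retrieved before, if the feed is due for a check."""
    if now < sub.next_check:
        return []                    # interval has not elapsed yet
    sub.next_check = now + sub.interval_seconds
    new_items = []
    for item in fetch(sub.feed_url):
        if item["id"] not in sub.seen_ids:
            sub.seen_ids.add(item["id"])
            new_items.append(item)
    return new_items
```

In use, the aggregator would call `poll` for each subscription from a timer loop, passing the current time (for example, the value of `time.monotonic()`), and would display only the items returned.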
Although feed aggregators are a powerful resource for retrieving updated information from the World Wide Web and making it available to an end user via a user-friendly graphical user interface (GUI), the volume of articles can sometimes be overwhelming, especially when the user has subscribed to many Web feeds. To address this problem, some feed aggregators include functionality that allows users to tag the feeds with keywords in order to sort and filter the available articles into easily navigable categories. However, this solution is time-consuming, since the user must pre-classify the feeds from which updated content is to be obtained. In addition, tagging Web feeds is simply useless when the content to be retrieved changes subject with each update (as on newspaper Web sites).
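The feed-level tagging discussed above, together with its limitation, may be illustrated by the following sketch. The feed addresses, tags, article titles, and the function name `group_by_tag` are hypothetical and serve only as an example.

```python
from collections import defaultdict

# Tags assigned by the user to each subscribed feed (hypothetical data).
FEED_TAGS = {
    "http://example.com/tech.rss": {"technology"},
    "http://example.com/daily.rss": {"news"},
}

# Articles retrieved by the aggregator, each labeled with its source feed.
ARTICLES = [
    {"feed": "http://example.com/tech.rss", "title": "New chip announced"},
    {"feed": "http://example.com/daily.rss", "title": "Election results"},
    {"feed": "http://example.com/daily.rss", "title": "Football final"},
]


def group_by_tag(articles, feed_tags):
    """Group article titles under the tags of the feed they came from."""
    groups = defaultdict(list)
    for article in articles:
        # Feeds the user has not tagged fall into an "untagged" category.
        tags = feed_tags.get(article["feed"], {"untagged"})
        for tag in sorted(tags):
            groups[tag].append(article["title"])
    return dict(groups)


if __name__ == "__main__":
    for tag, titles in group_by_tag(ARTICLES, FEED_TAGS).items():
        print(tag, titles)
```

Because the tag is attached to the feed rather than to each article, the two unrelated articles from the newspaper-style feed both fall under the single category "news", which reflects the limitation noted above.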