With the growing popularity of the Internet, people increasingly use it to obtain information and search for documents. Although a wide variety of portal websites are in operation, switching back and forth between them is time consuming. Search engines can help users find what they want, but performing proper searches is burdensome. There is now a new information format on the Internet called RSS (Rich Site Summary or Really Simple Syndication). RSS is an XML-based standard for sharing news headlines and other Web contents, and is also a widely used content packaging and delivery protocol on the Internet. Using RSS aggregation software tools, Internet users can read, at the client side, contents from websites that support RSS output.
FIG. 1A shows an implementation of existing RSS technology. As shown in FIG. 1A, based on user data and content data, an RSS server generates RSS files (commonly known as RSS feeds, i.e., feed files or summary files) and sends the RSS files to the client side. The RSS reader at the client side displays the contents of the received RSS files to the user. Using RSS technology, people can subscribe to news, to blogs (short for Web logs), and so on. People only need to subscribe to the desired contents in an RSS reader, and the contents will automatically appear in the reader. Further, people do not need to constantly refresh web pages when there is a pressing need to follow the news, because the RSS reader issues a reminder whenever there is an update.
FIG. 1B is a structural diagram of an RSS reader system. As shown in FIG. 1B, any page that supports RSS feeds can be stored in the RSS database through the RSS subscription feature of the page, and its RSS files can be obtained by data acquisitions performed at regular or irregular intervals.
Here, using the RSS 2.0 format as an example, the syntax of an RSS file is briefly described below:
<channel>
  <title>Read/WriteWeb</title>
  <link>http://www.readwriteweb.com</link>
  <description>Web Technology news, reviews and analysis</description>
  <lastBuildDate>Mon, 02 Apr 2007 15:23:00 -0800</lastBuildDate>
  <item>
    <title>Morfik Patents AJAX Compiler</title>
    <description>Morfik Patents AJAX Compiler ...</description>
    <link>http://www.readwriteweb.com/...</link>
    <category>News</category>
    <pubDate>Mon, 02 Apr 2007 15:23:00 -0800</pubDate>
    <author>Richard MacManus</author>
  </item>
  ......
</channel>
The contents of each RSS file are enclosed in a channel tag. Each item has a title tag, a link tag, and a description (or attribute) tag. Updates to the RSS file are determined by two time stamps. One is the lastBuildDate tag in the channel, which reflects the time of the last change in the RSS contents; the other is the pubDate tag, shown inside each item in the example above, which reflects the time of publication of the contents. The RSS reader uses both time stamps to determine whether new contents exist.
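As an illustration only, the time-stamp comparison described above might be sketched in Python as follows. The feed text, the function name find_new_items, and the stored last_seen value are hypothetical and are not part of any existing RSS reader; the sketch assumes that dates follow the RFC 822 style shown in the example and that each item carries its own pubDate tag.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed used only for illustration.
SAMPLE_FEED = """<rss version="2.0">
<channel>
  <title>Example Feed</title>
  <link>http://example.com</link>
  <description>Hypothetical feed for illustration</description>
  <lastBuildDate>Mon, 02 Apr 2007 15:23:00 -0800</lastBuildDate>
  <item>
    <title>Newer entry</title>
    <link>http://example.com/newer</link>
    <pubDate>Mon, 02 Apr 2007 15:23:00 -0800</pubDate>
  </item>
  <item>
    <title>Older entry</title>
    <link>http://example.com/older</link>
    <pubDate>Sun, 01 Apr 2007 09:00:00 -0800</pubDate>
  </item>
</channel>
</rss>"""


def find_new_items(feed_xml, last_seen):
    """Return titles of items published after last_seen (a timezone-aware datetime)."""
    channel = ET.fromstring(feed_xml).find("channel")

    # Channel-level check: if the feed has not been rebuilt since the
    # last visit, there is no need to inspect individual items.
    build_text = channel.findtext("lastBuildDate")
    if build_text and parsedate_to_datetime(build_text) <= last_seen:
        return []

    # Item-level check: keep only items whose pubDate is later than last_seen.
    new_titles = []
    for item in channel.findall("item"):
        pub_text = item.findtext("pubDate")
        if pub_text and parsedate_to_datetime(pub_text) > last_seen:
            new_titles.append(item.findtext("title"))
    return new_titles


if __name__ == "__main__":
    last_seen = parsedate_to_datetime("Sun, 01 Apr 2007 12:00:00 -0800")
    print(find_new_items(SAMPLE_FEED, last_seen))
```

In this sketch the reader would record the time of its last successful check and pass it as last_seen on the next acquisition; an actual reader may also track item identifiers or links, which are not considered here.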
However, to the best knowledge of the applicants, the existing technologies are only applicable to web pages that support RSS output (such as blogs, news, etc.). For such pages, they provide Internet users with collection (or subscription) functions and, when the collected or subscribed web pages are updated, remind the users so that the users can browse the updated contents in a timely manner. For web pages that do not support RSS output (such as Putting-it), the existing technologies may be inapplicable.
The disclosed methods and systems are directed to solving one or more of the problems set forth above and other problems.