Websites across the internet provide content in various formats and may be updated at varying frequencies. To see the new updates of a chosen website, a user may be required to revisit the website from time to time and search for changes, either by browsing the website's pages or by scanning a main page that lists titles to see whether any title has changed. Consider, for example, a news website such as the “Times” website, in which the main page comprises most of the titles of the subjects and most of the titles of the articles. A user who wishes to see updates only in the sports section may be required to enter the “Times” website occasionally to see whether there are any new articles in the sports section.
Today there are protocols that allow automatic reading of the updated parts of a website, such as the Rich Site Summary (RSS) protocol, which allows automatic retrieval of updated content called “RSS feeds”. However, the RSS protocol can only be used with websites that include RSS channels, and only provided that the user has an RSS application installed on his or her computer.
RSS readers, or “aggregators”, are software applications that usually use RSS or Extensible Markup Language (XML) formatted data to execute the update check, comparing the XML code of the saved data with that of the updated data at predetermined time intervals.
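By way of illustration only, the comparison performed by such an aggregator can be sketched as follows. This is a minimal sketch and not the implementation of any particular RSS reader: the two feed snapshots, the function names `item_titles_by_guid` and `new_items`, and the choice of the `guid` element as the comparison key are all assumptions made for the example. Scheduling of the check at predetermined intervals is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical snapshots of the same feed: the saved copy and the updated copy.
OLD_FEED = """<rss version="2.0"><channel><title>Sports</title>
<item><title>Match report</title><guid>a1</guid></item>
</channel></rss>"""

NEW_FEED = """<rss version="2.0"><channel><title>Sports</title>
<item><title>Transfer news</title><guid>a2</guid></item>
<item><title>Match report</title><guid>a1</guid></item>
</channel></rss>"""


def item_titles_by_guid(feed_xml):
    """Map each item's guid to its title for one feed snapshot."""
    root = ET.fromstring(feed_xml)
    return {item.findtext("guid"): item.findtext("title")
            for item in root.iter("item")}


def new_items(saved_xml, updated_xml):
    """Return titles of items present in the updated feed but not in the saved one."""
    saved = item_titles_by_guid(saved_xml)
    updated = item_titles_by_guid(updated_xml)
    return [title for guid, title in updated.items() if guid not in saved]


if __name__ == "__main__":
    print(new_items(OLD_FEED, NEW_FEED))
```

Such a check relies on the website publishing a well-formed feed, which is the limitation noted above for websites that do not include RSS channels.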
U.S. Pat. No. 6,976,210 by FREIRE SILVA JULIANA, ANUPAM VINOD, BREITBART YURI J and KUMAR BHARAT (“F.A.B.K”) discloses a system and a method for creating a personal web view by creating a plurality of web clippings. The personal web view contains a plurality of elements that are taken from source pages and clipped to the web view. The elements can be selected by the user using an interface. F.A.B.K allow the personal web view to be updated at a predetermined update frequency.
Although F.A.B.K's solution allows analyzing the actual HTML code of the source web pages, the analysis is carried out by translating the pages' Hyper Text Markup Language (HTML) code to Extensible Markup Language (XML) code. Since many HTML pages are not well formatted, due, for example, to missing tags, F.A.B.K disclose a method by which the HTML code of the pages is translated to XML code [F.A.B.K, paragraph 0030]. As a result, only the “valid” elements of the HTML code are translated, identified and displayed.
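The difficulty of treating not-well-formatted HTML as XML can be illustrated with a short sketch. The sketch does not reproduce F.A.B.K's translation method; it only assumes a hypothetical page fragment with missing closing tags and contrasts a strict XML parser with a lenient HTML tokenizer, both taken from the Python standard library. The names `MALFORMED_HTML`, `parses_as_xml`, `TagCollector` and `start_tags` are introduced for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical fragment: the <li> and <p> elements are never closed.
MALFORMED_HTML = ("<html><body><ul><li>First<li>Second</ul>"
                  "<p>Unclosed paragraph</body></html>")


def parses_as_xml(markup):
    """Return True only if the markup is well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Record every start tag the lenient HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def start_tags(markup):
    """Return the start tags found in the markup, in document order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print(parses_as_xml(MALFORMED_HTML))  # strict XML parsing rejects the fragment
    print(start_tags(MALFORMED_HTML))     # the HTML tokenizer still reports every tag
```

Under these assumptions, a strict XML view of the page either fails or must first discard or repair the offending elements, whereas the elements remain visible when the HTML is read as HTML.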
To allow the user to mark the elements from the source page, the system of F.A.B.K displays the XML code or the Document Object Model (DOM) tree of the source page, and while the user selects the elements or “objects” from the display of the DOM tree, the selected elements are highlighted in the display of the source page [F.A.B.K, paragraph 0030]. Since the DOM tree is a display of the XML code, the not-well-formatted HTML elements that were excluded from the XML code are also excluded from the display of the DOM tree.
F.A.B.K's invention does not allow handling of not-well-formatted HTML code elements and therefore may leave out some of the content of the page. Furthermore, F.A.B.K's analysis of the source pages does not allow the user to select a single element or item from a preset list of items (defined as such in the HTML code) in order to set up the update check for the entire list. The analysis engine of that invention does not identify the other elements of the same data list based on a selection of a part of said list. According to F.A.B.K, the user has to specifically select, through the DOM tree, each element of the page for which the user wishes to view updates.
Additionally, many of the source web pages include scripts (such as JavaScript, VBScript and the like) that render content to the page. The data that is rendered to the HTML code of the source web page by such scripts does not appear as normal or “well-formatted” HTML tags. Since F.A.B.K convert the HTML to XML, their system will skip any scripts that the page contains, along with other parts of the page's HTML code that are not well formatted.
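The effect of script-rendered content on a static analysis of the page source can be sketched as follows. This is an illustrative sketch only, assuming a hypothetical page in which one list item is written by a script at load time; the names `PAGE`, `StaticView` and `static_view` are introduced for the example and do not come from F.A.B.K. The sketch reads the page source without executing any script, which is the situation described above.

```python
from html.parser import HTMLParser

# Hypothetical page: one list item is static, the other is written by a script.
PAGE = ('<ul id="news"><li>Static item</li>'
        '<script>document.write("<li>Late score</li>");</script></ul>')


class StaticView(HTMLParser):
    """Collect start tags and raw script text without executing any script."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.script_text = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Everything between <script> and </script> arrives as plain text.
        if self._in_script:
            self.script_text.append(data)


def static_view(markup):
    """Return (start tags seen, concatenated script text) for the markup."""
    view = StaticView()
    view.feed(markup)
    view.close()
    return view.tags, "".join(view.script_text)


if __name__ == "__main__":
    tags, script = static_view(PAGE)
    print(tags)    # the script-written <li> is not reported as a tag
    print(script)  # it exists only as text inside the script
```

In this sketch the second list item never appears as an element of the page source; it exists only as text inside the script, so an analysis that skips scripts does not see it.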