In 2011, over two billion Web users were reading the contents of more than 14 billion Web pages in numerous languages, formats and layouts. Owing to a variety of technological, economic, aesthetic and other factors, modern Web page designs are becoming increasingly sophisticated; examples of diverse content elements simultaneously present on Web pages include articles, navigation panels, text and link boxes, images and icons, advertising buttons, banners, dynamic and interactive advertising content, embedded video clips, overlays, pop-ups, floating elements, forms, etc. While each of these elements has its own place within an integrated Web experience, rich Web layouts may interfere with a Web page visitor's immediate goal of uninterrupted and concentrated reading of the item(s) presented on the page, such as a news or research article, blog post, list of search results, article headings, and other principal content types. Some Web sites, such as Yahoo News slideshows, deliver separate clean views of articles after a user clicks on a news item. In addition, browser features, such as Safari Reader on iPad, attempt to extract significant pieces of content from arbitrary Web pages. However, existing methods of processing complex Web layouts do not work particularly well for many types of Web page content, such as search lists and headings, multipage articles, etc. The existing techniques do not consistently extract important components of key page items, such as article title, author and publication date, and have difficulty processing international pages. Accordingly, it would be desirable to provide enhanced systems that simplify Web page layouts for viewers by determining the principal page content, presenting it to the user in the most readable format, and allowing easy and convenient clipping to personal note sets such as the Evernote product provided by Evernote Corporation of Redwood City, Calif.