Web pages located on the World Wide Web and accessed via the Internet include a variety of content including text, images, and other forms of multimedia. These web pages are often divided into multiple portions or regions by horizontal lines, vertical lines, and frames. These lines are referred to as “separator lines.”
When viewed in terms of web page design, content located within the different regions of the web page defined by the separator lines have different semantic meanings (i.e., the relationships of characters or groups of characters to their meanings, independent of the manner of their interpretation and use) or document functions (e.g., a portion of an article or a sidebar). Being able to detect separator lines within the web pages is very useful in subsequent processing of a web page including, for example, web page printing, block level based web page searching, web page segmentation, and many other applications.
Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements.