The increasing diversity of devices, protocols, networks and user preferences on today's web has made adaptive capability critical for Internet applications. The term “adaptive capability” means the ability to take web content presented in one form (such as a website displayed on a desktop computer) and process it for presentation or display in another form (such as on a handheld device).
To achieve adequate adaptation, it can become crucial to understand a website's structure and the function of its content, as intended by the author of that website. Most previous work in this area achieves adaptation only under special conditions because structural information is lacking. Some works have attempted to extract semantic structural information from HTML tags, either manually or automatically. These approaches, however, lack an overview of the whole website. In addition, these approaches are suitable only for HTML content. Furthermore, most of the approaches do not work effectively even for HTML pages, because HTML was designed for both presentational and structural representation of content. The misuse of structural HTML tags for layout purposes makes the situation even worse. Cascading Style Sheets (as specified by the W3C) attempts to compensate for this situation by representing the presentation information separately, but its application is quite limited. Moreover, Cascading Style Sheets does not resolve the difficulty of extracting semantic structure from HTML tags. Accordingly, the results of previous semantic rule-based approaches for HTML content are not very stable for general web pages.
Smith et al., in Scalable Multimedia Delivery for Pervasive Computing, Proc., ACM Multimedia 99, 1999, pp. 131-140, proposed a so-called InfoPyramid model to represent the structural information of multimedia content. However, the InfoPyramid model is not present in current web content. XML provides a semantic structural description of content by means of a Document Type Definition (DTD). However, a DTD is not a general solution, because each application area would necessarily require its own special DTD. Additionally, XML does not take into consideration the function of content. Furthermore, although Extensible Stylesheet Language (as specified by the W3C) provides a flexible way of presenting the same content on different devices, the stylesheets need to be generated manually, which would be labor-intensive work for authors.
Accordingly, this invention arose out of concerns associated with providing improved methods and systems for website adaptation.