The increasing diversity of devices, protocols, networks and user preferences in today's web has made adaptive capability critical for Internet applications. The term “adaptive capability” means the ability to take web content presented in one form (such as a website displayed on a desktop computer) and process it for presentation or display in another form (such as that suitable for a handheld device).
To achieve adequate adaptation, it can become crucial to understand a website's structure and the function of its content, as intended by the author of that website. Most previous work in this area achieves adaptation only under special conditions because structural information is lacking. Some approaches have attempted to extract semantic structural information from HTML tags, either manually or automatically. These approaches, however, lack an overview of the whole website. In addition, they are suitable only for HTML content. Furthermore, most of them do not work effectively even for HTML pages, because HTML was designed for both presentational and structural representation of content. The widespread misuse of structural HTML tags for layout purposes makes the situation even worse. Cascading Style Sheets (CSS), as specified by the W3C, attempt to compensate for this situation by representing presentation information separately, but their application is quite limited. Moreover, Cascading Style Sheets do not solve the difficulty of extracting semantic structure from HTML tags. Accordingly, the results of previous semantic rule-based approaches for HTML content are not very stable for general web pages.
Smith et al., in Scalable Multimedia Delivery for Pervasive Computing, Proc., ACM Multimedia 99, 1999, pp. 131–140, proposed a so-called InfoPyramid model to represent the structural information of multimedia content. However, existing web content is not expressed in the InfoPyramid model. Extensible Markup Language (XML) provides a semantic structural description of content by means of a Document Type Definition (DTD). However, a DTD is not a general solution because each application area would necessarily require its own special DTD. Additionally, XML does not take the function of content into consideration. Finally, although Extensible Stylesheet Language (XSL), as specified by the W3C, provides a flexible way of presenting the same content on different devices, it must be generated manually, which is labor-intensive work for authors.
Mobile devices that access the World Wide Web vary greatly in their capabilities, which makes it quite difficult for web authors to provide web content that can be presented well on all of these devices. Most web authors today still use HTML as their authoring language. Therefore, it would be very useful if an automatic adaptation approach could be realized.
Accordingly, this invention arose out of concerns associated with providing improved methods and systems for website adaptation for mobile devices.