1. Statement of the Technical Field
The present invention relates to the field of Web content transcoding and more particularly to generating XPATH expressions.
2. Description of the Related Art
End-users increasingly access Web content with devices other than conventional desktop content browsers. Such devices include personal digital assistants, cellular telephones and cable television set top boxes. Yet, as these devices lack the rendering capabilities of the conventional desktop content browser, it is necessary to adapt the Web content from a format intended for use in one type of device to another format suitable for rendering in another device. This content adaptation process has been referred to as “transcoding”.
The transcoding process can be facilitated through the use of information about the Web content, referred to hereinafter as “meta-information”. Meta-information can be provided with the original Web content and can be used to assist the transcoding process in uniquely identifying portions of the Web content. Notably, meta-information can be created without any modification of the original Web content if the meta-information is described separately from the Web content. In this regard, the separate provision of such meta-information often is referred to as “external annotation”.
External annotations consist of the meta-information and corresponding references to portions of the original Web content. The meta-information and references typically are described according to the Resource Description Framework (RDF) and the XML Path Language (XPath) and XML Pointer Language (XPointer) specifications. XPath is a syntax for identifying particular sections of a markup document, such as an HTML or XML formatted document. Each of the RDF and XPath/XPointer specifications has been standardized by the World Wide Web Consortium, referred to hereafter as the “W3C”.
XPath, described in depth in James Clark and Steve DeRose, XML Path Language (XPath) Version 1.0, W3C Recommendation (Nov. 16, 1999), arose from an effort to provide a common syntax and semantics for functionality shared between XSL Transformations (XSLT) and XPointer. The primary purpose of XPath is to address parts of an XML document; in support of that purpose, XPath also provides basic facilities for manipulating strings, numbers and boolean values. XPath uses a compact, non-XML syntax to facilitate its use within Uniform Resource Identifiers (URIs) and XML attribute values. XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax. XPath thus takes its name from its URL-like path notation for navigating through the hierarchical structure of an XML document.
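By way of illustration only, the path notation can be sketched using the Python standard library module xml.etree.ElementTree, which implements a limited subset of XPath 1.0 rather than the full Recommendation. The catalog document, its element names and its attribute values are hypothetical and serve only to show how path steps and a predicate address parts of a document.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document used only for illustration.
DOC = """<catalog>
  <book lang="en"><title>First</title><price>10</price></book>
  <book lang="fr"><title>Second</title><price>20</price></book>
</catalog>"""

root = ET.fromstring(DOC)

# Steps separated by "/" walk the element hierarchy, much like a URL path.
titles = [t.text for t in root.findall("./book/title")]

# A predicate in square brackets filters a step, here by attribute value.
french = [t.text for t in root.findall("./book[@lang='fr']/title")]

print(titles)
print(french)
```

The first expression selects every title element beneath a book element, while the second selects only the title of the book whose lang attribute equals "fr".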
Notably, XPath expressions can be difficult to create. The XPath syntax requires an understanding of complex concepts, including multiple axes and predicates, and, as will be recognized by one skilled in the art, the syntax is unusual and non-intuitive. Importantly, though creating even simple XPath expressions can be problematic, creating robust XPath expressions which remain valid notwithstanding changes to portions of the referenced markup can be more problematic still. In particular, conventional XPath creation techniques are not configured to handle changes to content relied upon as a reference point in the associated markup.
For example, the structure and content of hypertext markup language (HTML) documents are known to change over time, as the information contained in an HTML document can be updated hourly or daily. As the content and structure of the document change, however, associated annotations which uniquely identify the changed portions of the HTML document can become invalid. This can be particularly true where specific annotations uniquely identify portions of the changing HTML document by reference to a specific document structure. Hence, conventional annotation methods are ineffective in the face of a dynamically changing document.
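The fragility of structure-dependent references can be sketched as follows, again using the limited XPath subset of the Python standard library module xml.etree.ElementTree. The two page versions, the id values and the two expressions are hypothetical; the sketch merely contrasts a reference that depends on element position with one anchored to an attribute, and is not presented as the technique of any particular annotation system.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same page. In the second version a
# banner has been inserted ahead of the news block and the headline updated.
V1 = """<html><body>
<div id="news"><p>Headline A</p></div>
<div id="footer"><p>Contact</p></div>
</body></html>"""

V2 = """<html><body>
<div id="banner"><p>Sale today</p></div>
<div id="news"><p>Headline B</p></div>
<div id="footer"><p>Contact</p></div>
</body></html>"""

# Reference by document structure: the first div child of body.
POSITIONAL = "./body/div[1]/p"
# Reference anchored to an attribute value rather than to position.
ANCHORED = ".//div[@id='news']/p"


def select_text(markup, path):
    """Return the text of the first node matching path, or None."""
    node = ET.fromstring(markup).find(path)
    return None if node is None else node.text


positional_v1 = select_text(V1, POSITIONAL)
positional_v2 = select_text(V2, POSITIONAL)
anchored_v1 = select_text(V1, ANCHORED)
anchored_v2 = select_text(V2, ANCHORED)

print(positional_v1, "|", positional_v2)
print(anchored_v1, "|", anchored_v2)
```

In this sketch the positional expression selects the headline in the first version but the banner text in the second, whereas the attribute-anchored expression continues to select the headline. An attribute-anchored reference remains valid only so long as the attribute itself is not changed.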