1. Field of the Invention
The present invention relates to a technique for efficiently evaluating XPath (XML Path Language) expressions that specify a certain part of an XML or HTML document.
2. Related Art
W3C (World Wide Web Consortium) released ‘XPath’ as a recommendation for a path language to specify a certain part of an XML document. XPaths are used as a component of XPointer, XSLT, XQuery, etc., and are also used by application programs to access a DOM (Document Object Model) tree for an XML document.
Evaluating a plurality of XPaths with respect to a single XML document is commonplace in actual information processing with XML documents. In an XSLT style sheet, for example, an XPath expression is specified as a pattern for each template rule. Therefore, a complex XSLT style sheet includes a number of XPath expressions, all of which need to be evaluated with respect to the XML document to be processed.
In addition, it is widely known that a web page can be reused in various ways, and a new application can be developed, by adding an annotation to the web page for such purposes as exchanging data on the Internet. An XPath is also used for associating an annotation with an element of a web page, because a certain part of an HTML document used for writing a web page can be specified by an XPath expression in the same way as a part of an XML document. An efficient way to add annotations to web pages is to reuse a particular annotation by applying it to a plurality of web pages. In this case, in order to determine whether a given annotation is applicable to a given web page, it is necessary to evaluate whether the plurality of XPath expressions in the annotation correctly specify the intended elements in the targeted web page.
Specifying a certain part of an XML document with an XPath can also be regarded as a condition for checking whether the XML document to be processed contains the part specified by the XPath. For example, WebLogic Collaborate (http://www.bea.com/index.html), a server system from BEA (U.S.), uses XPaths to write conditions for routing and filtering messages expressed in XML. For such a purpose, a plurality of XPath expressions must be evaluated each time an XML document arrives.
When a plurality of XPath expressions need to be evaluated for each XML document as in the above-mentioned cases, an efficient way of evaluating the XPath expressions is required. In one conventional technique of this kind, a subscription condition for each user is written as an XPath with respect to documents written in XML, each XML document is checked against the XPath expressions when it arrives, and a document that passes the check is delivered to the users whose subscription conditions it satisfies (for example, see non-patent document 1). This method improves the execution time per XPath by evaluating each step of a location path through a table lookup.
Non-patent document 1: Altinel, M., Franklin, M., “Efficient Filtering of XML Documents for Selective Dissemination of Information”, International Conference on Very Large Data Bases, 2000.
As mentioned above, when a plurality of XPath expressions are evaluated for a single data file (document) in processing XML or HTML documents, an efficient way to evaluate the XPath expressions is needed.
However, although methods for improving the execution time of an evaluation per XPath, such as the method disclosed in the above-mentioned document, have been known, the execution time required for the entire evaluation increases linearly with the number of XPath expressions, which limits how far the entire execution time can be shortened.
This is because conventional evaluation methods handle the respective XPath expressions independently of each other when evaluating a plurality of XPath expressions.
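The linear growth described above can be illustrated with a minimal sketch of independent evaluation. This is not the method of the cited document; the function name route_independently, the subscriptions mapping, and the use of Python's xml.etree.ElementTree (which supports only a subset of XPath) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def route_independently(root, subscriptions):
    """Check every subscription expression against the document on its own.

    `subscriptions` is a hypothetical mapping of user name -> path expression.
    Returns the users whose expression matched and the number of evaluations.
    """
    matched = []
    evaluations = 0
    for user, expr in subscriptions.items():
        evaluations += 1  # one full evaluation per expression, nothing shared
        if root.find(expr) is not None:
            matched.append(user)
    return matched, evaluations
```

Because nothing is shared between expressions, the evaluation count in this sketch always equals the number of subscriptions, however similar the expressions are.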
The plurality of XPath expressions assumed to be evaluated for a single data file are constrained by, for example, the limited variation in the structures and element values of the data file to be processed. As a result, the plurality of XPath expressions include similar expressions. Therefore, by extracting and evaluating a common part of similar XPath expressions and sharing the evaluation result for the common part among them, the XPaths can be evaluated more quickly than by evaluating each of the XPath expressions separately.
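As a rough illustration of sharing an evaluation result for a common part, the following sketch caches the node set obtained for each path prefix so that similar expressions reuse it. It is a simplification under stated assumptions: only absolute paths made of plain child steps (such as /html/body/table) are handled, and the function name evaluate_shared and the step counter are hypothetical.

```python
import xml.etree.ElementTree as ET


def evaluate_shared(root, paths):
    """Evaluate simple absolute child-step paths, reusing shared prefixes.

    Returns (results, steps), where `steps` counts how many location steps
    were actually evaluated against the document.
    """
    cache = {}            # prefix (tuple of tag names) -> matching elements
    stats = {"steps": 0}

    def eval_prefix(steps):
        if steps in cache:                 # common part already evaluated
            return cache[steps]
        stats["steps"] += 1
        if len(steps) == 1:
            nodes = [root] if root.tag == steps[0] else []
        else:
            parents = eval_prefix(steps[:-1])
            nodes = [c for p in parents for c in p if c.tag == steps[-1]]
        cache[steps] = nodes
        return nodes

    results = {p: eval_prefix(tuple(p.strip("/").split("/"))) for p in paths}
    return results, stats["steps"]
```

For the three paths /html/body/table/tr/td, /html/body/table/tr and /html/body/p, this sketch evaluates six steps, whereas evaluating each path separately would take twelve (5 + 4 + 3).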
Where a plurality of XPath expressions to be evaluated depend on each other, the processing required to evaluate them can be simplified by taking advantage of the dependency. Dependency among XPath expressions in this context refers to a relation in which the evaluation of one or some of a plurality of XPaths determines the evaluation results for the remaining XPaths. For example, when XPath expressions address two tables (table[1], table[2]) in a web page, table[2] does not exist if table[1] does not exist.
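The table[1] and table[2] relation can be sketched as follows. This is a minimal illustration and not a definitive implementation: the function name evaluate_with_dependencies and the requires mapping (expression -> expression it depends on) are hypothetical, and the expressions are assumed to be listed so that each prerequisite precedes its dependents.

```python
import xml.etree.ElementTree as ET


def evaluate_with_dependencies(root, exprs, requires):
    """Evaluate expressions in order, skipping any whose prerequisite is empty.

    `requires` is a hypothetical mapping: expression -> prerequisite expression.
    Returns (results, evaluated), where `evaluated` counts real evaluations.
    """
    results = {}
    evaluated = 0
    for expr in exprs:
        prereq = requires.get(expr)
        if prereq is not None and prereq in results and not results[prereq]:
            results[expr] = []   # result inferred from the dependency
            continue
        results[expr] = root.findall(expr)
        evaluated += 1
    return results, evaluated
```

When the page contains no table at all, only ./body/table[1] is evaluated in this sketch; the empty result for ./body/table[2] follows from the dependency without touching the document again.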