1. Technical Field
The present invention relates to XPath evaluation, and more specifically, to a method and system for XPath evaluation in an XML data repository.
2. Discussion of the Related Art
XPath is a query language for addressing nodes in XML documents. At present, the volume of data encoded in XML format is increasing rapidly. Therefore, how to process XML-based queries efficiently, i.e., how to execute XPath evaluation efficiently over huge amounts of XML data, has become a major challenge for persons skilled in the art, who have made many attempts in this respect.
There are generally two known methods to evaluate an XPath query.
In a first method, the XPath query is first transformed into an SQL query, and the SQL query is subsequently executed against a database. For example, a technical solution of querying based on Oracle XML DB is disclosed in Muralidhar Krishnaprasad, Zhen Hua Liu, Anand Manikutty, James W. Warner, Vikas Arora, Susan Kotsovolos, “Query Rewrite for XML in Oracle XML DB”, Proc. VLDB, 2004; a technical solution of querying based on SQL Server 2008 is disclosed in Shankar Pal, Istvan Cseri, Oliver Seeliger, Michael Rys, Gideon Schaller, Wei Yu, D. Tomic, A. Baras, Brandon Berg, Denis Churin, “XQuery Implementation in a Relational Database System”, VLDB, 2005; a technical solution of querying based on BEA XQuery Processor is disclosed in Daniela Florescu, Chris Hillery, Donald Kossmann, Paul Lucas, Fabio Riccardi, Till Westmann, Michael J. Carey, Arvind Sundararajan, “The BEA/XQRL Streaming XQuery Processor”, Proc. VLDB, 2003; a technical solution of querying based on Open-source XML DB is disclosed in “Oracle Berkeley DB XML”, 2009 (http://www.oracle.com/database/berkeley-db/xml/index.html); and technical solutions are also disclosed in Q. Li, B. Moon, “Indexing and Querying XML Data for Regular Path Expressions”, VLDB, 2001 and M. Yoshikawa, T. Amagasa, “XRel: A Path-based Approach to Storage and Retrieval of XML Documents using Relational Databases”, ACM Transactions on Internet Technology, 2001.
However, there are problems associated with the first method. For example, it is hard for this method to accommodate changes in an XML schema. If an XML schema is changed, the structures of the tables in the database must change, and the mapping relationship between the XPath query and the SQL query must change as well. Those changes are complicated, time-consuming, and error-prone. Additionally, in the first method, the cost of the join operations executed by the SQL query is significant.
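The join cost noted above can be illustrated with a minimal sketch. It assumes a hypothetical edge table named `node` (columns `id`, `parent`, `tag`, `text`) and a hypothetical helper `xpath_to_sql` that handles only simple absolute child-axis paths; neither is taken from any of the cited systems. Each location step in the path adds one self-join to the generated SQL.

```python
import sqlite3
import xml.etree.ElementTree as ET

XML = "<library><book><title>A</title></book><book><title>B</title></book></library>"

def shred(conn, xml_text):
    """Store every element as one row of a hypothetical edge table 'node'."""
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT, text TEXT)"
    )
    counter = [0]

    def walk(elem, parent_id):
        counter[0] += 1
        node_id = counter[0]
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?)",
            (node_id, parent_id, elem.tag, elem.text),
        )
        for child in elem:
            walk(child, node_id)

    walk(ET.fromstring(xml_text), None)

def xpath_to_sql(path):
    """Translate a simple path such as /a/b/c into SQL (illustrative only)."""
    steps = [s for s in path.split("/") if s]
    tables = ", ".join(f"node n{i}" for i in range(len(steps)))
    conds = ["n0.parent IS NULL"]  # the first step must match the root element
    for i in range(len(steps)):
        conds.append(f"n{i}.tag = ?")
        if i > 0:
            # One self-join per additional location step.
            conds.append(f"n{i}.parent = n{i - 1}.id")
    last = len(steps) - 1
    sql = (
        f"SELECT n{last}.text FROM {tables} "
        f"WHERE {' AND '.join(conds)} ORDER BY n{last}.id"
    )
    return sql, steps

conn = sqlite3.connect(":memory:")
shred(conn, XML)
sql, params = xpath_to_sql("/library/book/title")
titles = [row[0] for row in conn.execute(sql, params)]
print(sql)
print(titles)
```

In this sketch a three-step path already requires the `node` table to be joined with itself twice, and any change to how the XML is mapped onto tables would require `xpath_to_sql` to be rewritten accordingly.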
In a second method, an XPath query is evaluated directly against each XML instance. For example, a technical solution of querying based on IBM DB2 is disclosed in Matthias Nicola, Bert van der Linden, “Native XML Support in DB2 Universal Database”, VLDB, 2005 and Guogen Zhang, “Building a Scalable Native XML Database Engine on Infrastructure for a Relational Database”, XIME-P 2005; and technical solutions are also disclosed in Haifeng Jiang, Hongjun Lu, Wei Wang, Jeffrey Xu Yu, “Path Materialization Revisited: An Efficient Storage Model for XML Data”, AICE2000 and H. Jiang, W. Wang, H. Lu, J. Xu Yu, “Holistic Twig Joins on Indexed XML Documents”, VLDB, 2003.
However, there are also problems associated with the second method. For example, in the second method, a context must be calculated for each XML instance, which makes the evaluations expensive.
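The per-instance cost can be pictured with a minimal sketch that uses the Python standard library rather than any of the cited engines. The sample documents and the function name `evaluate_per_instance` are assumptions made for illustration, and the parsed tree stands in for the evaluation context that must be built for every instance before the path can be matched.

```python
import xml.etree.ElementTree as ET

# Hypothetical repository holding two small XML instances.
DOCUMENTS = [
    "<library><book><title>A</title></book></library>",
    "<library><book><title>B</title></book><book><title>C</title></book></library>",
]

def evaluate_per_instance(documents, path):
    """Evaluate one path against every instance, building a context each time."""
    results = []
    contexts_built = 0
    for xml_text in documents:
        root = ET.fromstring(xml_text)  # fresh context for this instance
        contexts_built += 1
        results.extend(elem.text for elem in root.findall(path))
    return results, contexts_built

titles, contexts_built = evaluate_per_instance(DOCUMENTS, "./book/title")
print(titles, contexts_built)
```

The number of contexts built grows linearly with the number of instances in the repository, regardless of how few of them actually contain matching nodes.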
It is desirable to provide an efficient XPath evaluation technique that addresses the above-noted problems.