XML is a versatile markup language, capable of labeling the information content of diverse data sources including structured and semi-structured documents, relational databases, and object repositories. As increasing amounts of information are stored, exchanged, and presented using XML, the ability to intelligently query XML data sources becomes increasingly important. One of the great strengths of XML is its flexibility in representing many different kinds of information from diverse sources. To exploit this flexibility, an XML query language must provide features for retrieving and interpreting information from these diverse sources. A query language that uses the structure of XML intelligently can express queries across all these kinds of data, whether physically stored in XML or viewed as XML via middleware.
A query language called XQuery is designed to be broadly applicable across many types of XML data sources. XQuery is designed to meet the requirements identified by the W3C XML Query Working Group. It is designed to be a language in which queries are concise and easily understood. It is also flexible enough to query a broad spectrum of XML information sources, including both databases and documents. For example, XML-enabled database management systems, such as IBM DB2, Oracle DBMS (10g Release 2), and Microsoft SQL Server, allow XML data to be queried using the XQuery language. XQuery operates on a data model called the XQuery Data Model. Additional details regarding the XQuery Data Model may be found at http://www.w3.org/TR/xpath-datamodel.
A recent extension to the XQuery language called “XQuery Update Facility” allows XML data to be modified (i.e. updated). The XQuery Update Facility, described at http://www.w3.org/TR/xqupdate/, is designed to meet the requirements for updating instances of the XQuery/XPath Data Model (XDM). The XQuery Update Facility provides facilities to perform any or all of the following operations on an XDM instance: insertion of a node; deletion of a node; modification of a node by changing some of its properties while preserving its identity; and creation of a modified copy of a node with a new identity.
XML data is hierarchical (i.e. structured) and is often represented by a tree of nodes. For example, FIG. 2 shows an exemplary XML document, representing a list of employees by name and position. FIG. 3 shows a node tree representation of the XML document shown in FIG. 2. FIG. 3 shows five different kinds of nodes. The XQuery Data Model actually defines seven kinds of nodes at http://www.w3.org/TR/xpath-datamodel/#Node. The seven kinds of nodes are: document, element, attribute, text, namespace, processing instruction, and comment.
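The notion of node kinds can be illustrated with a short sketch. The sample document below is hypothetical (it is not the document of FIG. 2), and the sketch uses Python's standard DOM implementation, which is related to but not identical to the XQuery Data Model; in particular, the DOM does not expose namespace nodes. The names SAMPLE, KIND_NAMES, and node_kinds are assumptions made for illustration only.

```python
from xml.dom import minidom, Node

# Hypothetical employee list, loosely modelled on the kind of document described above.
SAMPLE = (
    '<?xml version="1.0"?><!-- staff list -->'
    '<employees><employee id="1"><name>Ann</name>'
    '<position>Engineer</position></employee></employees>'
)

# Map DOM node-type constants to the node-kind names used in the text.
KIND_NAMES = {
    Node.DOCUMENT_NODE: "document",
    Node.ELEMENT_NODE: "element",
    Node.ATTRIBUTE_NODE: "attribute",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
}


def node_kinds(node, found=None):
    """Walk the tree rooted at `node` and return the set of node kinds seen."""
    found = set() if found is None else found
    found.add(KIND_NAMES.get(node.nodeType, "other"))
    if node.nodeType == Node.ELEMENT_NODE:
        # Attributes are not child nodes in the DOM, so visit them explicitly.
        for i in range(node.attributes.length):
            found.add(KIND_NAMES[node.attributes.item(i).nodeType])
    for child in node.childNodes:
        node_kinds(child, found)
    return found


document = minidom.parseString(SAMPLE)
kinds = node_kinds(document)
```

Attributes are visited separately because a tree walk over child nodes alone would never reach them.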
The XQuery Update Facility has defined the following update operations, also called update-primitives, which can be applied to different nodes of an XML document or a fragment thereof:
        1. upd:insertBefore. This operation allows one or more nodes to be inserted before the given node.
        2. upd:insertAfter. This operation allows one or more nodes to be inserted after the given node.
        3. upd:insertInto. This operation allows one or more nodes to be inserted as children of the given node.
        4. upd:insertIntoAsFirst. This operation allows one or more nodes to be inserted as the first children of the given node.
        5. upd:insertIntoAsLast. This operation allows one or more nodes to be inserted as the last children of the given node.
        6. upd:insertAttributes. This operation allows one or more attribute nodes to be inserted as children of the given element node.
        7. upd:replaceNode. This operation allows a node to be replaced by one or more nodes.
        8. upd:deleteNode. This operation allows a node and all its descendants to be deleted.
        9. upd:replaceValueOf. This operation allows the value of an attribute/text/processing-instruction/comment node to be replaced with a string value.
        10. upd:replaceElementContent. This operation replaces the non-attribute children of an element with a string value.
        11. upd:rename. This operation allows element/attribute/processing-instruction nodes to be renamed.
In accordance with the XQuery Update Facility, update operations on a given input XML fragment are performed in the following phases:
Phase 1: Collect all the update operations in a pending update list (PUL), which is a list of triplets. Each triplet consists of three items: (a) a reference to the target node, i.e. the node to be modified; (b) the operation kind (one of the 11 operation kinds described above); and (c) the new value, where applicable, e.g. the new string value for upd:replaceValueOf, which replaces a node's value, or the new sequence of nodes for upd:replaceNode, which replaces a node.
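A pending update list of this shape might be sketched as follows. This is only an illustration under stated assumptions: the names OPERATION_KINDS, UpdatePrimitive, and collect are hypothetical, and target nodes are stood in for by plain path-like strings rather than real node references.

```python
from dataclasses import dataclass
from typing import Any, List

# The 11 operation kinds listed in the text.
OPERATION_KINDS = (
    "upd:insertBefore", "upd:insertAfter", "upd:insertInto",
    "upd:insertIntoAsFirst", "upd:insertIntoAsLast", "upd:insertAttributes",
    "upd:replaceNode", "upd:deleteNode", "upd:replaceValueOf",
    "upd:replaceElementContent", "upd:rename",
)


@dataclass(frozen=True)
class UpdatePrimitive:
    """One triplet of the pending update list (hypothetical representation)."""
    target: Any            # (a) reference to the node to be modified
    kind: str              # (b) one of OPERATION_KINDS
    new_value: Any = None  # (c) new value; unused by upd:deleteNode

    def __post_init__(self):
        if self.kind not in OPERATION_KINDS:
            raise ValueError(f"unknown operation kind: {self.kind}")


def collect(pul: List[UpdatePrimitive], target, kind, new_value=None):
    """Phase 1: append one update operation to the pending update list."""
    pul.append(UpdatePrimitive(target, kind, new_value))
    return pul


# Targets are illustrative strings, not real node references.
pul: List[UpdatePrimitive] = []
collect(pul, "employee[1]/position", "upd:replaceValueOf", "Manager")
collect(pul, "employee[2]", "upd:deleteNode")
```

Nothing is modified during collection; the list merely records what is to be done later.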
Phase 2: Check the update operations in the PUL for compatibility as follows:
        A target node cannot participate in two upd:rename operations.
        A target node cannot participate in two upd:replaceNode operations.
        A target node cannot participate in two upd:replaceValueOf operations.
        A target node cannot participate in two upd:replaceElementContent operations.
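The compatibility rules above amount to checking that no (target node, operation kind) pair occurs twice for the four listed kinds. A minimal sketch, assuming the PUL is a list of (target, kind, new value) tuples and that targets are comparable values such as strings; the names EXCLUSIVE_KINDS and find_conflicts are hypothetical:

```python
from collections import Counter

# Kinds that may occur at most once per target node, per the rules in the text.
EXCLUSIVE_KINDS = {
    "upd:rename",
    "upd:replaceNode",
    "upd:replaceValueOf",
    "upd:replaceElementContent",
}


def find_conflicts(pul):
    """Return the sorted (target, kind) pairs that violate the Phase 2 rules.

    `pul` is an iterable of (target, kind, new_value) triplets.
    """
    counts = Counter(
        (target, kind) for target, kind, _ in pul if kind in EXCLUSIVE_KINDS
    )
    return sorted(pair for pair, n in counts.items() if n > 1)


example_pul = [
    ("name[1]", "upd:rename", "fullName"),
    ("name[1]", "upd:rename", "displayName"),    # conflicts with the line above
    ("employees", "upd:insertInto", "<employee/>"),
    ("employees", "upd:insertInto", "<employee/>"),  # allowed: not an exclusive kind
]
conflicts = find_conflicts(example_pul)
```

An empty result means the PUL passes the check and Phase 3 may proceed.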
Phase 3: During this phase update operations are applied on an input XML fragment in multiple passes. Each pass is a traversal of the XML fragment being modified.
Pass 1: All upd:insertInto, upd:insertAttributes, upd:replaceValueOf, upd:rename, and upd:deleteNode primitives are applied. upd:deleteNode only marks the node for deletion instead of deleting it immediately.
Pass 2: All upd:insertBefore, upd:insertAfter, upd:insertIntoAsFirst, and upd:insertIntoAsLast primitives are applied.
Pass 3: All upd:replaceNode primitives are applied.
Pass 4: All upd:replaceElementContent primitives are applied.
Pass 5: All nodes marked for deletion are deleted.
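The assignment of primitives to passes can be sketched as a simple lookup, which also makes visible that each non-empty pass implies a separate traversal of the fragment. This is an illustration only: it uses the primitive names from the list of eleven operations, represents the PUL as (target, kind, new value) tuples, and the names PASS_OF_KIND, schedule, and the "remove-marked" marker are hypothetical.

```python
# Pass number in which each primitive is applied, following the text.
PASS_OF_KIND = {
    "upd:insertInto": 1, "upd:insertAttributes": 1, "upd:replaceValueOf": 1,
    "upd:rename": 1, "upd:deleteNode": 1,
    "upd:insertBefore": 2, "upd:insertAfter": 2,
    "upd:insertIntoAsFirst": 2, "upd:insertIntoAsLast": 2,
    "upd:replaceNode": 3,
    "upd:replaceElementContent": 4,
}


def schedule(pul):
    """Group (target, kind, new_value) triplets by the pass that applies them."""
    passes = {n: [] for n in range(1, 6)}
    for target, kind, new_value in pul:
        passes[PASS_OF_KIND[kind]].append((target, kind, new_value))
        if kind == "upd:deleteNode":
            # Pass 1 only marks the node; the actual removal happens in pass 5.
            passes[5].append((target, "remove-marked", None))
    return passes


plan = schedule([
    ("employee[2]", "upd:deleteNode", None),
    ("employee[1]", "upd:insertAfter", "<employee/>"),
    ("position[1]", "upd:replaceElementContent", "Manager"),
])
traversals = sum(1 for ops in plan.values() if ops)
```

In this small example four of the five passes have work to do, so the fragment would be traversed four times.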
Phase 4: If, as a net result of the above passes, some node contains adjacent text node children, these adjacent text nodes are merged into a single text node. Note that a text node is a node that encapsulates XML character content. A text node has the following properties: it has content; it optionally can have a parent node; and if a text node has a parent, the text node must not contain the zero-length string as its content.
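The merge step can be sketched on a simplified child list. The representation here is an assumption made for illustration: each child is a (kind, value) tuple, and merge_adjacent_text is a hypothetical name. The sketch also drops any zero-length text node left over after merging, on the assumption that this is how the zero-length constraint stated above would be maintained.

```python
def merge_adjacent_text(children):
    """Merge runs of adjacent text children into single text nodes.

    `children` is a list of (kind, value) tuples, e.g. ("text", "abc")
    or ("element", "name").
    """
    merged = []
    for kind, value in children:
        if kind == "text" and merged and merged[-1][0] == "text":
            # Concatenate with the preceding text node instead of appending.
            merged[-1] = ("text", merged[-1][1] + value)
        else:
            merged.append((kind, value))
    # Assumption: a text node with a parent must not be empty, so drop empties.
    return [child for child in merged if child != ("text", "")]


before = [("text", "Jo"), ("text", "hn"), ("element", "position"), ("text", "")]
after = merge_adjacent_text(before)
```

Element children act as boundaries: text nodes on either side of an element are never merged with each other.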
Additional details of these four phases may be found in the XQuery Update Facility document at http://www.w3.org/TR/xqupdate/, section 3.2.2 upd:applyUpdates.
It will be appreciated by those skilled in the art that applying update operations in four phases is inefficient. In particular, applying Phase 3 in five passes requires up to five traversals of the XML document. This inefficiency has a number of disadvantages for the processing of XQuery updates, such as increased computational burden and slower updates.
Accordingly, there is a need for systems and methods for increasing the efficiency of the processing of XQuery Updates so that such updates can be processed faster and with a lower computational burden.