The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
XML generation and aggregation based on location paths (such as XPaths) are common operations performed during the evaluation of an XML query (such as an XQuery query). In many cases, an XML query specifies that the query results be returned in a structured XML-based format that may differ from the format in which the returned XML data is stored in the original XML documents over which the query is executed. Processing such an XML query involves generating a query result in which data from the relevant XML fragments of the original XML documents (typically identified by XPaths) is embedded, but is identified by the new XML tags that are specified in the query.
For example, the following XQuery query Q1 is from the XMark benchmark testing standard:
Q1:
    declare namespace xm="xmark.xsd";
    for $i in distinct-values($auction/xm:site/people/person/profile/interest/@category)
    let $p :=
      for $t in $auction/xm:site/people/person
      where $t/profile/interest/@category = $i
      return
        <personne>
          <statistiques>
            <sexe>{$t/profile/gender/text()}</sexe>
            <age>{$t/profile/age/text()}</age>
            <education>{$t/profile/education/text()}</education>
            <revenu>{fn:data($t/profile/@income)}</revenu>
          </statistiques>
          <coordonnees>
            <nom>{$t/name/text()}</nom>
            <rue>{$t/address/street/text()}</rue>
            <ville>{$t/address/city/text()}</ville>
            <pays>{$t/address/country/text()}</pays>
            <reseau>
              <courrier>{$t/emailaddress/text()}</courrier>
              <pagePerso>{$t/homepage/text()}</pagePerso>
            </reseau>
          </coordonnees>
          <cartePaiement>{$t/creditcard/text()}</cartePaiement>
        </personne>
    return <categorie>{<id>{$i}</id>, $p}</categorie>
In the underlying XML documents against which the above query Q1 is executed, a person's gender, age, education, and income are stored in XML nodes that are identified by the XPaths “$t/profile/gender/text()”, “$t/profile/age/text()”, “$t/profile/education/text()”, and “$t/profile/@income”, respectively. The above query Q1 returns a query result with “<categorie>” as the root element, in which XPath-based XML fragments (for example, a fragment identified by the XPath “$t/profile/age/text()”) are embedded. In the returned query result, the embedded XML fragments are identified by the XML tags specified in the query and not by the names of the XML elements in the underlying XML documents over which query Q1 is executed. For example, an XML fragment that provides various data about a person would be identified in the query result of query Q1 by the tag “<statistiques>”, and within this XML fragment the person's gender, age, education, and income would be identified by the tags “<sexe>”, “<age>”, “<education>”, and “<revenu>”, respectively.
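The re-tagging that query Q1 performs on a person's profile data can be illustrated outside of XQuery. The following is a minimal Python sketch, not a description of how any XML database evaluates Q1. It assumes a small hypothetical sample document (the SOURCE string) whose element names follow the XPaths used in Q1, and it introduces the hypothetical names STAT_FIELDS and build_statistiques purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names follow the XPaths used in Q1.
SOURCE = """
<person id="person0">
  <name>Ada Example</name>
  <profile income="52000.00">
    <gender>female</gender>
    <age>34</age>
    <education>Graduate School</education>
  </profile>
</person>
"""

# (new tag in the query result, location path relative to the person element)
STAT_FIELDS = [
    ("sexe", "profile/gender"),
    ("age", "profile/age"),
    ("education", "profile/education"),
]


def build_statistiques(person):
    """Build a <statistiques> fragment whose children carry the new tags."""
    stats = ET.Element("statistiques")
    for new_tag, path in STAT_FIELDS:
        # Copy the text found at the source path under the tag named by the query.
        ET.SubElement(stats, new_tag).text = person.findtext(path, default="")
    # Income is stored as an attribute of <profile>, not as element text.
    profile = person.find("profile")
    income = profile.get("income", "") if profile is not None else ""
    ET.SubElement(stats, "revenu").text = income
    return stats


person = ET.fromstring(SOURCE)
stats = build_statistiques(person)
print(ET.tostring(stats, encoding="unicode"))
```

The source element names (gender, age, education) and the income attribute do not appear in the output; only the tags chosen by the query do.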
Furthermore, many queries specify that XML fragments from various locations within an XML document, or across several XML documents, need to be aggregated into one result document. Such aggregation of XML data is typically performed in conjunction with, and during, the generation of the query results. The following XQuery query Q2 (also from the XMark benchmark testing standard) is an example of a query for which XML data generation and aggregation are typically performed in conjunction with each other:
Q2:
    declare namespace xm="xmark.xsd";
    let $ca := $auction/xm:site/closed_auctions/closed_auction return
    let $ei := $auction/xm:site/regions/europe/item
    for $p in $auction/xm:site/people/person
    let $a :=
      for $t in $ca
      where $p/@id = $t/buyer/@person
      return
        let $n := for $t2 in $ei where $t/itemref/@item = $t2/@id return $t2
        return <item>{$n/name/text()}</item>
    return <person name="{$p/name/text()}">{$a}</person>
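The aggregation that query Q2 expresses, in which data from three separate locations of a document (people, closed auctions, and European items) is combined into one result element per person, can be sketched as follows. This is an illustrative Python approximation under stated assumptions, not an account of how a query engine evaluates Q2: the SOURCE document is hypothetical sample data, the "xm" namespace is omitted for brevity, item identifiers are assumed to be unique, and the function name aggregate_purchases is invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data shaped like the paths used in Q2 (namespace omitted).
SOURCE = """
<site>
  <regions><europe>
    <item id="item0"><name>clock</name></item>
    <item id="item1"><name>lamp</name></item>
  </europe></regions>
  <people>
    <person id="person0"><name>Ada Example</name></person>
    <person id="person1"><name>Bo Sample</name></person>
  </people>
  <closed_auctions>
    <closed_auction><buyer person="person0"/><itemref item="item1"/></closed_auction>
    <closed_auction><buyer person="person0"/><itemref item="item9"/></closed_auction>
  </closed_auctions>
</site>
"""


def aggregate_purchases(site):
    """Return one <person> element per person, holding an <item> per purchase."""
    # Index European item names by id (assumes ids are unique).
    item_names = {
        item.get("id"): item.findtext("name", default="")
        for item in site.findall("regions/europe/item")
    }
    closed = site.findall("closed_auctions/closed_auction")
    results = []
    for p in site.findall("people/person"):
        out = ET.Element("person", name=p.findtext("name", default=""))
        for t in closed:
            buyer = t.find("buyer")
            if buyer is None or buyer.get("person") != p.get("id"):
                continue
            ref = t.find("itemref")
            item_id = ref.get("item") if ref is not None else None
            # Like Q2, emit an <item> for every matching auction; it stays
            # empty when the purchased item is not a European item.
            ET.SubElement(out, "item").text = item_names.get(item_id, "")
        results.append(out)
    return results


people = aggregate_purchases(ET.fromstring(SOURCE))
for element in people:
    print(ET.tostring(element, encoding="unicode"))
```

Each output element draws on all three parts of the source document, which is why generation and aggregation are usually carried out together.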
In many cases, the result of an XML query that requires XML data generation and aggregation can be quite large. This is partly due to the size of the XML fragments that are embedded in the query result.
However, past approaches for XML data generation and aggregation are inefficient and do not scale well for queries that return large results. This is because the past approaches perform the generation and aggregation operations by serializing the entire query result (typically in plain text or in XML 1.0 format) and storing the serialized result into a temporary large object (LOB). Serializing the entire query result and storing it into a temporary LOB causes a major performance problem, mainly due to the large number of disk input/output (I/O) operations that are involved and the large amount of volatile memory that is consumed. This performance problem is further exacerbated because the past approaches evaluate a query by serializing all the necessary intermediate results, which themselves may be quite large because of the size of the XML fragments embedded therein. According to the past approaches, all the necessary intermediate results are generated and serialized during the intermediate stages of processing the query, even though not all of the intermediate results would be returned in the final result of the query.
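The cost pattern described above can be made concrete with a toy model. The following Python sketch is only an illustration under stated assumptions and does not reproduce any actual database implementation: the TempLob class is a hypothetical in-memory stand-in for a temporary LOB, the SOURCE document is hypothetical sample data, and eager_names is an invented function that serializes every intermediate fragment in full even though the final result keeps only the item names.

```python
import io
import xml.etree.ElementTree as ET


class TempLob:
    """Hypothetical stand-in for a temporary LOB: an in-memory byte buffer."""

    bytes_written = 0  # running total across all TempLob instances

    def __init__(self, data: bytes):
        self._buf = io.BytesIO(data)
        TempLob.bytes_written += len(data)

    def read(self) -> bytes:
        return self._buf.getvalue()


# Hypothetical sample data: each item carries more content than the query returns.
SOURCE = """<site>
  <item><name>clock</name><description>A brass mantel clock with a chime and a glass door.</description></item>
  <item><name>lamp</name><description>A reading lamp with a green shade and a heavy base.</description></item>
</site>"""


def eager_names(site):
    """Serialize each intermediate <item> in full, then build the final result."""
    # Intermediate stage: every item fragment is serialized into its own buffer.
    lobs = [TempLob(ET.tostring(item)) for item in site.findall("item")]
    # Final stage: only the names are kept, so the descriptions were
    # serialized and stored without ever reaching the query result.
    root = ET.Element("names")
    for lob in lobs:
        item = ET.fromstring(lob.read())
        ET.SubElement(root, "name").text = item.findtext("name", default="")
    return TempLob(ET.tostring(root))


final = eager_names(ET.fromstring(SOURCE))
final_bytes = len(final.read())
intermediate_bytes = TempLob.bytes_written - final_bytes
print("intermediate bytes serialized:", intermediate_bytes)
print("final result bytes:", final_bytes)
```

In this toy model the bytes serialized for intermediate results exceed the size of the final result, which mirrors the overhead attributed to the past approaches.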