In the related art, structured document management devices that store and search structured document data described in Extensible Markup Language (XML) or a similar format are known. To allow such a device to search structured document data, an XML query language (XQuery) has been defined for XML data, in the same way that the query language SQL is defined for a relational database management system (RDBMS), and XQuery is supported by many structured document management devices.
XQuery is a language for treating an XML dataset like a database, and it provides means for acquiring, aggregating, and analyzing data that meets certain conditions. Since XML data has a layered logical structure (hierarchical structure) in which elements are combined as parents, children, and siblings, conditions on this hierarchical structure (structural conditions) can be designated as search conditions.
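As an illustration of a structural condition, the following minimal sketch selects the titles of books whose `author` child has a `last` child equal to a given value. It uses the limited XPath subset of Python's standard `xml.etree.ElementTree` module rather than XQuery itself, and the sample catalog, the element names, and the function `titles_by_author_last` are assumptions introduced only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: three books, only the first has an author named Smith.
CATALOG = """
<catalog>
  <book><title>A</title><author><last>Smith</last></author></book>
  <book><title>B</title><author><last>Jones</last></author></book>
  <book><title>Smith</title><editor><last>Smith</last></editor></book>
</catalog>
"""

def titles_by_author_last(xml_text, last_name):
    """Return titles of books that satisfy the structural condition
    book/author/last == last_name (parent, child, and grandchild must line up)."""
    root = ET.fromstring(xml_text)
    titles = []
    for book in root.findall("book"):
        # "author[last='...']" matches an author child that itself has a
        # last child whose full text equals the given value.
        if book.find(f"author[last='{last_name}']") is not None:
            titles.append(book.findtext("title"))
    return titles

print(titles_by_author_last(CATALOG, "Smith"))
```

The third book contains the value `Smith` twice, but not at the position the condition names, so it is not selected. The value alone does not decide the match; its place in the hierarchy does.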
A search technique that checks whether XML data meets designated structural conditions is provided by the Simple API for XML (SAX) or the like, which is a typical parsing process for XML data. However, with SAX, the structured document to be searched (in this example, XML) can only be accessed from a higher layer toward a lower layer. Thus, when a refining condition is designated on a lower layer, the condition cannot be applied until that lower layer has been reached. Consequently, whenever such a refining condition exists, every structured document must be traced from its top layer down to its lower layers.
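The top-down restriction can be seen in a minimal sketch that uses Python's standard `xml.sax` module. The handler class `PathMatch`, the function `document_matches`, and the sample documents are hypothetical names chosen for this illustration. The handler counts the start-element events it receives, which shows that a non-matching document is traversed just as fully as a matching one.

```python
import xml.sax

class PathMatch(xml.sax.ContentHandler):
    """Tracks the current element path and records whether the element at
    target_path has text equal to value."""

    def __init__(self, target_path, value):
        super().__init__()
        self.target_path = tuple(target_path)
        self.value = value
        self.stack = []       # names of the currently open elements, top layer first
        self.buffer = []      # text collected for the innermost open element
        self.matched = False
        self.events = 0       # number of start-element events seen

    def startElement(self, name, attrs):
        self.events += 1
        self.stack.append(name)
        self.buffer = []

    def characters(self, content):
        self.buffer.append(content)

    def endElement(self, name):
        text = "".join(self.buffer).strip()
        if tuple(self.stack) == self.target_path and text == self.value:
            self.matched = True
        self.stack.pop()
        self.buffer = []

def document_matches(xml_bytes, target_path, value):
    """Parse one document from the top and return (matched, events)."""
    handler = PathMatch(target_path, value)
    xml.sax.parseString(xml_bytes, handler)
    return handler.matched, handler.events

DOCS = [
    b"<book><title>A</title><author><last>Smith</last></author></book>",
    b"<book><title>B</title><author><last>Jones</last></author></book>",
]
for doc in DOCS:
    print(document_matches(doc, ("book", "author", "last"), "Smith"))
```

Both documents produce the same number of events. The condition on the lowest layer can only be tested once the parser has descended to it, so it does nothing to avoid parsing the document that fails it.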
To accelerate the search process of a structured document management device, it is preferable to apply the refining condition as early as possible so as to reduce the intermediate data produced during the search. For this purpose, a technique is also known that evaluates the structural condition only on a set of structured documents that has first been narrowed down using an index. However, this technique is not compatible with a nested query, that is, a query made up of a plurality of subqueries.
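The index-based narrowing can be sketched as follows, assuming a simple inverted index from element text values to document identifiers. The index layout, the functions `build_value_index` and `search`, and the sample documents are assumptions made for this illustration and are not taken from any particular device.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Hypothetical document store keyed by document id.
DOCS = {
    1: "<book><title>A</title><author><last>Smith</last></author></book>",
    2: "<book><title>B</title><author><last>Jones</last></author></book>",
    3: "<book><title>Smith</title><author><last>Brown</last></author></book>",
}

def build_value_index(docs):
    """Map each non-empty element text value to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for elem in ET.fromstring(text).iter():
            if elem.text and elem.text.strip():
                index[elem.text.strip()].add(doc_id)
    return index

def search(docs, index, path, value):
    """Use the index to pick candidate documents, then check the structural
    condition (path relative to the root element) only on those candidates."""
    candidates = index.get(value, set())
    hits = []
    for doc_id in sorted(candidates):
        root = ET.fromstring(docs[doc_id])
        if any((e.text or "").strip() == value for e in root.findall(path)):
            hits.append(doc_id)
    return hits, len(candidates)

index = build_value_index(DOCS)
print(search(DOCS, index, "author/last", "Smith"))
```

In this sketch the index removes document 2 before any structural check takes place, while document 3 remains a candidate because it contains the value at a different position and is rejected only by the structural check.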
For nested queries, various attempts have been made to apply a refining condition across subqueries at an early stage. For example, for the RDB model, a technique is known in which the relations between the subqueries that constitute a query are represented in graph form, a condition under which a predicate may be moved between graphs is defined, and, where possible, the predicate is moved to another subquery to optimize the query.
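The effect of moving a predicate into another subquery can be illustrated with a minimal sketch over relational rows. This is not the graph-based technique itself, only an assumed two-level query in which the outer predicate refers solely to a column that the inner subquery passes through unchanged, which is what makes the move safe here. The table `ORDERS` and the functions `inner_subquery`, `outer_query`, `is_large`, and `run` are hypothetical.

```python
# Hypothetical relational table.
ORDERS = [
    {"id": 1, "customer": "ann", "total": 120},
    {"id": 2, "customer": "bob", "total": 40},
    {"id": 3, "customer": "cy", "total": 300},
    {"id": 4, "customer": "dee", "total": 15},
]

def is_large(row):
    """Refining condition; it uses only the 'total' column."""
    return row["total"] >= 100

def inner_subquery(rows, pushed=None):
    """Project rows to (customer, total); apply a pushed-down predicate if given."""
    return [
        {"customer": r["customer"], "total": r["total"]}
        for r in rows
        if pushed is None or pushed(r)
    ]

def outer_query(intermediate, predicate=None):
    """Return customer names, filtering here only if the predicate was not pushed down."""
    return sorted(
        r["customer"] for r in intermediate if predicate is None or predicate(r)
    )

def run(push_down):
    """Evaluate the nested query and return (result, size of intermediate data)."""
    if push_down:
        intermediate = inner_subquery(ORDERS, pushed=is_large)
        return outer_query(intermediate), len(intermediate)
    intermediate = inner_subquery(ORDERS)
    return outer_query(intermediate, predicate=is_large), len(intermediate)

print(run(push_down=False))
print(run(push_down=True))
```

Both evaluations return the same customers, but the pushed-down variant materializes fewer intermediate rows, which is the benefit that early application of a refining condition aims for.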
However, unlike the structured document data model, the RDB model has no issues associated with a hierarchical structure or with the order relation between elements, and it does not support element identifiers (IDs). Thus, there are cases where the query rewriting approach used for the RDB cannot be applied to structured documents.
Moreover, for queries on XML, which is a structured document, conditions on a virtual XML document called a view may be copied to points immediately before and after the creation of the view and applied there, which may eliminate unnecessary tracing of the elements of the view. However, besides the fact that the target is limited to views, the target conditions are simply copied to all possible locations at the stage of creating the view and applied there, so the same conditions are evaluated at a plurality of locations. Further, when there are a plurality of conditions to be copied, the logical sum of the predicate conditions is copied unconditionally, so there is a problem in that the refinement achieved by the predicate conditions is insufficient. An object of the present invention is to provide a structured document management device, method, and program capable of searching at a high speed.