1. Field of the Invention
The invention generally relates to computer database systems. More particularly, the invention relates to techniques for composing queries of hierarchical data.
2. Description of the Related Art
Extensible Markup Language (XML) is a widely adopted standard for describing data. Using XML, data may be structured and stored in a document, known as an XML document. Typically, an XML document stores data using a collection of markup tags that mark the start and end points of XML elements. Each XML element may also contain one or more name-value pairs known as attributes.
Commonly, XML documents may be used to store hierarchical data, meaning that the XML elements are organized into a hierarchical structure, with the elements linked by parent-child relationships. In such cases, the hierarchical data may be modeled as a tree made up of connected nodes. The nodes at one level of the tree may be linked to one or more nodes at a different level, with linked nodes at a higher level referred to as parent (or ancestor) nodes, and those in lower levels referred to as child (or descendant) nodes. A parent node may represent a first element, and a child node may represent either an attribute of the first element, or a second element nested inside the first element. XML documents may include a single node at the top level of the tree, referred to as the root node. The XML Path Language, known as XPath, is an expression language that provides the ability to access nodes of a tree hierarchy. One type of XPath expression is a path expression, which is written as a sequence of steps to get from one node of the tree to another node.
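By way of illustration only, the tree model and a simple path expression may be sketched as follows. The sketch assumes Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath; the document contents and the element names (library, book, title, author) are hypothetical and are not drawn from any particular system.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element names and values are illustrative only.
DOC = """
<library>
  <book id="b1">
    <title>Dune</title>
    <author>Herbert</author>
  </book>
  <book id="b2">
    <title>Emma</title>
    <author>Austen</author>
  </book>
</library>
"""

# The single top-level element, <library>, is the root node of the tree.
root = ET.fromstring(DOC.strip())

# Path expression written as a sequence of steps: from the root, step to
# each <book> child element, then to the <title> child of that book.
titles = [title.text for title in root.findall("./book/title")]

# Attributes are name-value pairs attached to an element; here each
# <book> element carries an "id" attribute.
ids = [book.get("id") for book in root.findall("./book")]

print(titles)
print(ids)
```

In this sketch, each book element is a parent node whose children include both nested elements (title, author) and an attribute (id).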
In some situations, a query tool may be provided to users wishing to create queries of hierarchical data. In one approach, the query tool may allow the user to specify a text keyword search to locate any data records that match one or more of the specified keywords. However, this approach can often return inexact results. That is, since the keywords may be present in nodes of the hierarchy which are of no interest to the end user, the query results may include many records that are not useful.
In another approach, a query tool may be configured by expert users (i.e., developers) to enable end users to specify predicates for specific nodes of the tree structure. This approach requires that, as part of configuring the query tool, the developers identify certain key nodes that may be queried by the end users, as well as the paths required to access each of the possible combinations of the key nodes. Such a query tool allows an end user to perform an exact search, meaning that the query returns only the data records that precisely match the specified predicates. This approach is most common in situations where there are a limited number of nodes that the user may wish to query.
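The difference between the keyword approach and the predicate approach may be illustrated with a minimal sketch. The record structure, the element names (record, patient, physician, name), and the helper functions keyword_search and predicate_search are all hypothetical, chosen only to show how a keyword can match a node of no interest to the user while a predicate bound to a specific path does not.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: "Jones" appears as a patient in one record and
# as a physician in the other.
RECORDS = """
<records>
  <record id="r1">
    <patient><name>Smith</name></patient>
    <physician><name>Jones</name></physician>
  </record>
  <record id="r2">
    <patient><name>Jones</name></patient>
    <physician><name>Lee</name></physician>
  </record>
</records>
"""

root = ET.fromstring(RECORDS.strip())


def keyword_search(root, keyword):
    """Return ids of records in which any node's text contains the keyword."""
    return [
        rec.get("id")
        for rec in root.findall("./record")
        if any(keyword in (node.text or "") for node in rec.iter())
    ]


def predicate_search(root, path, value):
    """Return ids of records whose node at `path` has text equal to `value`."""
    return [
        rec.get("id")
        for rec in root.findall("./record")
        if any(node.text == value for node in rec.findall(path))
    ]


# The keyword search matches both records, including the one in which
# "Jones" is the physician rather than the patient.
keyword_hits = keyword_search(root, "Jones")

# The predicate is tied to a developer-mapped path, so only the record
# whose patient is named "Jones" is returned.
exact_hits = predicate_search(root, "./patient/name", "Jones")

print(keyword_hits)
print(exact_hits)
```

Note that the exact search depends on the path "./patient/name" having been identified in advance, which corresponds to the configuration work performed by the developers.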
However, where the tree structure is complicated (i.e., having many levels and nodes), or where there are many nodes that an end user may wish to query, the number of combinations of nodes, and thus the number of paths required to be mapped, can rapidly become too large to work with. Consequently, in such situations, this approach becomes impractical.
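The growth described above may be illustrated with a rough count. The sketch below assumes, purely for illustration, that a query may involve any non-empty subset of the key nodes, so that n key nodes yield 2^n - 1 combinations; the actual number of paths to be mapped depends on the tree structure and on how the query tool is configured. The function names are hypothetical.

```python
from itertools import combinations


def count_combinations(num_key_nodes):
    """Closed-form count of non-empty subsets of the key nodes (assumed model)."""
    return 2 ** num_key_nodes - 1


def enumerate_combinations(key_nodes):
    """List every non-empty combination of key nodes, for small inputs only."""
    return [
        combo
        for size in range(1, len(key_nodes) + 1)
        for combo in combinations(key_nodes, size)
    ]


# Cross-check the closed form against explicit enumeration for three
# hypothetical key nodes.
small = enumerate_combinations(["name", "date", "status"])
assert len(small) == count_combinations(3)

# The count roughly doubles with each additional key node.
for n in (3, 10, 20):
    print(n, count_combinations(n))
```

Under this assumption, three key nodes give 7 combinations, ten give 1,023, and twenty give 1,048,575.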
Accordingly, for the reasons discussed above, there is a need for improved techniques for composing queries of hierarchical data.