As Extensible Markup Language (“XML”) becomes mainstream, it is becoming more difficult to find relevant information within the growing collections of XML documents. One way of finding information, which has been sufficient in the past for small collections of XML documents, is to perform a full scan of all XML documents in a collection. While a full scan can be used to find information within the collection, it is very slow for larger collections because it scans irrelevant documents and irrelevant portions of documents. Even for smaller collections, a full scan does not allow the user to target the search to a particular context. In other words, a full scan returns every instance of a keyword in the collection of documents rather than only the relevant results in relevant portions of relevant documents.
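By way of illustration only, the full-scan approach described above might be sketched as follows. The document names, element names, and contents are hypothetical and chosen solely for the example; the sketch uses Python's standard `xml.etree.ElementTree` module and is not intended to represent any particular system.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection: names and contents are invented for illustration.
COLLECTION = {
    "po1.xml": "<order><customer>Acme</customer><note>ship to Acme warehouse</note></order>",
    "po2.xml": "<order><customer>Globex</customer><note>bill Acme for freight</note></order>",
    "po3.xml": "<order><customer>Initech</customer><note>no special handling</note></order>",
}

def full_scan(collection, keyword):
    """Parse every document and visit every element, recording each keyword hit."""
    hits = []
    needle = keyword.lower()
    for name, xml_text in collection.items():   # every document is parsed
        root = ET.fromstring(xml_text)
        for elem in root.iter():                 # every element is visited
            if elem.text and needle in elem.text.lower():
                hits.append((name, elem.tag))
    return hits

if __name__ == "__main__":
    for doc, tag in full_scan(COLLECTION, "acme"):
        print(doc, tag)
```

In this sketch, a user interested only in orders whose customer is "Acme" receives hits from `note` elements as well, and every document is parsed even when it contains no match. Any restriction to a particular element has to be applied after the scan has already done all of the work.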
Another way to find information within the collection involves the use of text keywords. Specifically, many database systems support text indexes that can be queried for certain keywords. However, this technique can only be used to find a small subset of text within the collection of XML documents.
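For illustration only, a keyword text index of the kind mentioned above might be sketched as follows. The sketch assumes that the index maps each keyword to the documents containing it and discards the XML markup when the index is built; this is one common design and is an assumption of the example, not a description of how any particular database system builds its text indexes. All names and data are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical documents, invented for illustration.
DOCS = {
    "po1.xml": "<order><customer>Acme</customer><note>ship to Acme warehouse</note></order>",
    "po2.xml": "<order><customer>Globex</customer><note>bill Acme for freight</note></order>",
}

def build_text_index(docs):
    """Build a keyword -> set-of-document-names index, ignoring markup."""
    index = defaultdict(set)
    for name, xml_text in docs.items():
        # Assumption: tags are stripped crudely, so node structure is lost.
        plain = re.sub(r"<[^>]+>", " ", xml_text)
        for word in re.findall(r"\w+", plain.lower()):
            index[word].add(name)
    return index

def query(index, keyword):
    """Return the sorted names of documents that contain the keyword."""
    return sorted(index.get(keyword.lower(), set()))

if __name__ == "__main__":
    idx = build_text_index(DOCS)
    print(query(idx, "Acme"))
```

Under these assumptions, a query for "Acme" identifies both documents but gives no indication of whether the keyword occurred in a `customer` element or a `note` element, so the result cannot be narrowed to a particular node without further processing.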
There is a need for an efficient and complete method to perform node-aware full-text searches over XML documents in existing database systems. Current methods for searching XML documents are inefficient, incomplete, provide irrelevant results, and/or search irrelevant documents and irrelevant portions of these documents.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.