SQL/XML has become the standard means of extending SQL with XML data types and of querying XML columns with XQuery. Most commercial relational database systems support it, while others, such as Microsoft® SQL Server, support a dialect called SQLXML.
A table in SQL Server may contain columns of type XML. Such a column stores XML documents, which can then be queried using XQuery. To speed up the processing of queries against an XML column, a special type of index called an XML index can be created on the column. To create the index, the documents are completely shredded: every node in the documents is extracted and included in the index. This makes the index very large compared with the original size of the documents (on the order of 3 to 5 times larger).
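The effect of complete shredding can be illustrated with a minimal Python sketch. It does not reproduce the actual layout of a SQL Server XML index; the `shred` function, its `(path, kind, value)` row format, and the sample document are assumptions made only for illustration. It shows that every element, attribute, and text node becomes its own row, whether or not any query will ever touch it.

```python
import xml.etree.ElementTree as ET


def shred(xml_text):
    """Return one (path, kind, value) row per element, attribute, and
    non-empty text node. Hypothetical row format, for illustration only."""
    root = ET.fromstring(xml_text)
    rows = []

    def visit(elem, parent_path):
        path = f"{parent_path}/{elem.tag}"
        rows.append((path, "element", None))
        for name, value in elem.attrib.items():
            rows.append((f"{path}/@{name}", "attribute", value))
        # Only leading text is recorded; tail text is ignored in this sketch.
        text = (elem.text or "").strip()
        if text:
            rows.append((f"{path}/text()", "text", text))
        for child in elem:
            visit(child, path)

    visit(root, "")
    return rows


# Assumed sample document, not taken from the source text.
doc = '<order id="7"><item sku="A1">Widget</item><item sku="B2">Gadget</item></order>'
rows = shred(doc)
```

Even this three-element document yields several rows, each carrying a full path in addition to the stored value, which suggests why an index built this way grows well beyond the size of the documents themselves.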
Users have often been dissatisfied with the size of an XML index because, even when they are interested in querying only parts of a document, the index includes everything in the document. It is well known that in relational database systems judicious use of materialized views can speed up query processing by several orders of magnitude. In a relational database system that supports XML data, it is therefore important to extend the materialized view mechanisms to queries and views that also involve XML columns. To exploit materialized views, three problems need to be overcome. First, it must be determined which views to materialize. Second, it must be decided which views, if any, can be used to answer a query, and how to rewrite the query to make use of them. Finally, materialized views must be kept current in the presence of updates. To make it possible to query just a portion of an XML document while keeping the index size manageable, the second problem (called the “view matching problem”) must be addressed.
Previous work on the XML view matching problem has been very restrictive in the kinds of views that can be created or what can be included in a view. The reason is that there is no point in allowing views that are more complex than the given matching technique can handle.
Some existing techniques depend on knowledge of the actual data that is currently stored. This is problematic in a relational database system, however, because the data changes over time. A match that was valid at one point in time may no longer be valid at a later time. This prevents the reuse of query plans, which is standard practice in relational database systems.
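The following minimal Python sketch illustrates how a data-dependent match can become invalid. The view definition (`/a/b`), the query (`//b`), the function names, and the sample document are all hypothetical and chosen only for illustration. With the initial data the view happens to contain exactly the nodes the query asks for, but a later update breaks that equivalence.

```python
import xml.etree.ElementTree as ET


def view_rows(root):
    # Hypothetical view: the <b> children of the root <a>, i.e. /a/b.
    return [b.text for b in root.findall("./b")]


def query_rows(root):
    # Hypothetical query: every <b> anywhere in the document, i.e. //b.
    return [b.text for b in root.iter("b")]


# Assumed sample data: every <b> currently sits directly under <a>.
doc = ET.fromstring("<a><b>1</b><b>2</b></a>")
match_before = view_rows(doc) == query_rows(doc)

# An update adds a <b> nested under a new <c> element.
c = ET.SubElement(doc, "c")
ET.SubElement(c, "b").text = "3"
match_after = view_rows(doc) == query_rows(doc)
```

A plan that answered the query from the view on the strength of `match_before` would return incomplete results after the update, so such a plan could not safely be cached and reused.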