Two main standards for describing and querying data have evolved. One is based on the relational model, which is used by most modern databases. The other is based on a hierarchical model, examples of which include XML (Extensible Markup Language), XML Schema Language (XSD), and the XQuery language.
XML is a specification language created to describe data interchange formats and data semantics. An XML document consists of data annotation tags that represent relationships between data values. An XML schema is an auxiliary document describing the structure of an XML document making it easier to interpret. XQuery is a language for querying information from XML documents.
Before the inception of XML, the majority of data was stored in relational tables. A relational table is a data structure that represents a mathematical mapping among one or more types of data. Relational databases store information by organizing data in normalized tables, from which the stored information can be retrieved through query languages based on relational algebra, such as Structured Query Language (SQL).
As XML continues to gain popularity, the need for effective integration of hierarchical data expressions and relational data expressions grows. Effective integration between the two has proven difficult because of key differences between them. For example, XML documents organize data in a hierarchical structure with multiple levels of nesting, while the relational model organizes data in flat tables with inter-table functional dependencies. Additionally, in hierarchical data expressions, the document order of a node (the position at which the node occurs in the document) is important, while in relational data expressions document order is not relevant.
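The contrast between nested, ordered XML and flat, unordered tables can be illustrated with a minimal Python sketch. The sample document, the element names, and the `flatten` helper are hypothetical and chosen only for illustration; the sketch assumes that document order is preserved by writing it into an explicit `position` column.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
DOC = """
<order id="7">
  <item>pen</item>
  <item>ink</item>
  <item>paper</item>
</order>
"""


def flatten(xml_text):
    """Return (order_id, position, item) rows, one per <item> element."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    # findall() yields children in document order, so enumerate() records
    # each item's position in the document.
    for position, item in enumerate(root.findall("item"), start=1):
        rows.append((root.get("id"), position, item.text))
    return rows


rows = flatten(DOC)

# A relational table is an unordered set of rows. Document order survives
# only because it was materialized in the position column.
as_set = set(rows)
recovered = [item for _, _, item in sorted(as_set, key=lambda r: r[1])]
```

Without the `position` column, nothing in the flat rows would indicate which item came first in the document.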
Previous attempts have been made at developing techniques to effectively integrate hierarchical data schemas and relational data schemas. These attempts have suffered from problems such as excessive use of the computationally expensive “join” operation. Such attempts include XML shredding, as described in P. Bohannon et al., “LegoDB: Customizing Relational Storage for XML Documents,” 2002; mapping XML data values to a set of predefined tables based on node type; and mapping XML data to a relational table by number-encoding each of the XML data values.
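To make the join cost concrete, the following Python sketch shreds a small XML document into a single node table and then answers a three-step path query. It is a generic illustration and not the method of any of the cited works: the `node` table layout, the preorder numbering scheme, and the sample document are all assumptions made for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names are illustrative only.
DOC = (
    "<lib>"
    "<book><title>A</title><author>X</author></book>"
    "<book><title>B</title><author>Y</author></book>"
    "</lib>"
)


def shred(xml_text):
    """Number each element in preorder and emit (id, parent, tag, text) rows."""
    rows = []

    def visit(elem, parent_id):
        node_id = len(rows) + 1  # preorder number, assigned before the children
        rows.append((node_id, parent_id, elem.tag, (elem.text or "").strip()))
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)
    return rows


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT, text TEXT)"
)
conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", shred(DOC))

# The path lib/book/title needs one self-join per step below the root.
titles = [
    row[0]
    for row in conn.execute(
        """
        SELECT t.text
        FROM node AS l
        JOIN node AS b ON b.parent = l.id AND b.tag = 'book'
        JOIN node AS t ON t.parent = b.id AND t.tag = 'title'
        WHERE l.tag = 'lib' AND l.parent IS NULL
        ORDER BY t.id
        """
    )
]
```

In this layout, each additional level of nesting in the queried path adds another self-join on the `node` table, which is the kind of overhead the paragraph above refers to.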
Attempts have also been made to convert queries written over hierarchical data into queries over relational data. These attempts have suffered from shortfalls similar to those described above. They are described in Y. Diao et al., “Towards an Internet-Scale XML Dissemination Service,” VLDB, 2004, and C. Koch et al., “FluXQuery: An Optimizing XQuery Processor for Streaming XML Data,” VLDB, 2005. They include using a pure XML engine to handle processing, and translating hierarchical queries into relational queries.
Various techniques have been proposed for specifying “continuous queries” over streams. In these environments, data is not fixed, but arrives one message at a time in one or more continuous streams. Queries define views over the entire history of one or more streams. Rather than receiving a single result set, subscribers to continuous queries receive a continuously updated result set reflecting how the view changed as a result of the changes to the streams on which it depends. In a mixed environment, any of the following combinations are possible: schemas defined in a relational (SQL) or hierarchical style (XML); messages delivered in a relational (flat) or hierarchical format (XML); and queries written in a relational language (SQL) or hierarchical language (XQuery/XSLT).
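The idea of a continuously updated result set can be sketched in a few lines of Python. The `ContinuousCount` class, the `symbol` message field, and the callback interface are hypothetical names introduced for illustration; the sketch assumes a single stream and a simple per-key count as the view.

```python
from collections import defaultdict


class ContinuousCount:
    """Maintains a per-key count over a stream and pushes each change to subscribers."""

    def __init__(self):
        self.view = defaultdict(int)  # the view over the stream's full history
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def on_message(self, message):
        # Update the view incrementally instead of recomputing it from scratch.
        key = message["symbol"]
        self.view[key] += 1
        # Subscribers receive only the part of the view that changed.
        for callback in self.subscribers:
            callback(key, self.view[key])


updates = []
query = ContinuousCount()
query.subscribe(lambda key, count: updates.append((key, count)))

# Messages arrive one at a time, as they would from a continuous stream.
for msg in [{"symbol": "A"}, {"symbol": "B"}, {"symbol": "A"}]:
    query.on_message(msg)
```

Here the subscriber never asks for a result set. Each arriving message triggers a notification describing how the view changed.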