Some database systems support multiple database representations for an abstract datatype. An abstract datatype is a datatype that is recognized and defined by a database system and that has one or more physical representations within the database system, implemented using one or more other datatypes recognized by the database system. As used herein, the term “database representation” refers to the combination of any base structures that are used to store data for the abstract datatype and any indexes on the base structures. For purposes of illustration, an XML (eXtensible Markup Language) datatype will be used as an example of an abstract datatype for which a database system supports multiple database representations.
Examples of base structures that a database might support for XML include, but are not limited to, object relational storage (O-R), LOB (Large Object), CLOB (Character LOB), BLOB (Binary LOB), CSX, and binary. In addition, a database might support a hybrid base structure in which a structured part of the XML is stored object relationally and an unstructured part of the XML is stored in CLOB or CSX form. Continuing with the XML example, the database system might support different indexing options for XML. Examples of indexing options include, but are not limited to, a B+ tree, a bitmap index, an XML Index, and an XML Table Index. An XML Index and an XML Table Index are discussed in the XML Index Application and the XML Table Index Application.
Thus, a database system may offer several database representations for an abstract datatype, and some database representations are better than others for certain use cases. Moreover, some indexes are only appropriate for certain base structures. However, determining a suitable database representation can be difficult, as many factors affect the decision. For example, when XML is stored object relationally, the XML may be decomposed into a set of object relational tables such that a B+ tree index or bitmap index can be created to speed up user queries. By contrast, when XML is stored in CLOB or CSX form, both an XML Index and an XML Table Index can be created to speed up queries.
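The pairing of base structures with appropriate indexes can be illustrated with a minimal Python sketch. It is not taken from any database system: the class name `DatabaseRepresentation`, the table `COMPATIBLE_INDEXES`, and the string labels are hypothetical, and the table encodes only the two examples given above, so it is neither exhaustive nor authoritative.

```python
from dataclasses import dataclass

# Hypothetical compatibility table. It encodes only the examples in the
# text above (O-R with B+ tree or bitmap; CLOB or CSX with XML Index or
# XML Table Index) and is an assumption, not a vendor rule set.
COMPATIBLE_INDEXES = {
    "object_relational": frozenset({"b_plus_tree", "bitmap"}),
    "clob": frozenset({"xml_index", "xml_table_index"}),
    "csx": frozenset({"xml_index", "xml_table_index"}),
}


@dataclass(frozen=True)
class DatabaseRepresentation:
    """A base structure together with the indexes built on it."""

    base_structure: str
    indexes: frozenset = frozenset()

    def __post_init__(self):
        # Reject base structures this sketch does not know about.
        allowed = COMPATIBLE_INDEXES.get(self.base_structure)
        if allowed is None:
            raise ValueError(f"unknown base structure: {self.base_structure}")
        # Reject indexes that the table does not list for this base structure.
        unsupported = set(self.indexes) - allowed
        if unsupported:
            raise ValueError(
                f"{sorted(unsupported)} not listed for {self.base_structure}"
            )


# Usage: a CLOB base structure with an XML Index is accepted, while
# pairing CLOB with a bitmap index raises ValueError in this sketch.
clob_rep = DatabaseRepresentation("clob", frozenset({"xml_index"}))
```

Even in this toy form, every additional base structure or index type enlarges the set of valid combinations, which hints at why choosing among them by hand is difficult.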
Furthermore, as XML covers a wide spectrum of data, from structured to semi-structured to unstructured data, the selection of an appropriate database representation is tedious and error-prone. For example, a user might analyze XQuery and/or SQL/XML statements manually, and then decide on a base structure and index type. The user then performance-tunes the XQuery and/or SQL/XML statements.
This trial-and-error approach is not a scalable solution. Furthermore, when an inappropriate base structure or index choice is made, query performance suffers drastically.
Therefore, improved techniques are desired for determining a database representation for an abstract datatype that can be stored in more than one database representation.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.