Many types of data can be stored in a database system. Well-recognized datatypes that are natively supported by almost every database system include strings, numbers, characters, and dates. Some database systems also allow users to define non-native datatypes to be stored and managed in the database. For example, Oracle Corporation of Redwood Shores, Calif. provides a number of database management products that facilitate the definition and use of non-native datatypes and their associated data access functions.
Database systems typically implement very strong type-checking within the infrastructure used to store and manage information in the database. As just one example, a data container in a relational database, such as a table column, is created and defined to be associated with a specific datatype. Once a column is so defined, only data of the specified datatype can permissibly be stored in that column. It is not normally possible to store data of an undefined datatype within the column, nor is it possible to store multiple heterogeneous datatypes within a defined column. Conventional database systems also implement strong type-checking for functions and procedures: it is normally not permitted to pass function parameters that are potentially heterogeneous, that is, parameters whose datatype may differ from one call to the next.
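The column-level type-checking described above can be illustrated with a minimal sketch. It is not drawn from any particular database product. The class name `TypedColumn`, its `insert` method, and the `sale_date` column are hypothetical names introduced only for illustration.

```python
from datetime import date


class TypedColumn:
    """Hypothetical column bound to exactly one datatype when it is defined."""

    def __init__(self, name, datatype):
        self.name = name
        self.datatype = datatype
        self.values = []

    def insert(self, value):
        # Strong type-checking: reject any value that is not of the declared type.
        if type(value) is not self.datatype:
            raise TypeError(
                f"column {self.name!r} expects {self.datatype.__name__}, "
                f"got {type(value).__name__}"
            )
        self.values.append(value)


sale_date = TypedColumn("sale_date", date)
sale_date.insert(date(2001, 5, 14))  # accepted: matches the declared datatype

try:
    sale_date.insert("2001-05-14")   # rejected: a string is not a date
    rejected = False
except TypeError:
    rejected = True
```

In this sketch the column never holds a value of a second datatype, which mirrors the restriction on heterogeneous data within a defined column.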
Strong type-checking in a database system is often very desirable because many database operations and functions are configured to work only with specific datatypes. If such operations or functions are performed against the wrong datatype, erroneous results or fatal computation errors may occur in the database system.
However, strong type-checking may also be a source of inefficiency in a database system. Under certain circumstances, the exact datatypes to be used in a database operation are not known in advance. This may occur, for example, if the source or the contents of the data operated upon become known only at execution or run time. If the datatype to be operated upon is unknown, it may be impossible in conventional database systems to predefine functions or operations that will properly access the data. In addition, the datatype(s) of result sets produced by operating upon the unknown datatypes may likewise be unknown in advance, rendering it impossible to predefine storage structures in the database system to store the anticipated result sets.
Consider an existing database application that was built to store and manage information relating to the sale of a first product family. The database application defines a set of storage structures and functions that are specific to managing information about sales for the first product family. Now suppose the user of the database application later wishes to begin selling a second product family, for which similar information must be stored, but the exact datatypes used in the database to manage information for each product family differ. Because database systems impose strict type-checking, it is most likely not possible for the existing database application to manage information for both product families. Under this circumstance, a significant amount of effort and resources may be needed to retrofit the database application to work with the additional datatypes associated with the second product family. This exemplifies the type of inefficiency that may result from strong type-checking when attempting to evolve or maintain an existing database application.
The present invention provides a method and mechanism to store and manage self-descriptive heterogeneous data in a database system. In one embodiment, a generic datatype is defined which encapsulates type descriptions along with the actual data itself. Another generic datatype is defined to encapsulate structural information for new datatypes. By using these generic datatypes to encapsulate heterogeneous data, the database system can be made aware of the exact structure and format of the heterogeneous data. This permits users and the database system to store, manage, and access the heterogeneous data in the same manner as known datatypes in the system. Other objects, advantages, and features of the invention are described in the Drawings, Claims, and Detailed Description.
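The idea of pairing a type description with the data it describes can be sketched as follows. This is only an illustrative model under stated assumptions and is not the implementation of any embodiment. The names `TypeDescription` and `SelfDescribingValue`, the attribute layout, and the sample product records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class TypeDescription:
    """Hypothetical generic type holding structural information for a datatype."""
    name: str
    attributes: Tuple[Tuple[str, type], ...]  # (attribute name, attribute type)


@dataclass(frozen=True)
class SelfDescribingValue:
    """Hypothetical generic type holding a type description plus the data itself."""
    description: TypeDescription
    data: Dict[str, Any]

    def __post_init__(self):
        # Validate the data against its own description, not a fixed column type.
        expected = dict(self.description.attributes)
        if set(self.data) != set(expected):
            raise TypeError(f"attributes do not match {self.description.name}")
        for attr, kind in expected.items():
            if not isinstance(self.data[attr], kind):
                raise TypeError(f"{attr} must be {kind.__name__}")


# Two product families whose records have different structures (made-up examples).
book = TypeDescription("book_sale", (("sku", str), ("price", float)))
shirt = TypeDescription("shirt_sale", (("size", str), ("quantity", int)))

# One container holds values of different datatypes; each value carries its own type.
sales = [
    SelfDescribingValue(book, {"sku": "A-100", "price": 12.5}),
    SelfDescribingValue(shirt, {"size": "M", "quantity": 3}),
]

names = [value.description.name for value in sales]
```

Because each value carries its own description, a consumer in this sketch can inspect `value.description` at run time to learn the structure of the data before accessing it, rather than relying on a datatype fixed in advance.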