A release of a new standard for the Structured Query Language (SQL), SQL-99, has increased the need to add object relational features to existing database system (DBS) commercial products. One of the SQL-99 object relational features is the User Defined Type (UDT). The UDT feature provides users of the DBS product the ability to define their own data types (the UDTs) install them into the DBS, and to use them as part of their DBS schema design (e.g., as a column in a database table).
To make full use of UDTs, it would be helpful if the DBS provided performance at a level similar to that provided by DBS vendor developed system defined types, such as the INTEGER type. For example, it would be helpful in many database systems if UDTs were supported by such DBS-native operations as building a DBS index on a UDT, gathering statistics regarding a UDT, and redistributing base table rows on the basis of the UDT. Providing such services becomes more of a challenge when it is realized that UDTs may be Large Object UDTs (LOB-UDTs), such as Video UDTs, Audio UDTs, and Satellite Raster UDTs. For example, building an index based on an LOB-UDT, without dedicating large amounts of storage to copies of the LOB-UDTs is a significant challenge.
In general, in one aspect, the invention features a method for providing a database system native operation for a data element which has a user defined data type (UDT). The method includes assigning a data type, which is supported by the database system native operation, to an alias. The method further includes processing the data element to produce the alias, linking the alias to the data element, and performing the database system native operation using the alias.
Implementations of the invention may include one or more of the following. The database system native operation may be one of building an index based on the data element, gathering statistics regarding the data element, or redistributing rows of a base table based on the data element. Assigning a data type may include assigning an integer data type to the alias. The data element may include fields and processing the data element to produce the alias may include computing a checksum for a field of the data element. Linking the alias to the data element may include building a record which includes the alias and a pointer to the data element. Performing the database system native operation using the alias may include building an index using the alias, gathering statistics using the alias, or hashing using the alias.
In general, in another aspect, the invention features a computer program, stored on a tangible storage medium, for providing a database system native operation for a data element which has a user defined data type (UDT). The program includes executable instructions that cause the computer to assign a data type, which is supported by the database system native operation, to an alias. The program further includes executable instructions that cause the computer to process the data element to produce the alias, link the alias to the data element, and perform the database system native operation using the alias.
Implementations of the invention may include one or more of the following. The data element may include fields and, in processing the data element to produce the alias, the computer may compute a checksum for a field of the data element. In linking the alias to the data element, the computer may build a record which includes the alias and a pointer to the data element. In performing the database system native operation using the alias, the computer may build an index using the alias, gather statistics using the alias, or hash using the alias.
In general, in another aspect, the invention features a database system for accessing a database with a data element which has a user defined data type (UDT). The database system includes a massively parallel processing system, which includes one or more nodes. The database system includes a plurality of CPUs. Each of the one or more nodes provides access to one or more CPUs. The database system also includes a plurality of virtual processes. Each of the one or more CPUs provides access to one or more virtual processes. Each virtual process is configured to manage data stored in one of a plurality of data-storage facilities.
The database system includes a parsing engine. The parsing engine includes a session control, which controls access to the plurality of virtual processes. The parsing engine includes a parcer, which interprets and optimizes a command for access to a database to produce a series of steps to be performed. The parsing engine includes a dispatcher, which controls the sequence of the series of steps and determines which of the plurality of virtual processes will perform each of the series of steps. The parcer includes an optimizer which optimizes a plan for executing the series of steps. The optimizer includes a process for getting a type for an alias for the UDT data element; and a process for getting the alias for the UDT data element.
Implementations of the invention may include one or more of the following. The process for getting the type may return an integer type. The UDT data element may include a Binary Large Object (BLOB) and the process for getting the alias may include computing a checksum for a value of the BLOB. The UDT data element may include a ROW ID and a value and the optimizer may further include an indexer for indexing a record including the alias and the ROW ID into an indexing construct using the alias as the indexing key. The optimizer may further include a process for selecting a record matching a search UDT data element from an indexing construct including using the process for getting a type for the search alias for the search UDT data element, using the process for getting the search alias for the search UDT data element, and
retrieving one or more records from the indexing construct which are indexed by the search alias. The one or more retrieved records may include pointers to respective one or more UDTs. The process for selecting the record matching the search UDT data element may further include selecting the one UDT pointed to by the one or more retrieved records that best matches the search UDT.
The optimizer may include a histogram process for building a histogram using the alias, or a frequency search process using the alias. The UDT data element may include a ROW ID and a value. The optimizer may include a hashing process for hashing a record including the alias and the ROW ID using the alias as the hashing key.
One or more of the virtual processes may include a process for getting a type for an alias for the UDT data element, and a process for getting the alias for the UDT data element.
Other features and advantages will become apparent from the description and claims that follow.