1. Field of the Invention
The present invention relates to systems and methods for performing queries on data stored in a database, and in particular to a method and system for providing access to an array-based data object to a client.
2. Description of the Related Art
Large-scale integrated database management systems provide an efficient, consistent, and secure means for storing and retrieving vast amounts of data. This ability to manage massive amounts of information has become a virtual necessity in business today.
At the same time, wider varieties of data are available for storage and retrieval. In particular, multimedia applications are being introduced and deployed for a wide range of business and entertainment purposes, including multimedia storage, retrieval, and content analysis. Properly managed, multimedia information technology can be used to solve a wide variety of business problems. In some cases, the data objects to be stored, retrieved, and manipulated are quite large. Such data objects include, for example, binary large objects (BLOBs), character large objects (CLOBs), video, audio, images, and text.
By virtue of their sheer size, these large data objects can be difficult to manage. Object-relational database systems, for example, store information as a collection of tables. Each table is a set of tuples, and each tuple is an ordered list of attributes. Each of these attributes has a type. An object-relational database system allows these types to include complex types such as text, video, images, and spatial data. To perform rapid searches, it is useful to build an index for these complex data types. For example, to answer a query that retrieves all documents that contain the words “fool” and “gold”, it would be very useful to have a text index built on the text documents. Such an index would allow the query to be answered efficiently, without requiring that each text document of the database be retrieved.
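The benefit of a text index for the “fool” and “gold” query can be illustrated with a minimal sketch. The inverted-index layout, the function names, and the sample documents below are assumptions made for illustration only; they are not taken from the described system.

```python
from collections import defaultdict

def build_text_index(documents):
    """Map each lowercase word to the set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            # Strip surrounding punctuation so "gold." is indexed as "gold".
            index[word.strip(".,;:!?\"'")].add(doc_id)
    return index

def docs_with_all_words(index, words):
    """Return the ids of documents that contain every word in `words`."""
    postings = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*postings) if postings else set()

# Hypothetical sample data.
documents = {
    1: "A fool and his gold are soon parted.",
    2: "All that glitters is not gold.",
    3: "Only a fool ignores the index.",
}

text_index = build_text_index(documents)
# The query is answered from the index alone; no document text is re-read.
matches = docs_with_all_words(text_index, ["fool", "gold"])
```

Once the index is built, the lookup touches only the two word entries rather than every stored document.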
A traditional index in a database system accepts a value as an argument and returns a list of tuple identifiers. However, tuple identifiers are insufficient to allow use of a database index with complex data types. From the foregoing, it is apparent that there is a need for a system that will allow for efficient indexing and retrieval of complex data types from an object-relational database. The present invention satisfies that need.
To address the requirements described above, the present invention discloses a method, apparatus, and an article of manufacture for providing access to abstract data types (ADTs) using an index providing a tuple.
The method comprises the steps of accepting a database query; generating an index predicate from the database query; and determining a tuple from an index using the index predicate. The tuple is associated with an abstract or complex data type responsive to the database query. A data stream is initialized with the index predicate, and the tuple is returned in the data stream. The apparatus comprises means for performing the above method steps, and the article of manufacture comprises a medium tangibly embodying computer instructions for performing these method steps.
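The method steps above might be sketched as follows. This is only an illustration under stated assumptions: the "word AND word" query syntax, the in-memory index contents, the (document id, total occurrences) tuple layout, and all function names are hypothetical.

```python
from typing import Dict, Iterator, Tuple

# Hypothetical index contents: word -> {document id: occurrence count}.
TEXT_INDEX: Dict[str, Dict[int, int]] = {
    "fool": {1: 2, 3: 1},
    "gold": {1: 1, 2: 4},
}

def generate_index_predicate(query: str) -> Tuple[str, ...]:
    """Reduce an accepted query to the words the index must match."""
    return tuple(w.strip().lower() for w in query.split(" AND ") if w.strip())

def index_data_stream(predicate: Tuple[str, ...]) -> Iterator[Tuple[int, int]]:
    """Data stream initialized with the predicate; yields one tuple per match."""
    postings = [TEXT_INDEX.get(word, {}) for word in predicate]
    if not postings:
        return
    # Documents that appear in the postings of every predicate word.
    common = set(postings[0]).intersection(*postings[1:])
    for doc_id in sorted(common):
        # Each tuple carries values from the index, not just an identifier.
        yield (doc_id, sum(p[doc_id] for p in postings))

def run_query(query: str) -> Iterator[Tuple[int, int]]:
    """Accept a query, derive the index predicate, and open the stream."""
    predicate = generate_index_predicate(query)
    return index_data_stream(predicate)

result = list(run_query("fool AND gold"))
```

With the sample index, the stream returns a single tuple for document 1, the only document listed under both words.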
With complex data types such as text, the index of the present invention accepts a value as an argument and returns a set of tuples, not just the tuple identifiers that are used with simple data types. Each such tuple includes a list of values that must be conveyed in order to provide the information from the index needed to respond to the query. Presenting sets of tuples, rather than merely tuple IDs, poses a difficult problem. The present invention solves this problem with the use of a virtual index data stream. The database engine initializes the stream with the index predicate (e.g., find documents with the words “fool” and “gold”). The stream then returns a tuple for each of the complex data types having data responsive to the query. The individual values in the tuples can be viewed as ordinary tuples that are stored on disk as tables or relations. In one embodiment, the database engine manipulates and processes the sets of tuples obtained from the virtual index stream, thus allowing the index to appear like a relation to the remainder of the database engine. This feature is useful for extensibility, since it allows new index modules to be plugged in, while appearing like a relation to the rest of the database engine.
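One way to picture an index that appears like a relation is to give the virtual index stream the same scan interface as a stored table, so that a generic operator can consume either. The class names, the `scan` method, the join operator, and the sample data below are all assumptions for illustration and are not the interfaces of the described embodiment.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Tuple

class StoredTable:
    """An ordinary relation whose tuples are held in storage."""
    def __init__(self, rows: Iterable[Row]):
        self._rows = list(rows)

    def scan(self) -> Iterator[Row]:
        return iter(self._rows)

class VirtualIndexStream:
    """Wraps a pluggable index module so it can be scanned like a relation."""
    def __init__(self, index_module: Callable[[Tuple[str, ...]], Iterator[Row]],
                 predicate: Tuple[str, ...]):
        self._index_module = index_module
        self._predicate = predicate  # the stream is initialized with the predicate

    def scan(self) -> Iterator[Row]:
        return self._index_module(self._predicate)

# Hypothetical index module: word -> {document id: occurrence count}.
_POSTINGS: Dict[str, Dict[int, int]] = {"fool": {1: 2, 3: 1}, "gold": {1: 1, 2: 4}}

def keyword_index(predicate: Tuple[str, ...]) -> Iterator[Row]:
    postings = [_POSTINGS.get(word, {}) for word in predicate]
    if not postings:
        return
    common = set(postings[0]).intersection(*postings[1:])
    for doc_id in sorted(common):
        yield (doc_id, sum(p[doc_id] for p in postings))

def join_on_first_column(left, right) -> List[Row]:
    """Generic engine operator: it only requires that both inputs offer scan()."""
    lookup = {row[0]: row for row in right.scan()}
    return [l + lookup[l[0]][1:] for l in left.scan() if l[0] in lookup]

documents = StoredTable([(1, "Proverbs"), (2, "Merchant"), (3, "Notes")])
stream = VirtualIndexStream(keyword_index, ("fool", "gold"))
joined = join_on_first_column(stream, documents)
```

Because the join operator never inspects the type of its inputs, a different index module could be substituted for `keyword_index` without changing the operator.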
Referring now to the drawings in which like reference numbers represent corresponding parts throughout:
FIG. 1 is a block diagram showing an exemplary environment for practicing the present invention;
FIG. 2 is a diagram showing one embodiment of the user front end of the exemplary hardware environment depicted in FIG. 1;
FIG. 3 is a diagram illustrating a relationship between a database table having non-ADT data and an index for the database table;
FIG. 4 is a diagram illustrating a relationship between a database table including complex data types such as objects and an index for the database table;
FIG. 5 is a flow chart illustrating exemplary process steps used to practice one embodiment of the present invention;
FIG. 6 is a diagram showing a data stream created by the process steps shown in FIG. 5; and
FIG. 7 illustrates an exemplary computer system that could be used to implement elements of the present invention.