This invention relates to databases and, more particularly, to using a correlation operator Θ to correlate sub-queries.
Database methodologies, developing for decades, now include relational, object-oriented, and heterogeneous types. Heterogeneous databases have arbitrarily structured records.
This application is commonly assigned with U.S. patent application Ser. No. 6,167,393 to Davis, III et al.
Ubiquitous personal computing, collaborative group computing, and highly integrated, distributed environments create new demands on databases. Databases store information that is most useful when retrieved completely, reliably, and quickly.
Records (or combinations of records) in a database generally represent an object in the real world (a product, a customer, an employee, a business division, etc.). As such, a record typically consists of a collection of fields that represent attributes of the object. This collection of fields is not necessarily “complete,” but has been deemed sufficiently useful to describe the object and distinguish it from any other object represented in the database. Ultimately, the contents of these fields are the information that distinguishes one object from another.
By way of example, databases traditionally use a schema to define record “types” or object classes. In such databases, a record type (or object class) is an abstraction or generalization about the collection of records in the database that represents the same “kind” of real world object. As such, a record “type” may be thought of as “meta-data,” or “data about data.” A record type typically defines certain relevant attributes and/or behaviors that are to be found in instances of that record type. For example, the record type “person” may specify that a “person” record contains attributes of height, weight, hair color, phone number, etc. The set of “person” records in the database is homogeneous in that each record contains exactly the same set of attributes (those that are defined in the “person” record type).
As an alternative to relational database structures, arbitrarily structured records can be used. In an arbitrarily structured record, repeating fields, missing fields, null-valued fields, and sub-record entities may exist. A database containing arbitrarily structured records presents numerous difficulties for a query engine designed to locate records within the database. An arbitrarily structured record might also include more than a single field having the same field identification or field name.
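The properties of an arbitrarily structured record described above can be illustrated with a minimal sketch. The `Field` class, the `find_all` helper, and the sample address-book data are hypothetical names chosen for illustration only; they are not part of the disclosed system.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Field:
    """One field of an arbitrarily structured record (hypothetical model)."""
    name: str
    value: Any = None  # None models a null-valued field
    children: List["Field"] = field(default_factory=list)  # sub-record entity


def find_all(fields: List[Field], name: str) -> List[Field]:
    """Return every field with the given name: zero, one, or many may exist."""
    return [f for f in fields if f.name == name]


# No schema dictates which fields appear or how often they appear.
record = [
    Field("name", "Acme Corp"),
    Field("phone", "555-0100"),
    Field("phone", "555-0101"),  # repeating field with the same field name
    Field("fax", None),          # null-valued field
    Field("address", children=[Field("city", "Provo"), Field("state", "UT")]),
]

phones = [f.value for f in find_all(record, "phone")]
emails = find_all(record, "email")  # missing field: empty result, not an error
```

Because a field name may match any number of fields, a query engine cannot assume that a lookup yields a single value, which is one source of the difficulties noted above.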
Arbitrarily structured records are internally self-describing, since no overriding schema need exist, in contrast to relational databases. Thus, data may not always be cleanly divided into homogeneous tables, each having a single schema (record template), as required by the relational database model.
For example, a business organization may have some substantial structuring. Nevertheless, an address book might regard every company entity (e.g., company, division, department, unit, individual, etc.) as a contact, customer, client, or the like. Such a universal address book may need to accommodate all entities possessing an address and a phone number regardless of other attributes.
In searching a database, whether the database is relational, object-oriented, arbitrarily structured, or of some other type, data may need to be correlated. It may be necessary to identify a record in the database for which two properties exist, or two records that share a common property. For example, in searching a database of vehicles and their owners, a user may need to identify all owners who own a particular vehicle combination of make, model, color, year, etc. Or a user may need to identify all owners who own a particular combination of vehicles. A database can be manually correlated, but the process is cumbersome, requiring a high degree of skill from the user. Any user lacking the needed skill would be unable to complete the desired search.
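The difference between an uncorrelated and a correlated search in the vehicle example can be sketched as follows. The data layout, the owner names, and the two matching functions are assumptions made for illustration; they show only why correlation matters, not how any particular query engine performs it.

```python
# Hypothetical data: each owner record holds repeated vehicle sub-records.
owners = {
    "Alice": [{"make": "Ford", "color": "blue"}, {"make": "Honda", "color": "red"}],
    "Bob": [{"make": "Ford", "color": "red"}],
}


def uncorrelated_match(vehicles, make, color):
    # Each condition is tested independently, so two different vehicles
    # may satisfy the two conditions.
    return (any(v["make"] == make for v in vehicles)
            and any(v["color"] == color for v in vehicles))


def correlated_match(vehicles, make, color):
    # Both conditions must hold for the same vehicle sub-record.
    return any(v["make"] == make and v["color"] == color for v in vehicles)


uncorrelated = [o for o, vs in owners.items() if uncorrelated_match(vs, "Ford", "red")]
correlated = [o for o, vs in owners.items() if correlated_match(vs, "Ford", "red")]
```

In this sketch the uncorrelated search also returns Alice, who owns a Ford and a red vehicle but no red Ford, while the correlated search returns only Bob.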
A correlation operator Θ requires that the results of the sub-queries it takes as its operand be correlated. The correlation operator Θ has an implied existential quantifier property (i.e., a “for some” property) and is satisfied if any record matches its sub-query. If no record is found that matches the correlation operator Θ's sub-query, then the correlation operator Θ query fails. The implicit existential quantifier property of the correlation operator Θ can be converted into a universal quantifier property (i.e., a “for all” property) by transformation of the query.
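The “for some” behavior, and one possible way to obtain a “for all” behavior from it, can be sketched as follows. The text does not specify which query transformation is used; this sketch assumes the standard logical identity that “for all x, P(x)” is equivalent to “not (for some x, not P(x))”. The function names and sample records are hypothetical.

```python
def exists(records, predicate):
    # "For some": satisfied if any record matches; fails if none does.
    return any(predicate(r) for r in records)


def for_all(records, predicate):
    # Assumed transformation: "for all" expressed as "not exists (not predicate)".
    return not exists(records, lambda r: not predicate(r))


vehicles = [{"make": "Ford", "year": 1998}, {"make": "Ford", "year": 2001}]


def is_ford(v):
    return v["make"] == "Ford"


def is_recent(v):
    return v["year"] >= 2000


some_recent = exists(vehicles, is_recent)   # one vehicle matches
all_ford = for_all(vehicles, is_ford)       # no vehicle fails the predicate
all_recent = for_all(vehicles, is_recent)   # the 1998 vehicle fails the predicate
```

Under this assumed transformation, a “for all” query over an empty set of records succeeds vacuously, whereas the existential form fails when no record matches.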
The invention further includes an algebra for resolving the correlation operator Θ into flags attached to appropriate operators in the sub-query. After the correlation operator Θ is eliminated from the query, the query can be performed.
The correlation operator Θ can be inserted explicitly by the user if she knows what data she desires to be correlated. Alternatively, the correlation operator can be inserted by the schema automatically to improve the expected results of the query.