1. Field of the Invention
The invention generally relates to computer database systems. More particularly, the invention relates to a database abstraction model constructed over an underlying physical database, and to a database query application used to generate queries of the underlying physical database from a query of the database abstraction model.
2. Description of the Related Art
Databases are well-known systems for storing, searching, and retrieving information in a computer. The most prevalent type of database used today is the relational database, which stores data using a set of tables that may be reorganized and accessed in a number of different ways. Users access information in relational databases using a relational database management system (DBMS).
Each table in a relational database includes a set of one or more columns. Each column typically specifies a name and a data type (e.g., integer, float, string, etc.), and may be used to store a common element of data. For example, in a table storing data about patients treated at a hospital, each patient might be referenced using a patient identification number stored in a “patient ID” column. Reading across the rows of such a table would provide data about a particular patient. Tables that share at least one attribute in common are said to be “related.” Further, tables without a common attribute may be related through other tables that do share common attributes. A path between two tables is often referred to as a “join,” and columns from tables related through a join may be combined to form a new table returned as a set of query results.
Queries of a relational database may specify columns to retrieve data from, how to join the columns together, and conditions (predicates) that must be satisfied for a particular data item to be included in a query result table. Current relational databases require that queries be composed in complex query languages. Today, the most widely used query language is Structured Query Language (SQL). However, other query languages are also used. An SQL query is composed from one or more clauses set off by a keyword. Well-known SQL keywords include SELECT, WHERE, FROM, HAVING, ORDER BY, and GROUP BY. Composing a proper SQL query requires that a user understand both the structure and content of the relational database as well as the complex syntax of the SQL query language (or other query language). The complexity of constructing an SQL statement, however, generally makes it difficult for average users to compose queries of a relational database.
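By way of illustration only, the following minimal sketch shows a query that names the columns to retrieve, a join between two related tables, and a predicate. The tables `patients` and `diagnoses`, their columns, and the sample rows are hypothetical and are not taken from any particular database installation; Python's built-in `sqlite3` module is used merely as a convenient way to run the SQL.

```python
import sqlite3

# Hypothetical schema and data, assumed purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE diagnoses (patient_id INTEGER, diagnosis TEXT);
    INSERT INTO patients VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO diagnoses VALUES (1, 'heart attack'), (2, 'asthma');
""")

# SELECT names the columns, JOIN ... ON relates the two tables through the
# shared patient_id attribute, and WHERE supplies the predicate.
rows = conn.execute(
    """
    SELECT p.name, d.diagnosis
    FROM patients AS p
    JOIN diagnoses AS d ON d.patient_id = p.patient_id
    WHERE d.diagnosis = ?
    ORDER BY p.name
    """,
    ("heart attack",),
).fetchall()

print(rows)
```

Even in this small case, the user must know which tables exist, which column relates them, and how each clause is written.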
Because of this complexity, users often turn to database query applications to assist them in composing queries of a database. One technique for managing the complexity of a relational database, and the SQL query language, is to use database abstraction techniques. Commonly assigned U.S. patent application Ser. No. 10/083,075 (the '075 application) entitled “Application Portability and Extensibility through Database Schema and Query Abstraction,” discloses techniques for constructing a database abstraction model over an underlying physical database.
The '075 application discloses embodiments of a database abstraction model constructed using logical fields that map to data stored in the underlying physical database. Each logical field defines an access method that specifies a location (i.e., a table and column) in the underlying database from which to retrieve data. Users compose an abstract query by selecting logical fields and specifying conditions. The operators available for composing conditions in an abstract query generally include the same operators available in SQL (e.g., comparison operators such as =, >, <, >=, and <=, and logical operators such as AND, OR, and NOT). Data is retrieved from the physical database by generating a resolved query (e.g., an SQL statement) from the abstract query. Because the database abstraction model is tied to neither the syntax nor the semantics of the physical database, additional capabilities may be provided by the database abstraction model without having to modify the underlying database. Thus, the database abstraction model provides a platform for additional enhancements that allow users to compose meaningful queries easily, without having to disturb existing database installations.
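The relationship between logical fields, an abstract query, and a resolved query may be sketched as follows. This is a simplified illustration under stated assumptions and is not the implementation disclosed in the '075 application: the `LogicalField` class, the `FIELDS` table, the `resolve_query` function, and the `patients` table with its columns are all hypothetical names, and the sketch handles only comparison conditions joined by AND over a single table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalField:
    """Hypothetical logical field: a name plus a simple access method."""
    name: str
    table: str   # physical table holding the data
    column: str  # physical column holding the data


# Assumed logical fields mapping user-facing names to physical locations.
FIELDS = {
    "Patient ID": LogicalField("Patient ID", "patients", "patient_id"),
    "Age": LogicalField("Age", "patients", "age"),
}

ALLOWED_OPERATORS = {"=", ">", "<", ">=", "<="}


def resolve_query(selected, conditions):
    """Turn an abstract query into (sql, params) for a single table.

    selected   -- list of logical field names to return
    conditions -- list of (logical field name, operator, value) tuples,
                  combined with AND in this sketch
    """
    columns = [FIELDS[name] for name in selected]
    tables = {f.table for f in columns}
    tables |= {FIELDS[name].table for name, _, _ in conditions}
    if len(tables) != 1:
        raise ValueError("this sketch handles only single-table queries")

    predicates, params = [], []
    for name, operator, value in conditions:
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator}")
        predicates.append(f"{FIELDS[name].column} {operator} ?")
        params.append(value)

    sql = f"SELECT {', '.join(f.column for f in columns)} FROM {tables.pop()}"
    if predicates:
        sql += " WHERE " + " AND ".join(predicates)
    return sql, params


sql, params = resolve_query(["Patient ID"], [("Age", ">", 50)])
print(sql, params)
```

The user refers only to "Patient ID" and "Age"; the table and column names appear only in the field definitions, so the physical schema could change without changing the abstract query.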
For example, researchers often wish to select patients that have a specific family history. This is often the case in preventative studies. For example, a researcher may wish to identify individuals to participate in a test for a drug that seeks to prevent the first instance of a heart attack (or other disease or condition) based on family history. One reasonable way to test this effectively, while also using a small sample of individuals, is to use individuals that have a very high likelihood of experiencing a heart attack. Although medical institutions keep extensive data about relationships between patients, it may be difficult to compose a query that identifies patients with a specific family history. That is, composing a query that retrieves data about a first patient based on conditions applied only to related patients requires a sophisticated SQL query. More generally, it is difficult to compose an SQL query to identify instances of an entity (e.g., a patient) based on conditions evaluated against related entities (e.g., a patient's parents). At a minimum, doing so requires that a user (i) query to identify instances of the entity; (ii) query to find related instances of the entity; (iii) select related instances that satisfy the desired conditions; and (iv) join these results to the original instances of the entity. Accordingly, it would be useful to enhance the database abstraction model to allow users to compose queries that specify conditions evaluated against related entities.
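To illustrate the four steps enumerated above, the following sketch shows one way such a query might be written by hand. The schema (`patients`, `relatives`, `diagnoses`), the `relation` values, and the sample rows are assumptions made solely for this example, and the SQL is run through Python's built-in `sqlite3` module.

```python
import sqlite3

# Hypothetical schema and data, assumed purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE relatives (patient_id INTEGER, relative_id INTEGER, relation TEXT);
    CREATE TABLE diagnoses (patient_id INTEGER, diagnosis TEXT);
    INSERT INTO patients VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cal'), (4, 'Dee');
    -- Ben is a parent of Ann; Dee is a parent of Cal.
    INSERT INTO relatives VALUES (1, 2, 'parent'), (3, 4, 'parent');
    -- Cal has had a heart attack himself, but his parent has not.
    INSERT INTO diagnoses VALUES (2, 'heart attack'), (4, 'asthma'), (3, 'heart attack');
""")

family_history = conn.execute(
    """
    SELECT DISTINCT p.patient_id, p.name   -- (i) instances of the entity
    FROM patients AS p
    JOIN relatives AS r                    -- (ii) related instances (parents)
      ON r.patient_id = p.patient_id AND r.relation = 'parent'
    JOIN diagnoses AS d                    -- (iii) condition applied to the parents only
      ON d.patient_id = r.relative_id AND d.diagnosis = ?
    ORDER BY p.patient_id                  -- (iv) the joins tie results back to p
    """,
    ("heart attack",),
).fetchall()

print(family_history)
```

In this sample data only Ann is returned. Cal is excluded because the condition is evaluated against his parent rather than against Cal himself, which is the distinction that makes such queries awkward to compose directly in SQL.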