1. Technical Field
This invention generally relates to computer systems, and more specifically relates to apparatus and methods for querying a database.
2. Background Art
Since the dawn of the computer age, computers have evolved and become more and more powerful. In our present day, computers have become indispensable in many fields of human endeavor including engineering design, machine and process control, information storage and retrieval, and office computing. One of the primary uses of computers is for information storage and retrieval.
Database systems have been developed that allow a computer to store a large amount of information in a way that allows a user to search for and retrieve specific information in the database. For example, an insurance company may have a database that includes all of its policy holders and their current account information, including payment history, premium amount, policy number, policy type, exclusions to coverage, etc. A database system allows the insurance company to retrieve the account information for a single policy holder among the thousands and perhaps millions of policy holders in its database.
Retrieval of information from a database is typically done using queries. A query usually specifies conditions that apply to one or more columns of the database, and may specify relatively complex logical operations on multiple columns. The database is searched for records that satisfy the query, and those records are returned as the query result.
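As a minimal sketch of such a query, the following example applies conditions to two columns of a table modeled on the insurance example above. The table name `policy_holders`, its column names, the sample rows, and the helper functions are all assumptions made for illustration; they are not drawn from any particular database system.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few hypothetical policy holders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE policy_holders ("
        "policy_number INTEGER PRIMARY KEY, "
        "holder_name TEXT, "
        "policy_type TEXT, "
        "premium_amount REAL)"
    )
    conn.executemany(
        "INSERT INTO policy_holders VALUES (?, ?, ?, ?)",
        [
            (1001, "A. Jones", "auto", 450.0),
            (1002, "B. Smith", "home", 900.0),
            (1003, "C. Lee", "auto", 1200.0),
        ],
    )
    return conn


def run_policy_query(conn, policy_type, min_premium):
    """Return records that satisfy conditions on two columns.

    The WHERE clause combines the two conditions with a logical AND.
    """
    sql = (
        "SELECT policy_number, holder_name FROM policy_holders "
        "WHERE policy_type = ? AND premium_amount >= ? "
        "ORDER BY policy_number"
    )
    return conn.execute(sql, (policy_type, min_premium)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    # Only records that satisfy both conditions are returned as the query result.
    print(run_policy_query(conn, "auto", 500.0))
```

Writing even this small query requires knowing the table name, the column names, and the syntax of the query language.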
One problem with using queries to retrieve information from a database is that doing so typically requires specialized knowledge of a query language, such as Structured Query Language (SQL), as well as detailed knowledge of the database and its relationships. There are many applications where a person needs to query a database but has detailed knowledge of neither a query language nor the database. Some efforts have been made to provide a graphical query interface that allows a person who does not know SQL to query a database. The main focus of these known graphical query interfaces is abstracting the database and providing an easy-to-use interface for building queries. One problem with these known graphical query interfaces is that a user can construct queries that are not very meaningful because they return no data, or because they return thousands or millions of records. Because the graphical query interface hides the details of the database from the user, the user has no idea whether two tables might represent disjoint sets of data. As a result, the user receives no feedback from known graphical query interfaces regarding the quality of the query until the query is completely built and then executed. If the resulting dataset is too large or too small, the user has no information regarding relationships in the database that would allow the user to modify the query to return an acceptable dataset. The result in the prior art is the generation of queries that return a dataset that is too large or too small to be useful.
Another problem with queries is that a user who builds a query may not know the relationships between columns in the database, which could result in a query that includes conflicting columns. For example, in a medical database, one could build a query to return all male patients who had a positive pregnancy test. Of course, such a query is nonsense and will return no records. While this example query would probably never be run, because a user can easily see that a male person could not be pregnant, there are many other relationships between columns in a database that are much more subtle, and may even be unknown to the user. For example, in a medical database, there may be specimen data that should only be accessed if the data is anonymous. However, one could easily use known tools to build a query that includes both patient information and specimen data, thereby violating the anonymity rule. Without a way to generate queries that provides an indication of the quality of the query before the query is executed, and to build a query using a tool that shows restrictions between columns in a graphical representation of the query, the computer industry will continue to suffer from the generation and execution of queries that do not return a useful dataset, or that violate predefined rules.
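The conflicting-column example can be illustrated with a short sketch. The table `patients`, its columns, the sample rows, and the function names below are hypothetical and are used only to show that nothing about the query itself reveals the conflict; the empty result is discovered only after the query has been executed.

```python
import sqlite3


def build_medical_db():
    """Create an in-memory database with a few hypothetical patient records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patients ("
        "patient_id INTEGER PRIMARY KEY, "
        "sex TEXT, "
        "pregnancy_test TEXT)"
    )
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?, ?)",
        [
            (1, "M", None),
            (2, "F", "positive"),
            (3, "F", "negative"),
            (4, "M", None),
        ],
    )
    return conn


def male_positive_pregnancy(conn):
    """Run a syntactically valid query whose two conditions conflict."""
    sql = (
        "SELECT patient_id FROM patients "
        "WHERE sex = 'M' AND pregnancy_test = 'positive'"
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = build_medical_db()
    # The query runs without error; the user learns it is meaningless
    # only from the result it returns.
    print(male_positive_pregnancy(conn))
```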