1. Field of the Invention
The present invention generally relates to creation of queries against databases and, more particularly, to creation of queries that are suitable to identify relevant information from underlying databases.
2. Description of the Related Art
Databases are computerized information storage and retrieval systems. A relational database management system is a computer database management system (DBMS) that uses relational techniques for storing and retrieving data. The most prevalent type of database is the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways. A distributed database is one that can be dispersed or replicated among different points in a network. An object-oriented programming database is one that is congruent with the data defined in object classes and subclasses.
Regardless of the particular architecture, a DBMS can be structured to support a variety of different types of operations for a requesting entity (e.g., an application, the operating system or an end user). Such operations can be configured to retrieve, add, modify and delete information being stored and managed by the DBMS. Standard database access methods support these operations using high-level query languages, such as the Structured Query Language (SQL). The term “query” denominates a set of commands that cause execution of operations for processing data from a stored database. For instance, SQL supports four types of query operations, i.e., SELECT, INSERT, UPDATE and DELETE. A SELECT operation retrieves data from a database, an INSERT operation adds new data to a database, an UPDATE operation modifies data in a database and a DELETE operation removes data from a database.
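By way of illustration only, the four query operations named above can be sketched with Python's standard sqlite3 module. The in-memory database, the `patients` table and its columns are hypothetical and are assumed here purely for demonstration; they are not part of any particular DBMS described herein.

```python
import sqlite3

# Hypothetical in-memory table; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT, age INTEGER)")

# INSERT adds new data to the database.
conn.execute("INSERT INTO patients (name, age) VALUES (?, ?)", ("Alice", 34))
conn.execute("INSERT INTO patients (name, age) VALUES (?, ?)", ("Bob", 51))

# SELECT retrieves data from the database.
after_insert = conn.execute(
    "SELECT name, age FROM patients ORDER BY name"
).fetchall()

# UPDATE modifies data already in the database.
conn.execute("UPDATE patients SET age = ? WHERE name = ?", (35, "Alice"))
after_update = conn.execute(
    "SELECT age FROM patients WHERE name = ?", ("Alice",)
).fetchone()

# DELETE removes data from the database.
conn.execute("DELETE FROM patients WHERE name = ?", ("Bob",))
after_delete = conn.execute("SELECT name FROM patients").fetchall()

conn.close()
```

In this sketch, each statement is passed to the database as a parameterized SQL string, so the same four operations apply unchanged to any table having the assumed columns.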
One difficulty when dealing with queries against databases is ensuring the validity and accuracy of query results that are returned from the databases in response to execution of the queries. Specifically, it must be ensured that no relevant information from the databases is missing in the returned query results. For instance, assume that a query is issued against a medical database of a hospital in order to identify patients having an undiagnosed condition, such as strep throat. Assume further that the symptoms of strep throat are fever, sore throat and body aches. Accordingly, a corresponding query can be issued against the medical database requesting information with respect to all patients having fever, sore throat and body aches in order to identify those patients having strep throat. An exemplary query is shown in Table I below, which, for simplicity, is described in natural language without reference to a particular query language.
TABLE I
QUERY EXAMPLE
001    FIND
002       Name, Age
003    FROM
004       Diagnoses
005    WHERE
006       Body Temperature > 99.5° F. AND
007       Sore Throat = Yes AND
008       Body Aches = Yes
Illustratively, the exemplary query shown in Table I is designed to retrieve data records (lines 001-002) from a Diagnoses database table (lines 003-004) which satisfy all query conditions defined in lines 005-008. More specifically, the exemplary query of Table I is configured to retrieve name and age (line 002) of patients with a body temperature of more than 99.5° F., a sore throat and body aches (lines 006-008).
However, one or more strep throat patients may not show all three symptoms. For instance, several patients may have taken some aspirin before coming to the hospital so that their body temperature is less than 99.5° F. Accordingly, information with respect to these patients will not be retrieved by the exemplary query of Table I, as these patients do not satisfy all conditions defined in lines 006-008 of the exemplary query. Thus, the returned query result would be inaccurate due to a lack of relevant information.
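The omission described above can be reproduced in a minimal sketch, assuming that the natural-language query of Table I is rendered as SQL and run with Python's standard sqlite3 module. The `diagnoses` table layout, the column names, the encoding of Yes/No as 1/0 and the three sample records are all hypothetical and serve only to illustrate the problem.

```python
import sqlite3

# Hypothetical rendering of the Diagnoses table; schema and data are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE diagnoses ("
    "name TEXT, age INTEGER, body_temperature REAL, "
    "sore_throat INTEGER, body_aches INTEGER)"  # 1 = Yes, 0 = No
)
rows = [
    ("Patient A", 29, 101.2, 1, 1),  # shows all three symptoms
    ("Patient B", 41, 98.9, 1, 1),   # took aspirin, so the fever is masked
    ("Patient C", 63, 98.6, 0, 0),   # shows none of the symptoms
]
conn.executemany("INSERT INTO diagnoses VALUES (?, ?, ?, ?, ?)", rows)

# One possible SQL form of the Table I query: every condition must hold.
result = conn.execute(
    "SELECT name, age FROM diagnoses "
    "WHERE body_temperature > 99.5 AND sore_throat = 1 AND body_aches = 1"
).fetchall()

conn.close()
print(result)
```

Under these assumptions, only Patient A is returned. Patient B satisfies two of the three conditions but is excluded because the conditions are combined with AND, which is the lack of relevant information noted above.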
Therefore, there is a need for an efficient technique for creating queries against databases which allows identification of relevant information therefrom.