1. Technical Field
The present invention is generally related to computer databases, and more particularly to computer database querying analysis.
2. Description of the Related Art
There are many known software systems and methods for querying a database. One of the most popular query languages is SQL (Structured Query Language). SQL provides a framework for logically structuring complex conditional expressions that can be used to query a database. SQL includes many different types of logical constructs, including the WHERE clause, the HAVING clause and the ON clause. A WHERE clause is typically structured as follows: WHERE (variable 1 <operator> condition 1) link (variable 2 <operator> condition 2). The WHERE clause then returns data records from the database that meet the two conditional expressions (variable 1 <operator> condition 1) link (variable 2 <operator> condition 2), depending on the type of link. Two common forms of logical links for conditional expressions are the “AND” link and the “OR” link.
For example, consider a database containing personnel records for a company. Each employee's data record may include variable fields for storing salary and age. A user may then query the database to find those employees who are older than 35 and make less than $50,000 by forming the SQL query: WHERE (age > 35) AND (salary < 50,000). Here, age is variable 1, “> 35” is condition 1, salary is variable 2, and “< 50,000” is condition 2. The logical link operator is the AND link.
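The query above can be tried against a small in-memory database. The following is a minimal sketch, assuming a hypothetical table named employees with illustrative rows; the table name, column names and data are not taken from the specification. Note that a numeric literal in SQL is written without a thousands separator, so the salary limit appears as 50000.

```python
import sqlite3

# Hypothetical personnel table; the names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, age INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Ann", 42, 48000),  # meets both conditions
        ("Bob", 29, 45000),  # fails the age condition
        ("Cy", 51, 72000),   # fails the salary condition
        ("Dee", 38, 39000),  # meets both conditions
    ],
)

# Two conditional expressions joined by the AND link.
rows = conn.execute(
    "SELECT name FROM employees "
    "WHERE (age > 35) AND (salary < 50000) ORDER BY name"
).fetchall()
matching_names = [row[0] for row in rows]
print(matching_names)
```

Only the records that satisfy both conditional expressions are returned; replacing AND with OR would return every record that satisfies at least one of them.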
When an SQL SELECT statement contains many query conditions that are interconnected with each other through Boolean logic, the overall retrieval results might be unexpected. A user might be expecting to see many records retrieved when only a few or no records were actually retrieved. Current database querying approaches do not effectively analyze the data flow within a database query so as to detect such problematic situations.
In the above SQL query example, the query condition “(age > 35)” may have such a restrictive effect upon the overall query that it acts to block all records. If it blocks all records, then the other query condition “(salary < 50,000)” is rendered insignificant since the ANDing of the conditions will result in zero records due to the “age” condition blocking all the records. Current approaches do not detect where in the database query the problem is occurring. The present invention overcomes these disadvantages as well as other disadvantages.
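The blocking situation described above can be reproduced with a short sketch. It assumes a hypothetical employees table in which no employee is older than 35; the helper count_where and the data are illustrative assumptions, not part of the specification. Counting the records matched by each condition on its own shows which condition is responsible for the empty result.

```python
import sqlite3

# Hypothetical data in which every employee is 35 or younger.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, age INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", 31, 48000), ("Bob", 29, 45000), ("Cy", 34, 72000)],
)


def count_where(condition):
    """Count the records that satisfy one condition.

    The condition is trusted illustrative SQL text, not user input.
    """
    sql = f"SELECT COUNT(*) FROM employees WHERE {condition}"
    return conn.execute(sql).fetchone()[0]


full_count = count_where("(age > 35) AND (salary < 50000)")
age_count = count_where("age > 35")
salary_count = count_where("salary < 50000")

print("full query:", full_count)
print("age > 35 alone:", age_count)
print("salary < 50000 alone:", salary_count)
```

The full query returns no records and gives no hint why. The per-condition counts show that the age condition matches nothing, while the salary condition would have matched records had the age condition not blocked them.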
In accordance with the teachings of the present invention, a computer-implemented method for analyzing a database query is provided. The database query contains a plurality of query conditions that are used to filter data records of a database. The query data flow analysis method identifies at least one query condition from the plurality of query conditions in the database query. The database is queried based upon the identified query condition. At least one results characteristic is determined that is associated with the query of the database with the identified query condition. The results characteristic is used to analyze the identified query condition.
According to another aspect of the invention, a computer-implemented system is provided for analyzing a database query. The database query contains a plurality of query conditions that are used to filter data records of a database. A query parser module identifies at least one query condition from the plurality of query conditions in the database query. A query condition executor module connected to the query parser module performs a query of the database based upon the identified query condition. A results analyzer module connected to the query condition executor module determines at least one results characteristic associated with the query of the database by the identified query condition. A results data structure connected to the results analyzer module stores an association between the identified query condition and the results characteristic. The results characteristic is used to analyze the identified query condition.
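The arrangement of parser, executor, analyzer and results data structure can be illustrated with a short sketch. Everything named here (parse_conditions, execute_condition, analyze, ResultsCharacteristic, the employees table) is a hypothetical stand-in chosen for illustration, not the implementation of the preferred embodiment. The parser is deliberately naive: it assumes each condition is wrapped in its own non-nested pair of parentheses. The record count is used as one possible results characteristic.

```python
from __future__ import annotations

import re
import sqlite3
from dataclasses import dataclass


@dataclass
class ResultsCharacteristic:
    """One possible results characteristic: how many records a condition matches."""
    matched: int
    total: int

    @property
    def blocks_all(self) -> bool:
        return self.matched == 0


def parse_conditions(where_clause: str) -> list[str]:
    """Naive parser stand-in: return the text of each non-nested (...) group."""
    return [c.strip() for c in re.findall(r"\(([^()]*)\)", where_clause)]


def execute_condition(conn: sqlite3.Connection, table: str, condition: str) -> int:
    """Executor stand-in: query the database with a single condition."""
    sql = f"SELECT COUNT(*) FROM {table} WHERE {condition}"
    return conn.execute(sql).fetchone()[0]


def analyze(conn: sqlite3.Connection, table: str,
            where_clause: str) -> dict[str, ResultsCharacteristic]:
    """Analyzer stand-in: associate each condition with its characteristic."""
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    results: dict[str, ResultsCharacteristic] = {}
    for condition in parse_conditions(where_clause):
        matched = execute_condition(conn, table, condition)
        results[condition] = ResultsCharacteristic(matched, total)
    return results


# Hypothetical data: no employee is older than 35.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, age INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", 31, 48000), ("Bob", 29, 45000), ("Cy", 34, 72000)],
)

results = analyze(conn, "employees", "(age > 35) AND (salary < 50000)")
for condition, characteristic in results.items():
    note = "blocks all records" if characteristic.blocks_all else "passes records"
    print(f"{condition}: {characteristic.matched}/{characteristic.total} ({note})")
```

The dictionary returned by analyze plays the role of the results data structure: it stores the association between each identified query condition and its results characteristic, so a condition that blocks all records can be singled out.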
It should be noted that these are just some of the many aspects of the present invention. Other aspects not specified will become apparent upon reading the detailed description of the preferred embodiment set forth below.