A database query is a request for a set of information from at least one table in a database. Queries are well known in the art. Queries are written in Structured Query Language (SQL), which is the de facto standard computer language for querying, updating, and managing databases. In a SQL query, a user specifies the desired columns in the database and the desired filters for filtering out unwanted data rows. For example, a basic query looks like:
SELECT last_name, first_name, city, state, zipFROM addressesWHERE state=‘TX’The above query returns a list of addresses from the “addresses” table in a database. The returned list contains addresses for people living in Texas and lists their last name, first name, city, state, and zip code. In the example query above, the user has included a SELECT clause, a FROM clause, and a WHERE clause. The SELECT clause specifies the columns which the user desires from the addresses table. The FROM clause specifies that the query is to search the addresses table. The WHERE clause filters out the rows in the addresses table which do not contain TX in the state column. Persons of ordinary skill in the art are aware of how to write a query using SELECT, FROM, and WHERE clauses.
When the desired data is located in a plurality of tables, the query has to join together the plurality of tables in order to gather the requested data. For example, consider a database comprising three tables: table “addresses” containing the columns “customer_ID,” “last_name,” “first_name,” “city,” “state,” and “zip,” table “orders” containing the columns “customer_ID,” “order_no,” “quantity,” and “description,” and table “DVD_list” containing “last_name,” “first_name,” and “quantity.” Each table contains information stored in a plurality of rows. If an administrator wants an alphabetical listing of the customers who ordered DVD players and live in Texas, including the number of DVD players ordered, the query would be:
SELECT last_name, first_name, quantityFROM addresses AS AJOIN DVD_list AS D  ON A.customer_ID=D.customer_IDJOIN orders AS O  ON O.customer_ID=D.customer_IDWHERE A.state=‘TX’ AND O.description=‘DVD Player’ORDER BY 1As previously stated, the required information was not present in a single table. Therefore, the user wrote a query that joins the addresses table and the orders table into the DVD_list table. The above query contains a JOIN command, which joins one table to another. When the JOIN command is used without a modifier (i.e. OUTER JOIN or LEFT OUTER JOIN), the JOIN command is an inner join and includes only the rows that contain non-null data entries. The ON command specifies how the tables relate to one another and sets the joining criteria. In other words, the ON command specifies the joining conditions, which is how the rows in the two tables line up together when the tables are joined. The WHERE command specifies the filtering condition that determines which rows are desired in the output by filtering out the unwanted rows. The AND command is used when specifying numerous WHERE commands.
In the previous example, the user used an alias for each table in order to make the query easier to write and read. Aliases are recognized by the AS command. The aliases are A for the addresses table, D for the DVD_list table, and O for the orders table. The user also specified that the returned table should be sorted by the first column using the ORDER BY 1 command. Alternatively, the user could have accomplished the same result by substituting the last line of the query with ORDER BY last_name.
The query listed above is the more complex syntax for writing a query. Newer versions of SQL allow the user to write the same query in a simpler syntax:
SELECT last_name, first_name, quantityFROM addresses A, orders O, DVD_list DWHERE A.customer_ID=D.customer_ID AND O.customer_ID=D.customer_ID  AND A.state=‘TX’ AND O.description=‘DVD Player’ORDER BY 1The above query eliminates the need for the explicit JOIN, ON, and AS commands. The FROM command lists all of the tables which will be joined, separated by commas. The aliases for the tables are specified immediately after the table name in the FROM command. The joining conditions are placed in the WHERE command along with the filtering conditions. It is important when using the above syntax that the two types of WHERE clauses are distinguished from each other. The first two clauses in the WHERE clause above are the joining conditions and are referred to herein as (join) WHERE clauses. The last two clauses in the WHERE clause above are filtering conditions and are referred to herein as (filter) WHERE clauses. Persons of ordinary skill in the art can appreciate the difference between (join) WHERE clauses and (filter) WHERE clauses in a query.
When dealing with queries, the users frequently want to request a dynamic set of fields. In other words, the user would like to be able to create a SQL template which contains most of the necessary language for writing the query, but lacks the specific SELECT, FROM, JOIN, and WHERE clauses. An example of a SQL template is:                SELECT * FROM $F $J WHEREWhen the user dynamically creates the SQL query, the * is replaced by the columns in the SELECT clause, the $F placeholder is replaced by the generated FROM clause, the $J placeholder is replaced by the generated JOIN clause, and the (join) and (filter) WHERE clauses will be appended after the WHERE clause. Thus, when using these SQL templates, the administrator only needs to fill in the desired fields and filters to complete the query. The prior art method of creating the SQL template is to join together every database table and use WHERE clauses to filter out the unwanted information. For example, if an administrator has an order ID and wants the user's name and email address, the prior art query looks like:        
SELECT b.user_name, b.user_emailFROM Orders a, Users b, Addresses c, Products d, OrdersProductMap eWHERE a.user_id=b.user_id AND b.user_id=c.user_id ANDa.order_id=e.order_id  AND e.product_id=d.product_id AND a.order_id=“1234”The above query is preferred by SQL templates because all of the tables in the database have been joined together. The user need only enter the appropriate SELECT and WHERE clauses to obtain the desired output from the query.
However, the prior art method is inefficient in that it unnecessarily joins all of the tables together to obtain the desired information. A more efficient method of executing the query would only join the minimum amount of tables required to obtain the desired output. Using the above example, a query with only the necessary joins looks like:
SELECT b.user_name, b.user_emailFROM Orders a, Users bWHERE a.user_id=b.user_id AND a.order_id=“1234”Although the number of joins is reduced in the above query, it is not preferred by users because the user must manually alter the FROM clause and the WHERE clause in the SQL template every time a new query is developed. The prior art does not disclose an automated method for generating only the necessary table joins in both the FROM clause and the WHERE clause. Therefore, a need exists in the art for a method for creating a query in which only the necessary tables are joined together.
Limiting the number of joins in a query is important because running the query utilizes an excessive amount of computing resources. In other words, processing the query requires large amounts of processing power, computer memory, and time. Computer resources and time are valuable to database administrators who seek methods for decreasing costs and increasing available resources. Therefore, a need exists in the art for a method and system for reducing the time and resources required to process a query.
The prior art has previously addressed the need for decreasing the amount of time and resources required to process a query. U.S. Pat. No. 5,680,603 (the '603 patent) entitled “Method and Apparatus for Reordering Complex SQL Queries Containing Inner and Outer Join Operations” discloses a method which translates the query into a hypergraph representation and generates required sets, conflict sets, and preserved sets for the hypergraph. Using the required sets, the '603 patent enumerates a plurality of plans which represent associative reorderings of relations in the query. The SQL operators are selectively assigned to each of the enumerated plans using the conflict sets and/or preserved sets, so that the results from the plans are identical to the original query. While the method disclosed in the '603 patent reduces the execution time of a query, it does not address the problem of limiting the number of table joins in the query. Therefore, a need exists for a method of reducing the execution time of a query in which only the necessary tables are joined together.