In database systems, to query a database, a query is issued. A query is normally encoded in a query language, such as Structured Query Language (“SQL”). If a query is for data from more than one table, or selected as a function of data stored in one or more tables other than a table data is to be selected from, one or more table joins need to be made. Table joins are made on key columns of data shared between two or more tables. However, table joins are often expensive database transactions in terms of time and database management systems utilized to make joins.
There exist for some database management systems, such as database management systems available from Teradata, query generator tools. These tools receive as input, database table-column pairs for data a user wishes to obtain from the database. The query generator builds a query to fulfill the users request and typically returns a SQL statement string. It is common for queries generated using a query generator to invoke table joins to obtain the data.
Specifying the table(s) for non-key columns is normally a fairly trivial task. Normally, each piece of non-key column data is available from only one table, or a small number of tables in the case of summarized data. The one table, or limited number of tables, are usually known to the calling application or end-user along with the columns.
However, specifying the table or tables for key columns is a much more difficult task. A key column can legitimately appear in many tables. For instance, in a customer relationship management oriented database, a column INDIVIDUAL—ID key may appear in as many as half of the tables in the database. From an application's point of view, there is usually not a single, straightforward method of selecting the “best” instance of the data item.
Some applications and end-users issue SQL statements directly to a database, while others submit table and column pairs of data items to retrieve to a query generator. The query generator will then build a SQL statement, which when executed, will retrieve the data. However, the directly issued SQL statement and the table and column pairs submitted to a query generator often will not specify the “best” table to obtain the data from. Best may be considered to be a table chosen to minimize a number of joins necessary to fulfill a query. Often, unnecessary table joins are made.