One conventional approach to estimating the cost of executing a structured query language (SQL) query, prior to actual execution of the query, is to save the historical execution costs associated with previous SQL queries. Then, when a new SQL query is submitted for consideration, that new SQL query is compared to the previous SQL queries in order to identify a “similar” previous query. The historical execution cost for that “similar” previous query is then used as the estimate for the new query. More specifically, the comparison involves a direct comparison of the SQL statements themselves.
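The conventional approach described above can be illustrated with a minimal sketch. The source does not say how the SQL statements are compared, so the use of Python's `difflib.SequenceMatcher` as the text-similarity measure is an assumption made only for illustration. The names `HISTORY` and `estimate_cost` and the sample queries and costs are likewise hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical history: (SQL text, observed execution cost in seconds).
HISTORY = [
    ("SELECT * FROM orders WHERE region = 'EU'", 2.0),
    ("SELECT name FROM customers WHERE id = 7", 0.1),
]


def text_similarity(a, b):
    """Character-level similarity in [0, 1]; an assumed stand-in for the
    unspecified 'direct comparison of the SQL statements'."""
    return SequenceMatcher(None, a, b).ratio()


def estimate_cost(new_sql, history):
    """Return (estimated_cost, similarity) taken from the stored query
    whose SQL text is most similar to new_sql."""
    best_sql, best_cost = max(
        history, key=lambda item: text_similarity(new_sql, item[0])
    )
    return best_cost, text_similarity(new_sql, best_sql)


if __name__ == "__main__":
    cost, sim = estimate_cost(
        "SELECT * FROM orders WHERE region = 'US'", HISTORY
    )
    print(f"estimated cost: {cost}s (text similarity {sim:.3f})")
```

In this sketch the estimate depends only on how alike the two strings look. Nothing in the comparison reflects how much data a query would actually touch, so a statement that differs by a single literal simply inherits the stored cost of its closest textual match.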
While this approach is sometimes useful where a new SQL query is identical to a previous SQL query (in terms of both structure and data), the present inventors have discovered that it often produces unreliable results when the supposedly “similar” previous query statement differs only slightly from the current query statement (e.g., a difference in a single data field). As a result, such an approach often is not very useful for the extremely complex, one-of-a-kind SQL statements common in data-warehouse/business-intelligence systems (some of which are essentially programs consisting of hundreds of lines of SQL code).