To access the data contained in databases, such as an SQL (structured query language) database, statements (that includes queries, insert, updates, and deletes) are used by a database management system (DBMS). A collection of these statements which are to be executed by the DBMS against the database is known as a workload. Database characteristics such as the database structure and system configuration can impact the performance of workload when executed against the database.
Database analysis tools have been developed which aid in the selection of database characteristics such as materialized views, indexes, multi-node partitions and multi-dimensional clusters that will improve the workload performance against the database.
Individual analysis tools are utilized to improve workload execution relative to a performance metric by aiding in the selection of improved database characteristics. The individual database characteristics are provided as design recommendations for the modification of the database's structure. Characteristics such as materialized views and indexes, multi-dimensional clustering and multi-node partitioning can be individually implemented to improve the design of a database thus improving the execution cost of the workload. These analysis tools are implemented independently and only provide individual design recommendations. The design recommendations provided by analysis tools are generated independently of one another and do not take into account overall execution costs of the analysis tools or the impact of previously selected characteristics. In addition, the interaction or dependencies between the characteristics selected by individual analysis tools are not accounted for, and can have a significant impact on the resulting database characteristics design recommendations and utilization of resources.
In order to reduce the cost of executing analysis tools based upon metrics such as, for example, execution time, compression techniques have also been developed for compressing workloads by reducing a large set of statements to a smaller set of statements which adequately represent the overall workload. Typically, the more complex a statement in a workload is, (i.e. the higher execution cost) the more benefit will be obtained in terms of execution cost of the analysis tool by utilizing a compressed workload. In general, the analysis tool should concentrate upon those statements that contribute the most to the total time to execute a workload, i.e. the statements that currently incur the highest execution cost. Even though the analysis tool might be able to improve lower cost statements or queries by a significant percentage, such improvements will have minimal impact on the overall absolute cost to execute the given workload, and so the statements that have lower execution cost can be omitted in the compressed workload to reduce the execution time of the analysis tool. Compression may be implemented prior to execution of the analysis tool in order to decrease execution time.
Therefore, there is a need for a method, system and computer program product that selects database characteristics such as materialized views and indexes, multi-dimensional clusters and multi-node partitions for a database that will result in an a reduced workload execution cost.