1. The Field of the Invention
The present invention relates generally to computer-implemented database systems. More specifically, the present invention relates to a system and method for filtering query statements from multiple plans and packages according to user-defined filters of query explain data.
2. The Relevant Technology
Databases are computerized information storage and retrieval systems. A Relational Database Management System (RDBMS) is a database system which uses relational techniques for storing and retrieving data. Relational databases are organized into tables consisting of rows (tuples) and columns of data. A database typically includes many tables, and each table includes multiple rows and columns. The tables are conventionally stored in direct access storage devices (DASD), such as magnetic or optical disk drives, for semi-permanent storage.
Generally, users communicate with an RDBMS using a Structured Query Language (SQL) interface. The SQL interface allows users to create, manipulate, and query a database by formulating relational operations on the tables, either interactively, in batch files, or embedded in host languages such as C and COBOL. SQL has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO).
The SQL standard provides that each RDBMS should respond to a particular query in the same way, regardless of the underlying database. However, the method that the RDBMS actually uses to find the requested information in the database is left to the RDBMS. Typically, there is more than one method that can be used by the RDBMS to access the requested data. The RDBMS, therefore, attempts to select the method that minimizes the computer time and resources (i.e., the cost) required to execute the query.
The RDBMS determines how to execute the SQL statements. The set of steps created by the RDBMS for executing the SQL statements is commonly referred to as the “access path.” In other words, the access path is a sequence of operations used by the RDBMS to obtain the data requested by the SQL query. Depending on the access path, an SQL statement might search an entire table space, or, alternatively, it might use an index. The access path is the key to determining how well an SQL statement performs. The description of the access path is stored in a plan table, which typically stores the access path data for a plurality of SQL statements.
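By way of illustration only, the difference between a full scan and an indexed access path can be observed in a small sketch. The sketch assumes SQLite (through Python's standard sqlite3 module) as a stand-in for the RDBMS; SQLite's EXPLAIN QUERY PLAN output is not the plan table described above, and the table name emp, the index name idx_emp_dept, and the helper access_path are hypothetical names chosen for the example.

```python
import sqlite3


def access_path(conn, sql, params=()):
    """Return the planner's textual description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail); detail is the readable step.
    return [row[3] for row in rows]


# Assumption: an in-memory SQLite database stands in for the RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
query = "SELECT * FROM emp WHERE dept = ?"

# Without an index on dept, the only available path is a scan of the table.
before = access_path(conn, query, ("sales",))

# After an index is added, the planner can choose an indexed search instead.
conn.execute("CREATE INDEX idx_emp_dept ON emp (dept)")
after = access_path(conn, query, ("sales",))

print(before)
print(after)
```

The exact wording of each step varies between SQLite versions, but the first result describes a scan and the second names the index, which mirrors the distinction between searching an entire table space and using an index.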
In addition to determining the access path, some databases estimate the cost for executing each SQL statement. The estimated costs are typically stored in a statement table (as in the case of DB2® for OS/390®) or a similar table within the database. Like the plan table, the statement table stores the estimated statement costs for a plurality of SQL statements.
Databases also typically include statistics for such database objects as table spaces, indexes, tables, and columns. For example, in the case of a table, the statistical data may include the number of pages that contain rows of the table, the number of rows and columns in the table, as well as various other statistical data. The statistics are typically derived from the RDBMS “catalog,” which is an object that describes the entire database.
The above-described access path data, statement cost data, and object statistics data (referred to collectively as “query explain data”) assist the user in analyzing and improving the performance of SQL statements. For example, a query with a higher-than-average statement cost might alert the user to inefficiencies in the access path. By viewing the access path data, the user can selectively make changes to the query and/or the database, such as by adding an index in order to avoid a table space scan. The object statistics data similarly assists the user by describing the structure and organization of the database.
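As a minimal sketch of the cost-based analysis mentioned above, the following example flags statements whose estimated cost exceeds the average. It assumes the statement cost data has already been read into memory; the StatementCost record, its field names, and the sample values are hypothetical and do not reflect the layout of any actual statement table.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class StatementCost:
    # Hypothetical fields; a real statement table has its own column layout.
    statement_id: int
    package: str
    estimated_cost: float


def above_average(rows):
    """Return the statements whose estimated cost exceeds the mean cost."""
    if not rows:
        return []
    threshold = mean(r.estimated_cost for r in rows)
    return [r for r in rows if r.estimated_cost > threshold]


# Illustrative sample data only.
rows = [
    StatementCost(1, "PKG_A", 12.0),
    StatementCost(2, "PKG_A", 950.0),
    StatementCost(3, "PKG_B", 38.0),
]

flagged = above_average(rows)
for r in flagged:
    print(r.statement_id, r.package, r.estimated_cost)
```

A statement flagged in this way would be a candidate for closer inspection of its access path data.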
Unfortunately, analyzing SQL query performance is currently too time-consuming and complex for the average user. Typical systems include hundreds or thousands of query statements, so identifying the one or more statements that need to be improved can be a difficult task. In conventional systems, the user must manually locate the relevant query explain data in a plurality of tables, e.g., the plan table, the statement table, the function table, and the catalog tables, which can be tedious.
Moreover, each of the tables typically stores information corresponding to many different statements and objects. Consequently, the tables are often very large, making it difficult to locate the desired data. Likewise, the tables are often cryptic and hard to understand, even for database experts. For example, the plan table typically includes sixty or more columns and hundreds or thousands of rows. The access path data is stored in a tabular format, which, although easily processed by the RDBMS, is often too complicated for a user to analyze effectively.
Furthermore, the query statements to be filtered are typically included in a number of packages and plans. A package is a collection of query statements found in a single application program. A plan is also a collection of query statements, but may include statements from one or more application programs. In large-scale database systems, the number of plans and packages is correspondingly large. Thus, a difficulty arises when trying to locate a particular package or plan for purposes of analyzing the query statements contained therein.
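The relationship between packages and plans, and the task of locating one of them by name, might be sketched as follows. This is an illustrative model only: the Package and Plan classes, the choice to represent a plan as a list of packages, the filter_by_name helper, and all sample names are assumptions made for the example rather than structures defined by any particular RDBMS.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Package:
    # Statements drawn from a single application program.
    name: str
    program: str
    statements: list = field(default_factory=list)


@dataclass
class Plan:
    # Assumption: a plan is modeled as a list of packages, so it may span
    # one or more application programs.
    name: str
    packages: list = field(default_factory=list)

    def statements(self):
        """Collect every statement from every package in the plan."""
        return [s for p in self.packages for s in p.statements]


def filter_by_name(items, pattern):
    """Keep the packages or plans whose name matches a wildcard pattern."""
    return [i for i in items if fnmatchcase(i.name, pattern)]


# Illustrative sample data only.
payroll = Package("PAYROLL1", "payroll.c", ["SELECT * FROM emp"])
billing = Package("BILLING1", "billing.cbl", ["SELECT * FROM invoice"])
plans = [Plan("NIGHTLY", [payroll, billing]), Plan("ADHOC", [billing])]

print([p.name for p in filter_by_name([payroll, billing], "PAY*")])
print([p.name for p in filter_by_name(plans, "N*")])
print(plans[0].statements())
```

With hundreds of packages and plans, a name filter of this kind narrows the list before any individual query statement is examined.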
Accordingly, what is needed is a system, method, and article of manufacture for selectively filtering, reducing, or otherwise focusing a list of query statements based on a user's customized data requirements for improving SQL performance. Additionally, what is needed is a system, method, and article of manufacture for filtering query statements according to user-defined filters of query explain data. What is also needed is a system, method, and article of manufacture for generating user-defined filters of object reference data, statement cost data, and access path data. Moreover, what is needed is a system, method, and article of manufacture for storing and retrieving sets of user-defined filters. In addition, what is needed is a system, method, and article of manufacture for selectively filtering, reducing, or otherwise focusing a list of packages or plans according to user-defined filtering criteria.