The present invention relates generally to database management systems and, more particularly, to a generic, platform independent, and extensible technique for graphically presenting the evaluation of a query in a database management system.
Information is frequently stored in computer processing systems in the form of a relational database. A relational database stores information as a collection of tables having interrelated columns and rows. A relational database management system (RDBMS) provides a user interface to store and retrieve the information and provides a query methodology that permits table operations to be performed on the data. One such query methodology is the Structure Query Language (SQL) interface, which permits users to formulate operations on the data tables either interactively, or through batch file processing, or embedded in host languages such as C, COBOL, and the like.
In general, SQL provides table operations with which users can request database information. The table operations are specified in SQL statements or queries. An example of an SQL statement or query is provided below in Table 1:
One of ordinary skill in the art will understand that xe2x80x9cMgrtabl.Namexe2x80x9d indicates a Name column for Table Mgrtabl. Depending on the RDBMS platform, SQL statements will vary because the RDBMS platforms are defined differently and/or may offer different functionality.
While a query allows the user to specify the data desired, it does not detail how the data will be retrieved from the tables. In general, there may be several ways to retrieve the data that answers a given query. Accordingly, a query optimizer is utilized in the RDBMS to determine the most cost-effective way to retrieve the data.
FIG. 1 is a high-level block diagram of query processing in a query compiler that could be used with the present invention. As is shown, an input SQL query 300, which can be an actual query or some other command, is typically rewritten by a parser 310 into a RDBMS internal representation, which can then be converted into a query graph model (QGM) 320. Conceptually, the QGM 320 is a high-level, graphical representation of a statement, such as the input query 300, in which boxes represent relational operations and arcs connecting the boxes represent quantifiers that reference tables. It should be noted that not every query optimizer uses the QGM 320 to represent a statement. Other RDBMS platforms might, for instance, convert the input query 300 into a proprietary format different from a QGM 320. Nevertheless, as will be discussed below, the format of the RDBMS internal representation is not material to the present invention. Thus, for purposes of this discussion, the QGM 320 is illustrative rather than limiting.
The internal representation of the input query 300, in this case the QGM 320, undergoes various refinements during query processing. For instance, Query Global Semantics 330 and Query Rewrite Transform 340 are two typical processes that refine the QGM 320. At each stage of query processing, the evolving QGM 320 represents a transitional statement or plan (referred to as xe2x80x9cQGM transition plansxe2x80x9d), which can be written to a query EXPLAIN table 350, so that a developer can evaluate the QGM transition plans 320 at intermediary stages of query processing.
After the QGM 320 is refined, cost-based plan optimization 360 is performed whereby the QGM 320 is graphically traversed and a plurality of query execution plans (QEPs) are generated. The QEPs represent alternative ways of processing the data requested in the input query 300. Each QEP is compared to determine which of them is the most cost-efficient. The most cost-efficient query plan is referred to as the best QEP 370 or BQEP. The cost-based optimization process 360 can be performed using a number of different techniques, as is well known to one skilled in the art. For purposes of this discussion, however, the method utilized for cost-based optimization 360 is not material, and therefore, requires no further elaboration. After the BQEP 370 is selected, it is further refined by a threaded code generator 380. The threaded code generator 380 creates a plan section 390 of executable code for each input SQL statement 300. The plan section 390 is then executed by the RDBMS at run time.
FIG. 2 illustrates a graphical representation of the BQEP 10 of the query in Table 1 above. As is shown, the BQEP 10 can be represented in a work-flow graph of operators, in which each node 12a-12k of the graph represents either a source database object, e.g. a table, containing rows and columns in an RDBMS (12g, 12j, 12k), or some operation to the data (12a-12f, 12h, 12i, 12j). Each arc of the graph 14a-14j represents the flow of data between two or more operations, or between an object and an operation. Thus, a path in this graph represents the flow of data from its raw form stored in the database to the finished form requested by the user in her query.
While the graphical representation in FIG. 2 is helpful to the user in visualizing and evaluating the access path of a query, unless specifically requested, the query optimizer does not generate a graphical representation of the BQEP or any other plan. Rather, the plan section (FIG. 1, item 390) is generated in a proprietary format of the RDBMS, which is typically not user readable. If requested by the user, the BQEP 370 can be written to one or more tables, collectively known as an EXPLAIN table (FIG. 1, item 380). Thus, in order to evaluate the BQEP, the user must analyze the EXPLAIN table 380. This can be a daunting task because the EXPLAIN table 380 is not an intuitive, i.e. user friendly, representation of the BQEP 370, especially if the BQEP 370 is complicated. Accordingly, evaluation of the BQEPs 370 based on the EXPLAIN table 380 is tedious and time consuming.
Moreover, while input queries 300 are typically written in SQL, the resulting plans, e.g., QGM transition plans, query execution plans and the BQEP 370, are platform dependent. The major database management system platforms supporting SQL, including DB2, SQL/DS, Oracle, Sybase, SQLbase, INFORMIX and CA-Openlngres (Ingres), each have their own enhancements, quirks and tricks, such that plans generated by each platform vary from one platform to the other based on platform specific formats. Therefore, any application that reads plans from different database management system platforms would be required to be customized in order to understand the plans. This type of maintenance is error-prone and unmanageable.
Accordingly, a need exists for a method and system for representing plans, including QGM transition plans and QEPs, in a graphical manner, such that the user can visualize the plans in a comprehensive manner. The method and system should be platform independent, so that plans generated by different database management system platforms can be presented to the user without regard to the system platform. The present invention addresses such a need.
A platform independent method and system for graphically representing a plan, including a query execution plan, for a query in a relational database management system is disclosed. The method includes receiving and processing an input query to form a plurality of plans, selecting at least one plan of the plurality of plans, and transforming the selected plan into a self-describing formatted file which is platform independent. The method further includes generating a graph representing the selected plan from the self-describing formatted file.
Through aspects of the present invention, any plan, including a best query execution plan for a query and QGM transition plans, generated from any RDBMS platform is represented as a self-describing formatted file, which is then used to generate a graph representing the plan. Because the self-describing formatted file is independent of the RDBMS platform, a graphing tool, which reads the self-describing formatted file, need not understand or differentiate between platforms. In essence, the RDBMS becomes transparent to the graphing tool. Thus, application software updates to the graphing tool in order to accommodate new database system platforms are eliminated. Moreover, because the present invention utilizes a self-describing file (i.e., extensible), nuances from each platform can be captured and conveyed to the graphing tool.