A data base may be defined as an integrated set of data that model a given universe. The data used by various applications are grouped in the same base in such a way as to avoid the problems of data duplication. Associated with a data base is a conceptual schema that describes the structure and type of the data that it contains and the constraints, which must always be verified. The data base administrator has the task of making sure of the verification of these constraints.
The data administration has three roles. First, it defines the data base users by assigning them a name, a password, and a set of privileges or rights for access to the data. It also defines the administrator or administrators, who are the persons authorized to handle all or part of the administration of the data base. The second role of the administration is to assign to the administrator, or cause him to participate in, the definition of the conceptual schema of the data base. The administrator thus defines the schemas of the various tables and the rules relating to these tables. In his final role, the administrator has possibility of benefiting, in terms of performance, integrity and security, from the conceptual schema of the base and of the RDBMS. It is this third role upon which the subject of the present invention essentially bears.
When an RDBMS application is being developed, the role of an administrator is to optimize the use of the RDBMS's resources. When a query is made at the time of the application, the RDBMS chooses an access route for executing the query. To optimize the performance of the application, the administrator must analyze the queries sent and decide on the action to be taken. His action may be to change the form of the query, and/or to create or delete indexes in the tables concerned. His decision depends on the information that indicates to him whether or not the current form of the query is adequate, or whether the existing indexes are pertinent.
The indexes are physically independent of the data and may thus be deleted and created at any moment. An index is used at the time of access to the data, regardless of who created it. A table may have several indexes at the same time. A compromise must be made between the number of indexes and the overhead involved in updating them. An index may apply to more than one column, and in that case it can assure the uniqueness of the columns used. It also improves the performance of the queries for which the selection condition will affect more than one column. A query with more than one predicate can use multiple indexes, if they are unique, if the predicates are equalities and are defined in the same table. In that case, the partial results from each index are merged to constitute the definitive result. Some RDBMSs are capable of using the properties of indexes and the characteristics of columns to decide which index to take and which to avoid for the merge operation. Whether or not to use an index is decided by the RDBMS optimizer. The decision varies depending on the type of RDBMS.
On the other hand, the schema of an RDBMS may evolve dynamically. For example, one or more fields may be added to a table, and indexes may be created in a table. In this context, it may happen that the RDBMS does not encounter an index that it must use, or encounters an index that it must not use, for example if access is being gained to more than 10% of the articles. In this latter case, the index becomes a hindrance. Given all the possibilities in choice of the RDBMS to set up its plan, it is estimated that the utilization of the resources can vary within a ratio of 1:10,000, depending on the formula of the query and the choice of indexes.
At present, optimizing a query is complicated. First, it is not automatic, and specific RDBMS tools must be used to request the plan of execution of the query. This plan is the concatenation of operations to be executed by the data base server in order to process the query. To request this plan, the user must write the query in the language of the RDBMS. Hence, writing the query requires good knowledge of the query language of the RDBMS. Moreover, it is rigorous and arduous work and involves a great loss of time, aggravated by the risks of mistakes that ensue. In response, the RDBMS furnishes the results of the request of the execution plan of the query, in a variable form depending on the RDBMS. The plan is written in the query language of the RDBMS and is currently present in the form of a long table with multiple columns, for example. The table is written in linear mode and appears on the screen only one page at a time. Analyzing the table is accordingly highly complicated, and the user ordinarily makes his task easier by requesting a printout of the table on paper. This means that he must go through all the printing processes specific to that terminal. The printout is done page by page, with column headings repeated each page. Consequently, in the coded language, the administrator must decrypt the hierarchy of processing the query and reconstitute the plan by analyzing numerous columns and a large number of disordered lines on various pages. This makes the search long and difficult. Moreover, the administrator does not have a global view of the plan available, which would ordinarily be too long to obtain and would be overly bothersome in proportion to the advantage he would gain from it. Nevertheless, the administrator must overcome this disadvantage as well as he can, in order to decide how to optimize the query. For example, if he modifies the query, he must rewrite the modified query, analyze the new table, and compare it with the old table. The tools used to create an analysis table do not save the preceding table in memory, and hence it must be printed out in order to be able to refer to it to make the comparison with the new table. In practice, the comparison accordingly consists of gathering the long, difficult analyses of the two tables. Finally, the administrator lacks a visual display of information about the objects used in the processing. It will be appreciated that under these conditions, optimizing a query is difficult to achieve rapidly, even for a skilled administrator.