The present invention relates to a method and apparatus for processing a database, suitable for query processing in a relational database management system.
In a database management system (DBMS), particularly a relational DBMS, a query expressed in a non-procedural language is processed, and an internal processing procedure is determined and executed. Principal prior art methods of query processing include a method that determines a single internal processing procedure based on a predetermined rule (for example, Smith, J. M., et al., "Optimizing the Performance of a Relational Database Interface", CACM Vol. 18, No. 10, Oct. 1975, pp. 568-579) and a method that selects a plurality of candidate processing procedures in accordance with various statistical information and determines the optimum one by cost evaluation (for example, Selinger, P. G., et al., "Access Path Selection in a Relational Database Management System", Proc. ACM-SIGMOD, 1979, pp. 23-34). In the former, the load of preparing the processing procedure is low, but the validity of a uniformly applied rule and the optimality of the selected internal processing procedure are in question. In the latter, the load is high, since it requires managing various statistical information, preparing a plurality of candidate processing procedures, and evaluating their costs, but it provides an optimum processing procedure.
Where a query language is combined with a host language (COBOL, PL/I, etc.), the query is pre-processed before the application program is executed, to prepare an internal processing procedure in executable form. In a query expression, variables of the host language are frequently written in a retrieval condition expression. Those variables are replaced by constants when the internal processing procedure resulting from the pre-processing is executed.
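The situation described above can be illustrated with a minimal sketch. It uses Python's standard sqlite3 module purely as a stand-in for a query language embedded in a host language; the table `parts`, its columns, and the function `find_cheaper_than` are hypothetical names chosen for illustration and are not taken from any system discussed here. The query text is fixed and its retrieval condition contains a placeholder for a host variable; a constant is substituted only when the query is executed.

```python
import sqlite3

def find_cheaper_than(conn, max_price):
    # The query text is fixed in advance; its retrieval condition refers to
    # a variable (the "?" placeholder) rather than to a constant.
    query = "SELECT name FROM parts WHERE price < ? ORDER BY name"
    # The constant is supplied only at execution time.
    return [row[0] for row in conn.execute(query, (max_price,))]

# Hypothetical sample data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (name TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?)",
    [("bolt", 5), ("gear", 40), ("nut", 2)],
)
```

The same query text serves every call; only the value bound to the placeholder differs from one execution to the next.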
The database comprises a relation, which appears to a user as a two-dimensional table including rows and columns. Each row comprises one value from every column of the table.
The prior art method that determines an optimum processing procedure by cost evaluation relies on the ratio of data satisfying the retrieval condition that appears in the query. During the pre-processing, however, the retrieval condition expression in the query expression still contains variables. Accordingly, the ratio of data satisfying the retrieval condition expression cannot be estimated, and the cost evaluation cannot be carried out. As a result, where variables appear in the retrieval condition expression, either the cost evaluation is based on a default value assigned, according to the type of retrieval condition expression, as the ratio of data satisfying the condition, or a single internal processing procedure is prepared based on a predetermined rule. In either case, the resulting internal processing procedure has no clear decision criterion regarding optimization, and it lacks validity.
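The problem can be made concrete with a small sketch under stated assumptions. The cost model below is deliberately simplified and hypothetical: a full scan is assumed to cost one unit per page, and an index access is assumed to cost one page fetch per qualifying row. The default ratios in `DEFAULT_SELECTIVITY` and the function names are illustrative placeholders, not values or interfaces taken from any particular system.

```python
# Illustrative default ratios per condition type (assumed values).
DEFAULT_SELECTIVITY = {"=": 0.10, "<": 1 / 3, ">": 1 / 3}

def choose_plan(rows, pages, selectivity):
    # Simplified cost model: scanning reads every page once; index access
    # is assumed to fetch one page per qualifying row.
    scan_cost = pages
    index_cost = selectivity * rows
    return "index" if index_cost < scan_cost else "scan"

def plan_at_preprocessing(rows, pages, operator):
    # The variable's value is unknown during pre-processing, so only the
    # default ratio for the condition type is available.
    return choose_plan(rows, pages, DEFAULT_SELECTIVITY[operator])

# Hypothetical table: 10,000 rows stored on 500 pages.
default_choice = plan_at_preprocessing(10_000, 500, "<")
# If the constant bound at execution time actually selects 0.1% of the rows:
actual_choice = choose_plan(10_000, 500, 0.001)
```

Under these assumptions the default ratio leads to a scan, whereas the ratio that actually holds at execution time would favor the index, so the procedure fixed during pre-processing is not the optimum one.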