The present invention relates in general to a database system for resolving an invocation of overloaded routines, and more particularly to a technology which is effectively applied to a database system which supports overloaded routines and which analyzes an invocation of a group of overloaded routines to determine the routine to be called.
To meet the demand of users who want to handle data having various structures in a database, object databases have been studied and developed, attempts have been made to adopt the object oriented concept into relational databases, and the standardization therefor has advanced in the form of the next term SQL (Structured Query Language).
In the next term SQL, in order to handle data having various structures, a user-defined data type called the abstract data type is introduced. In addition, manipulation of those data is realized by means of functions and procedures (hereinafter, referred to as “routines” for short, when applicable), and the multiple definition (overloading) of those routines is supported.
SQL will hereinafter be described as an example of a database query language. However, the description applies similarly to any other database query language having features similar to the abstract data type and the multiple definition.
First of all, the multiple definition and the abstract data type of the next term SQL will hereinbelow be described as the prior art relating to the present invention. Firstly, the multiple definition will be described. Defining a plurality of routines which have the same name and which differ from one another in the number of arguments or in the data types thereof is called the multiple definition (hereinafter, “arguments” means the arguments on the routine calling side and “parameters” means the arguments on the called routine side). For a routine invocation, the database system selects and applies, from among the overloaded routines, the routine which is optimal in both the number of parameters and the data types thereof.
In SQL/PSM, which has been developed from the database structured query language SQL, functions and the multiple definition thereof are adopted into SQL. SQL/PSM is described, for example, in ISO/IEC DIS 9075-4: 1996, “Database languages SQL Part 4: Persistent Stored Modules (SQL/PSM)” or the like. The multiple definition of functions in SQL/PSM will now be described by giving an example.
FIG. 17 is a diagram showing a conventional example of an SQL statement based on which overloaded functions are called. In this example 101 of the SQL statement, the function numeric_string is called, which takes a value of a numeric type as the argument and converts the value into a character string to be returned. Reference numeral 102 designates the definition of the functions. The definition 102 of the functions shows that a function 103 takes as the argument an INTEGER type (integer type) and a function 104 takes as the argument a FLOAT type (floating-point number type). In practice, those functions are defined by a function definition statement such as CREATE FUNCTION.
In an SQL statement 105, a variable x is declared as being of the INTEGER type and a variable y is declared as being of a DECIMAL type (decimal number type). In an SQL statement 106, the function numeric_string is called with the variable x of the INTEGER type as the argument. Since the data type of the argument is the INTEGER type, the function 103, which takes the INTEGER type as the argument, is applied.
On the other hand, in an SQL statement 107, the function numeric_string is called with the variable y of the DECIMAL type as the argument. In the definition 102 of the functions, no function numeric_string which takes the DECIMAL type as the argument is defined. Therefore, the function 104 is applied, which takes as the argument the FLOAT type, i.e., the data type which has the highest precedence among the data types which are lower in precedence than the DECIMAL type on a data type precedence list 108 of the numeric types.
In this connection, the data type precedence list 108 arranges the data types which can be applied to a numeric data type specified as the argument of the routine invocation, in the order of precedence from the left-hand side to the right-hand side. For example, on the data type precedence list 108, when the data type of the argument of the routine invocation is the DECIMAL type, the routines which have the DECIMAL type and the FLOAT type as the parameters, respectively, can be applied. Of these, since the DECIMAL type is listed first, the routine having the DECIMAL type as the parameter is applied when it is defined. Even if the number of arguments and the name of the routine match those of the invocation, neither a routine which has as the parameter a data type which is absent from the data type precedence list 108 nor a routine which has as the parameter a data type which is higher in precedence than the data type specified as the argument is applied.
As described above, from among the overloaded routines, the routine is applied which has as the parameter the data type most suitable to the data type of the argument of the routine invocation. This determination of the routine to be applied is called the resolution of the overloaded routines.
The precedence of the data types on the data type precedence list 108 corresponds to the criterion which the database uses to carry out the analysis of the query, and in this example, the precedence is determined in the order of the INTEGER type, the DECIMAL type and the FLOAT type. It should be noted, however, that a different order may also be used as the precedence for the data type precedence list 108, as long as a definite precedence is determined among the data types.
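The resolution described for FIG. 17 can be illustrated by the following minimal sketch. Python is used purely for illustration and is not part of the described system; the names NUMERIC_ORDER, precedence_list and resolve are assumptions introduced here, and only the precedence order and the two defined functions are taken from the description.

```python
# Precedence order of the numeric types, as on data type precedence list 108.
NUMERIC_ORDER = ["INTEGER", "DECIMAL", "FLOAT"]

def precedence_list(arg_type):
    """Data types applicable to an argument of arg_type, best match first."""
    return NUMERIC_ORDER[NUMERIC_ORDER.index(arg_type):]

def resolve(arg_type, defined_param_types):
    """Return the parameter type of the routine to apply, or None if none applies."""
    for candidate in precedence_list(arg_type):
        if candidate in defined_param_types:
            return candidate
    return None

# numeric_string is defined for INTEGER (function 103) and FLOAT (function 104).
defined = {"INTEGER", "FLOAT"}
resolved_for_integer = resolve("INTEGER", defined)  # "INTEGER": function 103
resolved_for_decimal = resolve("DECIMAL", defined)  # "FLOAT": function 104
```

A routine whose parameter type ranks higher than the argument type is never returned, because precedence_list starts at the argument type itself.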
Next, the description will hereinbelow be given with respect to the Abstract Data Type (hereinafter, referred to as “an ADT” for short, when applicable) of the next term SQL. The ADT corresponds to the class in the object orientation and is a user-definable data type which has the concept of inheritance. For a certain ADT, a subtype, which corresponds to a subclass thereof, can be defined. The parent data type when viewed from the subtype is called the supertype. A subtype inherits the attributes of its supertype. In addition, for a variable which is defined with an ADT (hereinafter, referred to as “an ADT variable” for short, when applicable), both a value of the data type with which the ADT variable is defined and a value of any subtype thereof can be substituted. Unlike the normal substitution, in the substitution in this case, the value which is substituted is not type-converted into the data type of the variable; instead, the variable takes on the data type of the value which is substituted.
On the other hand, the data type of the supertype cannot be substituted for any of the ADT variables. Therefore, with respect to a variable which is defined with an ADT having a subtype, the actual data type thereof cannot be determined when analyzing the SQL and is not known until the SQL is executed. This applies to the variables which are defined with the ADT as well as to the string which is defined with the ADT.
In this connection, both the next term SQL and the ADT are described, for example, in Andrew E. Wade, Ph. D.: “Object Query Standards”, ACM SIGMOD Record, Vol. 25, No. 1, pp. 87 to 92, March 1996 or the like. In addition, the draft standard of the next term SQL is described, for example, in ISO/IEC JTC 1/SC 21/WG3 DBL-MCI-004, ISO Working Draft Database Language SQL, 1996.
Next, the description will hereinbelow be given, with reference to FIG. 18, with respect to the resolving method in the case where routines which take as the argument the ADT of the next term SQL are overloaded. FIG. 18 is a diagram showing a conventional example of an SQL statement which invokes overloaded functions each of which takes an ADT as the argument. In an example 201 of the SQL statement, a function dollar_amount is called, which receives as the argument a value of the money type defined as an ADT and returns the result obtained by converting the value into the dollar type.
There is shown an example in which a money type as the ADT, and as its subtypes, a yen type, the dollar type and a mark type are defined in an abstract data type inheritance hierarchy 202. While not particularly illustrated in the figure, the ADTs and the inheritance hierarchy thereof are defined on the basis of the data type definition statement such as a CREATE TYPE. Reference numeral 203 designates the definition of functions. In the figure, it is shown that a function 204 adopts as the argument the yen type as the ADT, and a function 205 adopts as the argument the money type as the ADT.
This example will hereinbelow be described concretely. In an SQL statement 206, a variable x is declared as being of the money type as the ADT. As a result, the yen type, the dollar type or the mark type as the subtype can be substituted for the variable x.
In an SQL statement 207, a return value of a function yen is substituted for the variable x. This shows that the yen type is substituted for the variable x. In an SQL statement 208, a function dollar_amount is called with the variable x as the argument. Since the data type of the argument x at this time is the yen type, the function 204 which has the yen type as the parameter is applied.
In an SQL statement 209, a return value of a function dollar( ) is substituted for the variable x. This shows that the dollar type is substituted for the variable x. In an SQL statement 210, the function dollar_amount is called with the variable x as the argument. While the data type of the argument x at this time is the dollar type, no function which takes the dollar type as the argument is defined in the definition 203 of the functions.
Then, the function 205 is applied, which takes as the argument the money type, whose precedence ranks next to the dollar type on the data type precedence list 211 of the dollar type. The reason is that, since the dollar type as the subtype can be substituted for the money type, the function 205 which has the money type as the parameter can also be applied. By the above-mentioned method, in the case where an ADT is specified as the argument, the optimal routine from among the overloaded routines is applied in response to the routine invocation.
A data type precedence list 211 of the dollar type shows the precedence among the data types for which the dollar type can be substituted, i.e., the dollar type as the ADT itself and the supertype thereof. In the next term SQL, in the case of the single inheritance, on the data type precedence list of a certain ADT, the data type of interest itself is given top precedence, its supertype ranks next, the supertype of that supertype ranks next, and so on. In such a way, the precedence is determined. Incidentally, in this example, the description is given with respect to the case of the single inheritance in which one ADT can have only one direct supertype.
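The construction of the data type precedence list of an ADT under single inheritance, and the resolution of the function dollar_amount of FIG. 18, may be sketched as follows. This is an illustrative Python sketch only; the table SUPERTYPE and the function names are assumptions introduced here, while the hierarchy and the two defined functions follow the description.

```python
# Direct supertype of each ADT in inheritance hierarchy 202 (single inheritance).
SUPERTYPE = {"money": None, "yen": "money", "dollar": "money", "mark": "money"}

def adt_precedence_list(adt):
    """The ADT itself first, then its supertype, then that supertype's supertype, ..."""
    chain = []
    while adt is not None:
        chain.append(adt)
        adt = SUPERTYPE[adt]
    return chain

def resolve(runtime_type, routines_by_param):
    """Pick the routine whose parameter ranks highest on the precedence list."""
    for data_type in adt_precedence_list(runtime_type):
        if data_type in routines_by_param:
            return routines_by_param[data_type]
    return None

# dollar_amount is defined for the yen type (function 204) and the money type (function 205).
DOLLAR_AMOUNT = {"yen": 204, "money": 205}

applied_for_yen = resolve("yen", DOLLAR_AMOUNT)        # 204
applied_for_dollar = resolve("dollar", DOLLAR_AMOUNT)  # 205
```

Because the list is built by walking upward from the run-time type, the sketch can only be evaluated once the actual data type of the ADT variable is known, which is the difficulty the description goes on to address.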
The overloaded routines may also take a plurality of arguments. Likewise, the multiple definition is also possible for routines which take a plurality of ADT arguments. Next, the description will hereinbelow be given, with reference to the drawings, with respect to the case where functions which take as the arguments both ADTs and data types other than the ADT are overloaded.
FIG. 19 is a diagram showing a conventional example of calling a function in the case where functions which take as the arguments both ADTs and a data type other than the ADT are overloaded. In the figure, reference numeral 301 designates an example of an SQL statement which is used to call a function sale_info for returning the sale information of real estate. In an SQL statement 302, the variables which are specified as the arguments of the function sale_info are declared. That is, a variable price is declared with the money type as the ADT, a variable size is declared with the INTEGER type (integer type) and a variable property is declared with a house type as the ADT.
The inheritance hierarchies of the ADTs are shown in inheritance hierarchies 303 and 304 of the abstract data type. The inheritance hierarchy 303 shows that a house type is present as the subtype of a real_estate type and that both a lodge type and a villa type are present as the subtypes of the house type. In addition, the inheritance hierarchy 304 shows that a yen type, a dollar type and a mark type are present as the subtypes of the money type.
In an SQL statement 305, the function sale_info is called with the variables price, size and property as the arguments. Since a subtype can be substituted for a variable which is defined with an ADT, four data types, i.e., the money type and, as the subtypes thereof, the yen type, the dollar type and the mark type, can be substituted for the variable price of the money type. Likewise, three data types, i.e., the house type and, as the subtypes thereof, the lodge type and the villa type, can be substituted for the variable property of the house type. Even if a value of another numeric type is substituted for the variable size, the data type of the variable size remains the INTEGER type.
Therefore, in the case of this example, as shown in a table 306, there are 4×3=12 conceivable combinations of the data types of the arguments. For each of the twelve combinations, the function to be applied is determined from among the functions 308 to 312.
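The count of twelve combinations can be reproduced with the following illustrative Python sketch, which enumerates the data types substitutable for each argument of sale_info. The table SUPERTYPE and the helper names are assumptions introduced here; the hierarchies themselves are those described for the inheritance hierarchies 303 and 304.

```python
from itertools import product

# Direct supertype of each ADT in inheritance hierarchies 303 and 304.
SUPERTYPE = {
    "money": None, "yen": "money", "dollar": "money", "mark": "money",
    "real_estate": None, "house": "real_estate", "lodge": "house", "villa": "house",
}

def is_same_or_subtype(adt, ancestor):
    """True if adt is ancestor itself or a direct or indirect subtype of it."""
    while adt is not None:
        if adt == ancestor:
            return True
        adt = SUPERTYPE[adt]
    return False

def substitutable_types(declared):
    """Data types which can be substituted for a variable declared as `declared`."""
    return [t for t in SUPERTYPE if is_same_or_subtype(t, declared)]

price_types = substitutable_types("money")     # money, yen, dollar, mark
property_types = substitutable_types("house")  # house, lodge, villa
# The variable size always remains of the INTEGER type.
combinations = list(product(price_types, ["INTEGER"], property_types))
print(len(combinations))  # 12
```

Each additional ADT argument multiplies the number of combinations by the number of data types substitutable for that argument.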
In the next term SQL, for the resolution of such overloaded routines, the precedence among the arguments is determined in the order from the left-hand side to the right-hand side. In the example shown in FIG. 19, if it is assumed that the data types of the arguments of the function sale_info are, from the left-hand side, the yen type, the INTEGER type and the lodge type, the functions for whose parameters the data types of the respective arguments can be substituted are the functions 308, 311 and 312. If the parameters of those functions are compared with one another from the left-hand side, then for the yen type as the data type of the first argument, the function 308, which likewise takes the yen type as the parameter, is applied.
In such a way, if for each of the combinations of the data types of the ADT arguments shown in the table 306 the optimal function to be applied is determined from among the functions 308 to 312, then the functions which are respectively indicated by arrows will be applied. In this example, any one of the functions 308, 309 and 311 is applied in accordance with the data types which are substituted for the variables price and property as the arguments. On the other hand, whatever data types are substituted for the variables price and property, neither the function 310 nor the function 312 is ever applied.
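The left-to-right resolution over all twelve combinations may be sketched as follows in Python, for illustration only. The text does not list the parameters of the functions 308 to 312, so the parameter types in SALE_INFO are hypothetical values chosen to be consistent with the behavior described above; the helper names are likewise assumptions.

```python
from itertools import product

NUMERIC_ORDER = ["INTEGER", "DECIMAL", "FLOAT"]
SUPERTYPE = {
    "money": None, "yen": "money", "dollar": "money", "mark": "money",
    "real_estate": None, "house": "real_estate", "lodge": "house", "villa": "house",
}

# Hypothetical parameter types of functions 308 to 312 (an assumption, not from FIG. 19).
SALE_INFO = {
    308: ("yen", "INTEGER", "house"),
    309: ("dollar", "INTEGER", "house"),
    310: ("dollar", "INTEGER", "real_estate"),
    311: ("money", "INTEGER", "house"),
    312: ("money", "INTEGER", "real_estate"),
}

def precedence_list(data_type):
    """Data types applicable to an argument of data_type, best match first."""
    if data_type in NUMERIC_ORDER:
        return NUMERIC_ORDER[NUMERIC_ORDER.index(data_type):]
    chain = []
    while data_type is not None:
        chain.append(data_type)
        data_type = SUPERTYPE[data_type]
    return chain

def resolve(arg_types, routines):
    """Apply the routine whose parameters rank best, compared from left to right."""
    ranked = []
    for number, params in routines.items():
        try:
            rank = tuple(precedence_list(a).index(p) for a, p in zip(arg_types, params))
        except ValueError:
            continue  # some parameter is not on the argument's precedence list
        ranked.append((rank, number))
    return min(ranked)[1] if ranked else None

applied = {
    combo: resolve(combo, SALE_INFO)
    for combo in product(["money", "yen", "dollar", "mark"], ["INTEGER"],
                         ["house", "lodge", "villa"])
}
never_applied = set(SALE_INFO) - set(applied.values())  # {310, 312}
```

Comparing the rank tuples lexicographically gives the first argument the highest weight, which corresponds to the left-to-right precedence among the arguments.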
While in this example the precedence runs from the left-hand side to the right-hand side of the arguments in accordance with the method of resolving the overloaded routines of the next term SQL, the overloaded routines in the case of a plurality of arguments can likewise be resolved under any other precedence, as long as a specific precedence is determined.
As to the prior art of overloaded routines, see ISO/IEC DBL:MCI-006.
In a database system, for a query for a database, syntactic analysis (parsing) and semantic analysis are carried out to produce the query analysis results, and the request of the query is processed on the basis of the query analysis results.
In the case where the function of calling a routine having an ADT as an argument is incorporated in the database, since a value of any other data type in the inheritance hierarchy can be substituted for the argument, the analysis of the query for the database alone is not sufficient for the resolution of the multiple definition.
However, if the resolution of the overloaded routines is carried out when executing the query for the database, then whenever one routine is called, the optimal routine needs to be determined by obtaining the definition information of the routines, which is not practical in terms of the processing time.
For example, when overloaded functions are described in the search conditions for the database, if the definition information of the routines is obtained and the multiple definition is resolved for every data item, then it will take a very long processing time to process the several tens of thousands of data items stored in the database.
Then, it is conceivable that the candidates for the overloaded routines are pruned in advance when analyzing the query for the database and, on the basis of this result, the multiple definition is resolved when executing the query for the database.
When pruning the candidates in the analysis of the query for the database, checking all of the combinations of the data types which the ADT arguments may take, as in the example shown in FIG. 19, enables the necessary minimum set of candidates to be determined.
However, in the case where the above-mentioned prior art is employed, even in the example shown in FIG. 19, the routine to be applied needs to be determined for each of the 4×3=12 combinations of the arguments, this number being the product of the numbers of data types which can be substituted for the respective ADT arguments, and hence the processing load becomes large. In addition, if the number of levels in the inheritance hierarchies of the ADTs or the number of ADT arguments is increased, then the number of combinations of the data types which the arguments may take increases explosively. Therefore, even when the number of overloaded routines is small, the processing of determining the routine to be applied needs to be executed as many times as the number of combinations. As a result, it takes a very long time to execute the processing, so that the performance which is practical for a database cannot be realized.
In the light of the foregoing, the present invention was made in order to solve the above-mentioned problems associated with the prior art, and it is therefore an object of the present invention to provide a technology by which, in the processing of pruning the candidates for resolving the overloaded routines when analyzing the query for a database, the candidates can be pruned with a smaller amount of processing and the necessary minimum set of candidates can be selected efficiently.
In order to solve the above-mentioned problems associated with the prior art, in the present invention, the following steps are provided in the processing of analyzing the query for the database, whereby for the invocation of overloaded routines, the candidate routines are pruned efficiently and the routine to be applied is determined.
(1) The routine group sorting step: for the invocation of the routine, the definition information of the routines whose name and number of parameters match the name and the number of arguments of the invocation is obtained, and the group of routines thus obtained is sorted in the order of precedence with the parameters as the key. In this connection, with respect to the precedence of the data types, for the numeric types, the order of the type precedence list is followed, and for the ADTs, the ADT having the larger number of parents in the inheritance hierarchy (hereinafter, referred to as “the hierarchy level number” for short, when applicable), i.e., the ADT having the larger number of supertypes, is given precedence. Incidentally, the functions 308 to 312 in the example shown in FIG. 19 are arranged in this sorting order.
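One possible reading of the routine group sorting step is sketched below in Python, for illustration only. The parameter types given to the functions 308 to 312 are hypothetical, since the text does not list them, and the function names hierarchy_level and sort_key are assumptions introduced here.

```python
NUMERIC_ORDER = ["INTEGER", "DECIMAL", "FLOAT"]
SUPERTYPE = {
    "money": None, "yen": "money", "dollar": "money", "mark": "money",
    "real_estate": None, "house": "real_estate", "lodge": "house", "villa": "house",
}

def hierarchy_level(adt):
    """Hierarchy level number: how many supertypes lie above the ADT."""
    level = 0
    while SUPERTYPE[adt] is not None:
        adt = SUPERTYPE[adt]
        level += 1
    return level

def sort_key(routine):
    """Compare parameters left to right; smaller key values take precedence."""
    _, params = routine
    return tuple(
        NUMERIC_ORDER.index(p) if p in NUMERIC_ORDER else -hierarchy_level(p)
        for p in params
    )

# Hypothetical definitions of sale_info, deliberately listed out of order.
UNSORTED = [
    (312, ("money", "INTEGER", "real_estate")),
    (308, ("yen", "INTEGER", "house")),
    (311, ("money", "INTEGER", "house")),
    (310, ("dollar", "INTEGER", "real_estate")),
    (309, ("dollar", "INTEGER", "house")),
]

sorted_routines = sorted(UNSORTED, key=sort_key)
sorted_numbers = [number for number, _ in sorted_routines]  # [308, 309, 310, 311, 312]
```

Routines with equal keys, such as the hypothetical 308 and 309, keep their relative input order because Python's sort is stable; the description does not state how such ties are ordered.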
(2) The reference utilization pruning step: in the sorted routine group which has been obtained in the above-mentioned routine group sorting step, the routine which is applied not to the actual data types of the ADT arguments but to the data types with which the arguments are defined (hereinafter, referred to as “the reference routine” for short, when applicable) is searched for from the head of the sorted routine group, and any routine which has lower precedence than the reference routine in the sorting order and which is not applied whatever data type is substituted for any of the ADT arguments is deleted from the candidates.
In the example shown in FIG. 19, on the basis of the step (2), the function 312 is deleted from the candidates. However, functions such as the function 310, each of which precedes the reference routine in the sorting order and is not actually applied, still remain as candidates.
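A simplified reading of the step (2) is sketched below in Python, for illustration only. It assumes that every routine ranked after the reference routine can be deleted, on the ground that the reference routine accepts every data type substitutable for the declared arguments; the parameter types of the functions 308 to 312 and all function names are hypothetical.

```python
NUMERIC_ORDER = ["INTEGER", "DECIMAL", "FLOAT"]
SUPERTYPE = {
    "money": None, "yen": "money", "dollar": "money", "mark": "money",
    "real_estate": None, "house": "real_estate", "lodge": "house", "villa": "house",
}

def precedence_list(data_type):
    """Data types applicable to an argument of data_type, best match first."""
    if data_type in NUMERIC_ORDER:
        return NUMERIC_ORDER[NUMERIC_ORDER.index(data_type):]
    chain = []
    while data_type is not None:
        chain.append(data_type)
        data_type = SUPERTYPE[data_type]
    return chain

def applicable(params, arg_types):
    """True if every parameter is on the precedence list of its argument."""
    return all(p in precedence_list(a) for p, a in zip(params, arg_types))

def prune_by_reference(sorted_routines, declared_types):
    """Keep routines up to and including the first one applicable to the declared types."""
    for position, (_, params) in enumerate(sorted_routines):
        if applicable(params, declared_types):
            return sorted_routines[:position + 1]
    return list(sorted_routines)  # no reference routine found: nothing is pruned

# Hypothetical sorted routine group for sale_info (an assumption).
SORTED_ROUTINES = [
    (308, ("yen", "INTEGER", "house")),
    (309, ("dollar", "INTEGER", "house")),
    (310, ("dollar", "INTEGER", "real_estate")),
    (311, ("money", "INTEGER", "house")),
    (312, ("money", "INTEGER", "real_estate")),
]

# price, size and property are declared as money, INTEGER and house.
remaining = prune_by_reference(SORTED_ROUTINES, ("money", "INTEGER", "house"))
remaining_numbers = [number for number, _ in remaining]  # [308, 309, 310, 311]
```

Under these assumptions the function 311 is the reference routine, the function 312 is deleted, and the function 310 remains, as stated for the example.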
(3) The sort order characteristic pruning step: in the sorted routine group which remains after the pruning processing in the above-mentioned reference utilization pruning step, let arbitrary two routines be the routine A and the routine B in the order of the sorting precedence. If all of the data types each of which can be substituted for the parameters of the routine B can also be substituted for the parameters of the routine A, such a routine B is deleted.
The step (3) utilizes the characteristic that, if a plurality of routines which can be invoked for the data type of a certain argument are present in the sorted routine group, then the routine which is given the top precedence in the sorting order is applied.
In the example shown in FIG. 19, since the data type which can be substituted for the parameter of the function 310 can also be similarly substituted for the parameter of the function 309, the function 310 is deleted from the candidates because there is no possibility of invoking the function 310.
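The step (3) may be sketched as follows in Python, for illustration only. The sketch assumes that the comparison between the routine A and the routine B is made parameter by parameter and is restricted to the data types which each declared argument can actually hold; the parameter types of the functions and all function names are hypothetical.

```python
NUMERIC_ORDER = ["INTEGER", "DECIMAL", "FLOAT"]
SUPERTYPE = {
    "money": None, "yen": "money", "dollar": "money", "mark": "money",
    "real_estate": None, "house": "real_estate", "lodge": "house", "villa": "house",
}

def precedence_list(data_type):
    """Data types applicable to an argument of data_type, best match first."""
    if data_type in NUMERIC_ORDER:
        return NUMERIC_ORDER[NUMERIC_ORDER.index(data_type):]
    chain = []
    while data_type is not None:
        chain.append(data_type)
        data_type = SUPERTYPE[data_type]
    return chain

def possible_types(declared):
    """Data types an argument declared as `declared` may hold at execution."""
    if declared in NUMERIC_ORDER:
        return [declared]
    return [t for t in SUPERTYPE if declared in precedence_list(t)]

def accepted(param, declared):
    """Possible argument types which can be substituted for the parameter."""
    return {t for t in possible_types(declared) if param in precedence_list(t)}

def covers(a_params, b_params, declared_types):
    """True if routine A accepts, per parameter, everything routine B accepts."""
    return all(accepted(b, d) <= accepted(a, d)
               for a, b, d in zip(a_params, b_params, declared_types))

def prune_by_sort_order(sorted_routines, declared_types):
    kept = []
    for number, params in sorted_routines:
        if not any(covers(kept_params, params, declared_types) for _, kept_params in kept):
            kept.append((number, params))
    return kept

# Hypothetical candidates remaining after the step (2) (an assumption).
CANDIDATES = [
    (308, ("yen", "INTEGER", "house")),
    (309, ("dollar", "INTEGER", "house")),
    (310, ("dollar", "INTEGER", "real_estate")),
    (311, ("money", "INTEGER", "house")),
]

kept = prune_by_sort_order(CANDIDATES, ("money", "INTEGER", "house"))
kept_numbers = [number for number, _ in kept]  # [308, 309, 311]
```

The check visits each argument separately, so its cost grows with the sum, not the product, of the numbers of substitutable data types.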
(4) The application routine determining step: the sorted candidate routine group which has been obtained in the above-mentioned sort order characteristic pruning step is searched from the head for a routine which can be applied to the data types of the actual arguments, and the routine which is found first is applied.
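The step (4) amounts to a first-match search over the pruned, sorted candidates, which may be sketched as follows in Python, for illustration only. The candidate list and its parameter types are hypothetical, and the function names are assumptions introduced here.

```python
NUMERIC_ORDER = ["INTEGER", "DECIMAL", "FLOAT"]
SUPERTYPE = {
    "money": None, "yen": "money", "dollar": "money", "mark": "money",
    "real_estate": None, "house": "real_estate", "lodge": "house", "villa": "house",
}

def precedence_list(data_type):
    """Data types applicable to an argument of data_type, best match first."""
    if data_type in NUMERIC_ORDER:
        return NUMERIC_ORDER[NUMERIC_ORDER.index(data_type):]
    chain = []
    while data_type is not None:
        chain.append(data_type)
        data_type = SUPERTYPE[data_type]
    return chain

def determine_routine(candidates, actual_types):
    """Return the first candidate applicable to the actual argument types."""
    for number, params in candidates:
        if all(p in precedence_list(a) for p, a in zip(params, actual_types)):
            return number
    return None

# Hypothetical candidates remaining after the steps (2) and (3) (an assumption).
CANDIDATES = [
    (308, ("yen", "INTEGER", "house")),
    (309, ("dollar", "INTEGER", "house")),
    (311, ("money", "INTEGER", "house")),
]

for_yen_lodge = determine_routine(CANDIDATES, ("yen", "INTEGER", "lodge"))      # 308
for_dollar_villa = determine_routine(CANDIDATES, ("dollar", "INTEGER", "villa"))  # 309
for_mark_house = determine_routine(CANDIDATES, ("mark", "INTEGER", "house"))    # 311
```

Since the candidates are already in precedence order, no ranking is needed at this point; the search only tests applicability against a short list.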
As described above, according to a database system of the present invention, when analyzing the query for a database including the invocation of the overloaded routines, the routines which have the possibility of being applied are determined on the basis of the comparison of parameters among the overloaded routines. Therefore, in the processing of pruning the candidates for the resolution of the overloaded routines which is carried out when analyzing the query for the database, the candidates can be pruned with a smaller amount of processing and the necessary minimum set of candidates can be selected efficiently.