The present invention relates to database management techniques, and more particularly to a database management system which is applicable to a parallel database management system having a function of executing a program module incorporated therein by a user.
The present invention utilizes the following three known techniques related to a database management system (hereinafter abbreviated as "DBMS") for managing a database (hereinafter abbreviated as "DB"):
(1) Parallel DB Processing;
(2) SQL3; and
(3) Object Relational DBMS.
In the following, these three known techniques will be briefly described.
(1) Parallel DB Processing
This is a method of processing user queries involving a large amount of data by executing, in parallel on a plurality of processors, the database processing that satisfies a user's request. An example of this method is described in JP-A-8-137910 (Reference 1). In the method of Reference 1, a processor receives a user's query, and a DBMS controls the execution of a plurality of engine processors (execution servers) such that the load is optimally distributed among the engine processors.
(2) SQL3
SQL3 is a draft database language specification that the International Organization for Standardization (ISO) is currently working to standardize. For example, according to "Information technology - Database languages - SQL - Part 2: SQL/Foundation" ISO/IEC JTC1/SC21 N10489 (Reference 2), SQL3 permits a description as follows:
This description provides definition statements for an abstract data type (hereinafter abbreviated as "ADT"). ① in the definition statements indicates that ADT sgmltext_t is composed of a component of BLOB (Binary Large Object) type referenced by a name "text."
Also, ② in the definition statements indicates that an ADT function extract( ) can be applied to data having an ADT sgmltext_t type.
Further, ③ in the definition statements indicates that the ADT function extract( ) is related to an external function labelled p_sgml_extract described in C language.
The user can define his or her own data types using the ADT as described above, thereby realizing functions corresponding to data access, inheritance and so on provided by methods in a general object-oriented programming language.
(3) Object Relational DBMS
It is said that a conventional relational DBMS (hereinafter abbreviated as "RDBMS") based on a relational data model is not suitable for handling data having a complicated structure, such as multimedia data, because it cannot represent such data closely and also suffers from performance problems. For this reason, an object relational DBMS (hereinafter abbreviated as "ORDBMS"), which introduces an object-oriented concept into RDBMS, has been proposed as described in "Object Relational DBMSs" written by Michael Stonebraker, translated by Yoshinobu Ohta, and published by International Thomson Publishing Japan, August 1996 (Reference 3). Reference 3 mentions as a basic requirement of ORDBMS that ORDBMS should be capable of handling complicated objects. Reference 3 also mentions that the ORDBMS should be able to use the following user defined types and user defined functions:
This description provides definition statements for a user defined complex type phone_t. The definition statements indicate that the complex type phone_t is composed of three components: a variable character string type element of three bytes or less referenced by a name "area" (④ in the definition statements); a variable character string type element of seven bytes or less referenced by a name "number" (⑤ in the definition statements); and a variable character string type element of 20 bytes or less referenced by a name "description" (⑥ in the definition statements).
An example of definition statements for a user defined function is shown in the following:
create function Northness_equal (point, point) returns Boolean
with selfunc=selectivity_comp
external name '/usr/Northness_equal'
language C;  ⑦
This description provides definition statements for a user defined function Northness_equal( ). ⑦ in the definition statements indicates that the user defined function Northness_equal( ) is associated with an external function labelled /usr/Northness_equal described in C language. As to an external function, Reference 3 describes that a good ORDBMS should be able to link a user defined function dynamically, so that the function does not needlessly consume the address space of the DBMS until it is required. Such a user defined type and user defined function can be used in correspondence to the ADT and the ADT function described by SQL3, respectively.
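The dynamic-linking requirement attributed to Reference 3 can be illustrated with a minimal Python sketch. It is not the mechanism of any particular ORDBMS: the class LazyFunctionRegistry, its method names, and the use of Python's importlib in place of a loader for C object code are all assumptions made for illustration, and math.isclose merely stands in for an external comparison function such as Northness_equal( ).

```python
import importlib


class LazyFunctionRegistry:
    """Hypothetical registry that links a user defined function on first call."""

    def __init__(self):
        self._specs = {}    # function name -> (module name, attribute name)
        self._loaded = {}   # function name -> callable, filled on first use

    def register(self, name, module_name, attr_name):
        # Registration records only where the code lives; nothing is loaded yet.
        self._specs[name] = (module_name, attr_name)

    def is_loaded(self, name):
        return name in self._loaded

    def call(self, name, *args):
        if name not in self._loaded:
            # The function is linked only when it is first required.
            module_name, attr_name = self._specs[name]
            module = importlib.import_module(module_name)
            self._loaded[name] = getattr(module, attr_name)
        return self._loaded[name](*args)


registry = LazyFunctionRegistry()
# math.isclose is a stand-in for an external function (assumption).
registry.register("northness_equal", "math", "isclose")
```

Until call( ) is invoked, the registry holds only the name of the external code, which mirrors the idea of not occupying DBMS address space before the function is needed.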
The present inventors have found the following problems as a result of investigating DB systems utilizing the known techniques described above.
First, a conceptual diagram representing an exemplary configuration of a conventional DB system is illustrated in FIG. 1. The illustrated DB system 100 is a system for managing documents described in SGML (Standard Generalized Markup Language). A DBMS 120 for managing the DB system 100 comprises a request reception server 130 for receiving a query 104 from a user; a plurality of execution servers 140-1 to 140-n for executing database processing in accordance with instructions from the request reception server 130; and a single dictionary server 160 for managing definition information of the system 100. The DBMS 120 is adapted to control general parallel DB processing. These servers are interconnected through a communication path 180.
Assume that a definition for management of SGML document, subjected to DB processing by the DBMS 120, is described by SQL3 in the following manner:
The user of this DB system 100 will issue a desired query for data in a DB described in SGML (hereinafter abbreviated as "SGML text"), using the ADT sgmltext_t type.
⑧ in the description statements indicates that the ADT sgmltext_t type has text of BLOB type as a component.
⑪ in the description statements represents the structure of data corresponding to a report in the user's DB model using a table reports. More specifically, in correspondence to the "report" comprising "published date" and "reported contents" as its components, the table reports is defined to comprise a DATE type column published_date and an ADT sgmltext_t type column contents.
For processing a large amount of SGML documents in parallel, a record 152-1 in the table reports and an SGML text 154-1 are held in storage devices 150-1 to 150-n respectively accessed by the execution servers 140-1 to 140-n. For rapidly searching for a "report" with a condition defined by "published date," a column published_date of the table reports is indexed using a general indexing function provided by the execution servers.
⑨ to ⑩ in the description statements define an ADT function extract( ), which is a function for extracting text data delimited by tags (156, 158 in FIG. 1) from the SGML text 154-1, and requires the following two input parameters:
(1) Original SGML text from which text data is extracted; and
(2) a tag name for specifying a portion to be extracted.
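The two-parameter interface of extract( ) can be sketched in Python as follows. This is a simplified illustration only: it assumes tags that are not nested and carry no attributes, and it ignores the extraction parameters that the system described here uses to structure the result. The sample text is invented for the example.

```python
import re


def extract(sgml_text, tag_name):
    """Return the text between <tag_name> and </tag_name>, or None if absent.

    Assumption: tags are not nested and carry no attributes.
    """
    tag = re.escape(tag_name)
    match = re.search(rf"<{tag}>(.*?)</{tag}>", sgml_text, re.DOTALL | re.IGNORECASE)
    return match.group(1).strip() if match else None


# Invented sample SGML text with an "abstract" element.
sample = "<report><abstract>Sales rose in Q3.</abstract><body>Details follow.</body></report>"
```

Calling extract(sample, "abstract") returns the text of the abstract element, while a tag name that does not occur in the text yields None.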
⑩ in the description statements specifies an external function p_sgml_extract( ), which is defined as the function that realizes the ADT function extract( ). An object code 144-1 for realizing the external function p_sgml_extract( ) is included in a plug-in program module (hereinafter a "plug-in module") 142-1. The plug-in module 142-1 is a program module incorporated in the execution server for realizing an SGML document data management function of the DB system 100.
In this example, control information based on document structure information on SGML documents is used for performing partial extraction of the SGML text 154-1 delimited by specified tags 156, 158. This control information includes structural information for structuring partially extracted data as an SGML document, and is indispensable for creating an extraction result. The control information for the partial extraction processing is called "extraction parameters." The extraction parameters are based on the SGML document structure, and are commonly utilized for SGML texts having the same SGML document structure. In this DB system 100, the extraction parameters are collectively managed in the system by the dictionary server 160.
The dictionary server 160 holds the extraction parameters 172 in an associated storage device 170. The structure of SGML documents in the DB is permanently represented by a column for holding an SGML text such that the format or document structure of the "reported contents" in the "reports" is fixed. Accordingly, the extraction parameters are also permanently represented by a column for holding an SGML text to be processed. Thus, the dictionary server 160 manages the extraction parameters 172 on the basis of table names and column names so that each of the execution servers 140-1 to 140-n can acquire the extraction parameters 172.
With the configuration described above, the partial extraction processing is executed for the SGML text in accordance with the following procedure.
(1) Based on the table name and the column name of a column in a table which holds a target SGML text to be handled, an access to the dictionary server 160 to acquire extraction parameters is carried out on an execution server.
(2) The partial extraction processing utilizing the extraction parameters acquired in step (1) is carried out on an execution server. The execution of steps (1), (2) in this procedure is controlled by a plug-in module 144-1.
Next, description will be made on a search operation on the DB system 100 including the partial extraction processing of a SGML text.
For example, a search request from the user requesting to "extract abstracts of reports, the published date of which is later than Oct. 15, 1996" may be described by SQL3 in the following manner.
SELECT extract(contents, 'abstract')
FROM reports
WHERE published_date > 'Oct. 15, 1996'
Database processing appropriate to this search request is executed in the following procedure:
(1) A set of records in reports satisfying the condition defined by the WHERE clause is acquired using an index defined on the column published_date in the table reports.
(2) Based on the set of records acquired in step (1), SGML texts are sequentially retrieved from the contents of records in reports. Then, an external function p_sgml_extract( ) for realizing the ADT function extract( ) is called to extract abstracts.
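The two-step search procedure above can be sketched as follows for a single execution server. The table contents, the sorted list that stands in for the index on published_date, and the simplified body of p_sgml_extract( ) are all assumptions made for illustration; the sketch does not model parallel execution or the extraction parameters.

```python
import bisect
from datetime import date

# Hypothetical contents of the table "reports": (published_date, contents).
reports = [
    (date(1996, 9, 30), "<abstract>Autumn plan</abstract>"),
    (date(1996, 10, 20), "<abstract>Index tuning</abstract>"),
    (date(1996, 11, 5), "<abstract>Parallel scan</abstract>"),
]

# Stand-in for the index on published_date: sorted (date, row position) pairs.
date_index = sorted((row[0], pos) for pos, row in enumerate(reports))
index_keys = [key for key, _ in date_index]


def p_sgml_extract(sgml_text, tag_name):
    # Simplified stand-in for the external function; assumes the tag is present.
    start = sgml_text.index(f"<{tag_name}>") + len(tag_name) + 2
    end = sgml_text.index(f"</{tag_name}>")
    return sgml_text[start:end]


def search_abstracts(after):
    # Step (1): locate the qualifying records through the index.
    first = bisect.bisect_right(index_keys, after)
    positions = [pos for _, pos in date_index[first:]]
    # Step (2): retrieve each SGML text and call the external function.
    return [p_sgml_extract(reports[pos][1], "abstract") for pos in positions]
```

Here search_abstracts(date(1996, 10, 15)) plays the role of the query shown earlier, returning the abstracts of the reports published after that date.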
In this procedure, the processing at step (2) for sequentially retrieving SGML texts to extract the abstracts is executed by each of the execution servers in consideration of efficient utilization of the parallel processing function for faster processing, and a reduction in the amount of data transferred to the request reception server for making up a search result.
Each execution server calls the external function p_sgml_extract( ) for partially extracting abstracts, and passes the execution control to the plug-in module 144-1. The plug-in module 144-1, to which the execution control has been passed, accesses the dictionary server 160 to acquire the extraction parameters 172, and executes the extraction processing utilizing the extraction parameters 172.
The system illustrated in FIG. 1, however, implies the following problems.
In the database processing for the foregoing search, all of plug-in modules on a plurality of execution servers 140-1, 140-2, . . . , 140-n, running in parallel, make an access to the single dictionary server 160, so that the processing for retrieving the extraction parameters 172 is intensively executed in the dictionary server 160.
In the conventional processing scheme illustrated in FIG. 1, the parallel processing intended to distribute the load has an adverse effect on access to the dictionary server 160. Specifically, the larger the number of execution servers, the larger the load on the dictionary server 160, and consequently the search performance of the entire system is degraded by the limited performance of the dictionary server 160.
Also, because the records satisfying the condition defined by the WHERE clause are processed sequentially, as in the aforementioned query, the external function is called, and the dictionary server 160 accessed, for each such record. The dictionary server 160 is therefore burdened with more accesses than the actual number of execution servers.
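The load pattern described above can be made concrete with a small simulation. The DictionaryServer class, the "root" entry used as an extraction parameter, and the partition contents are hypothetical; the sketch only counts how often the dictionary is consulted when every call of the external function fetches the parameters itself.

```python
class DictionaryServer:
    """Hypothetical stand-in for the dictionary server 160; counts lookups."""

    def __init__(self, parameters):
        self._parameters = parameters
        self.access_count = 0

    def get_extraction_parameters(self, table, column):
        self.access_count += 1
        return self._parameters[(table, column)]


def conventional_extract(dictionary, sgml_text, tag_name):
    # The plug-in fetches the extraction parameters on every call.
    params = dictionary.get_extraction_parameters("reports", "contents")
    start = sgml_text.index(f"<{tag_name}>") + len(tag_name) + 2
    fragment = sgml_text[start:sgml_text.index(f"</{tag_name}>")]
    root = params["root"]  # assumed structural parameter used to wrap the result
    return f"<{root}>{fragment}</{root}>"


def run_conventional(dictionary, partitions, tag_name):
    # Each inner list holds the qualifying SGML texts of one execution server.
    return [[conventional_extract(dictionary, text, tag_name) for text in part]
            for part in partitions]


dictionary = DictionaryServer({("reports", "contents"): {"root": "doc"}})
partitions = [
    ["<abstract>A</abstract>", "<abstract>B</abstract>"],
    ["<abstract>C</abstract>"],
    ["<abstract>D</abstract>", "<abstract>E</abstract>", "<abstract>F</abstract>"],
]
results = run_conventional(dictionary, partitions, "abstract")
```

With three simulated execution servers holding six qualifying records in total, the dictionary is consulted six times, that is, once per record rather than once per server.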
To solve the problem mentioned above, the following approach was considered.
The scheme illustrated in FIG. 1 causes the problem because plug-in modules are executed on a plurality of execution servers 140-1, 140-2, . . . , 140-n so that the single dictionary server 160 is intensively accessed by these plug-in modules.
The plug-in modules on the plurality of execution servers individually access the dictionary server 160 because they need to acquire the extraction parameters 172 required for the extraction processing. However, the extraction parameters 172 required during the processing for the query are the same on every execution server. In addition, the extraction parameters 172 need not be acquired by directly accessing the dictionary server 160 from the execution environments on the respective execution servers. Therefore, if all the execution servers are allowed by some means to reference the extraction parameters 172 acquired from the dictionary server 160, the execution servers can individually execute the extraction processing without accessing the dictionary server 160.
To realize the concept mentioned above, the present inventors have devised a method processed by a procedure as illustrated in FIG. 2. This procedure will be described below.
(STEP 1) A request reception server 230 acquires extraction parameters 272 from a dictionary server 260. An external function 234 for acquiring the extraction parameters 272 from the dictionary server 260 is provided by a plug-in program module 232, and the request reception server 230 calls the external function 234.
(STEP 2) The request reception server 230 transmits the extraction parameters 272 together with an execution instruction to respective execution servers 240-1, 240-2, . . . , 240-n.
(STEP 3) An external function 244-1 for executing extraction processing on each execution server (e.g., 240-1) executes the extraction processing with reference to the extraction parameters 272 transmitted thereto from the request reception server 230. The external function for executing the extraction processing using the extraction parameters 272 as input parameters is provided by a plug-in module 242-1. The execution server 240-1 passes the extraction parameters 272 transmitted from the request reception server 230 as input parameters for the external function 244-1, when it calls the external function 244-1.
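STEP 1 to STEP 3 can be sketched as follows. This is an illustrative model rather than the system of FIG. 2: the class and function names, the "root" extraction parameter, and the use of a thread pool to stand in for the parallel execution servers are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


class DictionaryServer:
    """Hypothetical stand-in for the dictionary server 260; counts lookups."""

    def __init__(self, parameters):
        self._parameters = parameters
        self.access_count = 0

    def get_extraction_parameters(self, table, column):
        self.access_count += 1
        return self._parameters[(table, column)]


def extract_with_parameters(sgml_text, tag_name, params):
    # STEP 3: the parameters arrive as an input; no dictionary access here.
    start = sgml_text.index(f"<{tag_name}>") + len(tag_name) + 2
    fragment = sgml_text[start:sgml_text.index(f"</{tag_name}>")]
    root = params["root"]  # assumed structural parameter used to wrap the result
    return f"<{root}>{fragment}</{root}>"


def execution_server(texts, tag_name, params):
    return [extract_with_parameters(text, tag_name, params) for text in texts]


def request_reception_server(dictionary, partitions, tag_name):
    # STEP 1: acquire the extraction parameters once.
    params = dictionary.get_extraction_parameters("reports", "contents")
    # STEP 2: send them with the execution instruction to every execution server.
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        futures = [pool.submit(execution_server, part, tag_name, params)
                   for part in partitions]
        return [future.result() for future in futures]
```

In this model the dictionary is consulted exactly once per query, regardless of how many execution servers or records take part.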
However, the aforementioned three known techniques cannot control the execution of the plug-in modules 232, 242-1 in accordance with the procedure described above, if they are used without any modifications.
It is further desirable that the user can specify the control for the execution of plug-in modules as mentioned above. Thus, the inventors directed their attention to a method of utilizing an interface definition language (IDL) which is described in "The Common Object Request Broker: Architecture and Specification" OMG Document Number 91.12.1, Revision 1.1 (Reference 4) as a prior art technique related to the specification of a function definition.
This method defines an interface between modules with the IDL in a software architecture called "CORBA." The interface is associated with a programming language such as C language or the like, and a connecting module called a "stub" is produced. Flexible inter-module communication is enabled through this stub module. However, the specifications of the IDL described in Reference 4 do not permit the user to directly specify the execution control of external functions as mentioned above. The inventors therefore modified the specifications of the IDL to permit the user to directly specify this execution control.
It is therefore a first object of the present invention to provide a database management method and a parallel database management system which are capable of eliminating the problems described above to improve the system performance.
It is a second object of the present invention to provide a database management method which permits the user to directly specify to perform an execution control for plug-in modules.
To achieve the above objects, the present invention provides a parallel database management system including a request reception server for receiving a request from a user, and a plurality of execution servers for parallelly executing database processing appropriate to the request from the user in accordance with instructions of the request reception server, wherein the request reception server and the execution servers have a function of executing a plug-in module incorporated in a database system by the user. The parallel database management system may comprise:
means for causing the request reception server to recognize that information to be passed to the plug-in modules as an input, when the plug-in modules are executed on the execution servers, is acquired by executing another plug-in module on the request reception server, when the request reception server creates an execution procedure code for instructing an execution procedure for the database processing appropriate to the request from the user;
means for executing the plug-in module on the request reception server in accordance with the recognition to acquire the information to be passed as an input; and
means for causing the request reception server to edit the execution procedure code so as to pass the acquired information as an input when the plug-in modules are executed on the execution servers.
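The three means listed above (recognition, execution on the request reception server, and editing of the execution procedure code) can be sketched in Python. The PluginCall structure, its provider field, and the function name p_sgml_load_params are hypothetical; the text does not specify how an execution procedure code is represented.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PluginCall:
    """Hypothetical step of an execution procedure code."""
    function: str
    args: tuple
    # Name of a plug-in function to run on the request reception server whose
    # result must be passed as an extra input; None when no such input exists.
    provider: Optional[str] = None


def prepare_plan(plan, providers):
    edited = []
    for step in plan:
        if step.provider is not None:            # recognition
            value = providers[step.provider]()   # execution on the request reception server
            # Editing: bind the acquired value as an additional input.
            step = replace(step, args=step.args + (value,), provider=None)
        edited.append(step)
    return edited


def p_sgml_load_params():
    # Hypothetical plug-in function that would query the dictionary server.
    return {"root": "doc"}


plan = [PluginCall("p_sgml_extract", ("contents", "abstract"),
                   provider="p_sgml_load_params")]
edited_plan = prepare_plan(plan, {"p_sgml_load_params": p_sgml_load_params})
```

After prepare_plan( ) runs, the step sent to the execution servers already carries the acquired information as its last argument, so no execution server needs to obtain it separately.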
The present invention also provides a database management system which has a function of creating control information for the database management system to call a function of a plug-in module based on a description of an interface specification for the plug-in module, wherein the description of the interface specification for the plug-in module includes:
an instruction for adding an input, which is not included in a combination of inputs defined in an interface for calling a function of a plug-in module in response to a request from the user, to a combination of inputs defined in an interface for the database management system to call a function of a plug-in module; and
an instruction for acquiring information to be passed to the plug-in modules as an input when the plug-in modules are executed on the execution servers, by executing another plug-in module on the request reception server, and
the database management system may comprise means for controlling the execution of the database including the execution of the plug-in module in accordance with the instructions included in the description.