A continuing demand exists to join multiple databases so as to enable transparent access to data stored therein. "Transparent" implies that an application program at a specific site is able to access data from all connected databases, without being aware of the origin of the data, nor of any incompatibilities which might exist between a local database and any of the plurality of coupled remote databases. It is to be understood that the terms "remote" and "local", as used herein, not only refer to physical locations, but also to databases that are located at a single site (e.g., on one or more computers), but are controlled by different operating systems or database protocols.
In order to provide a transparent interface for heterogeneous databases, the prior art has employed one database as an interface and has enabled that interface, under control of a database management system (DBMS), to access data from other databases in accordance with data entries contained in an interface table. Upon receiving a query, the receiving DBMS performs a query optimization procedure to decide upon an efficient method for accessing the requested data.
During such a query optimization action, various types of join methods are often considered. A join method is used when rows from an "outer" table are concatenated to rows of one or more other tables (i.e., "inner" tables), in accord with a determined criteria. As used herein, the term "table" includes any tabular listing of data. An outer table is one from which a search name is retrieved and an inner table is one from which data is retrieved, based on the search name retrieved from the outer table.
A merge-join is one of a plurality of join methods used in responding to a database query, wherein data from an outer table is joined with data from an inner table. In SQL (i.e., a commonly used query language) a merge-join operation is specified by a predicate that is similar to: EQU table1.column1.vertline.=table2.column2.
To perform a merge-join operation, both the inner table and the outer table are ordered, either by sorting or by accessing the required data via an index access. The outer table involved in the merge-join operation is then scanned just once and the inner table is scanned once unless there are repeated identical values in the outer table. If there are repeated identical values in the outer table, a group of rows in the inner table may be scanned again.
The following is an example of a merge-join operation, as performed in the prior art. Assume that column A in tables T1 and T2 has the following values:
______________________________________ Outer Table Inner Table T1:column A T2:column A ______________________________________ 2 1 3 2 3 2 3 3 ______________________________________
The steps for doing a merge join between T1 and T2 are:
Read the first row from T1; the value for A is "2"; PA1 Scan T2 until a match is found, and then join the two rows that match; PA1 Keep scanning T2 while the columns match, joining matching rows; PA1 When the "3" in T2 is read, go back to T1 and read the next row; PA1 The next value in T1 is "3" which matches T2, so join the rows; PA1 Keep scanning T2 while the columns match, join matching rows; PA1 The end of T2 is reached; PA1 Go back to T1 to get the next row--note that the next value in T1 is the same as the previous value from T1, so T2 is scanned again starting at the first "3" in T2 (the database manager remembers this position).
When a merge-join operation is performed between tables residing in a remote database and a local database, respectively, either (i) the table from the remote database is imported in its entirety to the local database, where the merge-join operation is performed; or (ii) the table from the local database is exported to the remote database, where the merge-join operation is performed. In both cases, a complete table is communicated between databases, increasing the communications cost of the merge-join operation.
Accordingly, it is an object of this invention to provide an improved method and apparatus for performing a merge-join operation, when a remote database includes a table with values to be joined to a table in a local database.
It is another object of this invention to provide an improved method and apparatus for performing a merge-join operation, wherein reduced communication costs are achieved.