The present invention relates to database operations, and in particular to equi-join operations among split tables.
Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
A common database operation in a relational database is the join operation. Generally, a join of two data sources creates an association of objects in one data source with objects that share a common attribute in another data source. A typical data structure for data sources is a data table (or simply, table) comprising rows and columns Each row (or record) of the data table represents an object. Each column represents attributes of the object. For example, a data table may be defined for inventory in a retail store. The inventory items (e.g., pants, shirts, toasters, lamps, etc.) may constitute the objects represented by the data table. The attributes of each item may include such information as the name of the item, the number of items at that store, the location of the item in the store, and so on. Instances of an attribute are referred to as “attribute values”, “actual values”, or simply “values.” An example of such a data table is shown in FIG. 14A, where each row 1402 represents a store item. Each row 1402 comprises attributes of the item columns 1404a-1404c. Each row 1402 may include an ID attribute 106 that identifies the row. For example, the ordinal position of a row 1402 in the data table may be used as the ID attribute.
FIG. 14B shows an example of another data table called Mail-Order. A join operation between the Inventory and Mail Order data tables can be performed. For example, consider a so-called “equi join” type of join operation where the join condition (join predicate) specifies a relationship (e.g., equality) between attributes that are common to both data tables. Suppose the join condition is: items in the Inventory data table that are the same as the items in the Mail-Order data table. For example, the join expression might be formulated as “Table Inventory inner join Table MailOrder on Inventory.Item=Mail-Order.Item”.
An execution plan (query plan) for performing the join operation may include the following steps:                1. read out a row from the Inventory table        2. compare the actual value of the Item attribute in the row that was read out from the Inventory table with the actual value of the Item attribute in a row of the Mail-Order table        3. if there is a match, then output the row that was read out from the Inventory table and the matching row in the Mail-Order table        4. repeat steps 2 and 3 for each row in the Mail-Order table        5. repeat steps 1-4 for each row in the Inventory tableA result of the join operation can be represented by the data table shown in FIG. 14C.        
A database may comprise data tables that contain thousands of records each. In addition, records may have tens to hundreds of attributes each, and the actual values of some attributes may be lengthy (e.g., an attribute that represents the name of a person may require an allocation of 10-20 characters of storage space). Such databases can impose heavy requirements in the storage of their data. Accordingly, a practice of using dictionaries has arisen, where the actual values (e.g., 10-20 characters in length) of instances of an attribute in the data table are replaced by (or otherwise mapped to) an associated “value ID” (e.g., two or three bytes in length).
Consider the Inventory table and the Mail-Order table, for example. The actual values for instances of the Item attribute in the Inventory table include “pants”, “shirts”, “toasters”, and “lamps”. A dictionary can be defined for the Item attribute. For example, the dictionary may store the actual values of the Item attribute in alphabetical order and the value IDs that are associated with the actual values might be the ordinal position of the actual values in the dictionary.
An actual value in the data table is represented only once in the dictionary. For example, the actual value “lamps” occurs in twice in the Mail-Order table, but there is only one entry in the dictionary; thus, the dictionary might look like:                lamps        pants        shirts        toastersThe value ID associated with the actual value “lamps” could be 1, being located in the first position in the dictionary. The value ID associated with the actual value “pants” could be 2, being the second position in the dictionary, and so on.        
FIG. 15 shows the Inventory and Mail-Order tables of FIGS. 14A and 14B, modified by the use of a dictionary, more specifically a central dictionary. In particular, the actual values for instances of the Item attribute in the data tables (i.e., text) have been replaced by their corresponding associated value IDs (i.e., an integer). It can be appreciated that the use of dictionaries can reduce the storage burden of large databases.
The distribution of databases across separate database servers is commonly employed, for example, to distribute the storage burden across multiple sites. In a distributed database configuration, one or more constituent data tables of the database are partitioned (split) into some number of “partitions,” and the partitions are distributed across many database servers. While the processing of certain queries in a distributed database configuration may be accomplished using only the data within a given partition of a data table, queries that involve a join operation require access to data from all of the partitions of the data tables being joined.
The execution plan of a join operation involving split (partitioned) data tables conventionally involves communicating the actual values of the attribute(s) specified in the join condition among the partitions in order to evaluate the join condition. One can appreciate that the execution plan may therefore entail a significant amount of data communication among the constituent partitions. As explained above, a dictionary can be used to reduce the space requirements for storing attribute values. Accordingly, each partition may be provided with its own local dictionary (rather than the central dictionary indicated in FIG. 15), the idea being that the associated value IDs can then be communicated among the partitions instead of the actual values. However, the value IDs in a given local dictionary are generated independently of the values IDs in the other local dictionaries. In other words, value IDs locally generated in one partition of a data table may have no correlation to value IDs locally generated in another partition of that data table. Suppose, for example, the Item attribute is specified in a join condition. Suppose further that the actual value “pants” has a value ID of 2 in the local dictionary of one partition, a value ID of 7 in the local dictionary of another partition, and a value ID of 15 in yet another partition. The execution plan for the join operation may communicate the multiple different value IDs for “pants” (i.e., 2, 7, 15) among the partitions. However, the value IDs would be meaningless in any one partition for the join operation because value IDs only have meaning for the partition in which they were generated. For example, while the value ID 2 may be associated with “pants” in one partition, the value IDs 7 and 15 do not, and in fact very likely may be associated with completely different items; the value IDs could not be used to perform a join operation.
These and other issues are addressed by embodiments of the present invention, individually and collectively.