In order to handle an ever growing amount of information, database management systems have recently been implemented on what are known as distributed or parallel architectures. One form of distributed architecture is the shared-nothing architecture. This architecture is called shared-nothing because the architecture does not include a single point of failure, such as a shared disk or shared memory.
A typical shared nothing system includes a plurality of database partitions coupled together by a network. The database partitions generally include a set of storage devices, such as disk drives, and one or more processors. The network may comprise various types of networks, such as a local area network, and physically diverse paths. Data in a shared nothing system is partitioned using a partitioning scheme such as a hash partitioning scheme and stored across the various database partitions that are connected together through the network. Thus, each database partition keeps a part of the overall database and various operations on parts of the database in different database partitions may be performed in parallel.
Data in the database is kept as tuples in relations that comprises of logically grouped attributes. Tuples and attributes in relations are also referred to as rows and columns in tables. An important feature of any database system, including shared-nothing systems, is the ability to process operations between rows in tables, such as relating data in one table to data in another table. These types of operations are particularly useful when querying related data from multiple tables such that the data needs to be retrieved in a meaningful way. For example, structured query language (“SQL”) supports what are known as join operations for querying data from multiple tables. Indeed, many consider the join operation as the cornerstone of database management systems.
A join operation may be expensive to perform in that it may require a large amount of processing and communication resources. One form of join operation in a distributed database system is the semi-join operation. A semi-join operation involves two tables R and S that are stored on different sites within a distributed system. A semi-join from R to S is performed by first projecting R on the join attributes at the site of R. The join attributes typically include one or more keys for selecting various tuples from tables R and S. Site R sends the resulting projection to the site of S, where a join operation is performed. The projection of R essentially acts as a filter of S, and thus, the result of the semi-join is usually smaller than S. Site S then sends to site R the semi-join result to complete the join operation. Hence, semi-joins can require creating and sending projections of one of the tables involved in the join across a network when the tables involved in the join are stored at different sites. This could be expensive and hence techniques to reduce semi-join processing are needed.
Using bit filters is one technique used to reduce the amount of communication during the execution of semi-joins. For example, bit filters can be used to reduce the size of the projection of the joining keys, and thus, reduce the amount of data communication needed. A bit filter operates by hashing the projection data into a set of bits. This set of bits serves as a compressed representation of the hashed data. Bit filters are also known as Bloom filters or Hash filters by those skilled in the art. Prior art that looked at semi-joins in the context of distributed systems essentially dealt with joining tables R and S located at two sites where each site contained a single table.
In general, data in tables in a partitioned environment may be distributed over multiple database partitions. One way to improve the performance of a database system is to divide the join operation or query so that each database partition can work in parallel with little inter-node communication. If data within two tables is partitioned in a similar way, a collocated join would be the most efficient join method. This can be done if the join attributes are the same as the partitioning attributes of the tables. No data would need to be moved to do the join. More often than not, this match of the partitioning keys and the join attributes is not possible and tuples of one or both tables need to be moved across the database partitions in order to perform the join operation. To minimize movement of tuples, bit filters might be used. In known systems, bit filters are broadcast between the various database partitions during various stages of a database operation. However, broadcasting bit filters themselves can consume a large amount of communication resources of the system. For example, when a particular database partition of one table is performing operations for a join, it must broadcast its updated bit filter to all the database partitions of the other table participating in the join. Some systems might use compressed bit filters to improve their efficiency. Unfortunately, even in a compressed form, bit filters can become a tremendous burden particularly when there are a large number of database partitions involved. Each database partition associated with one table requires enough memory space to store all the bit filters of all the other table's database partitions. In addition, database partitions may update their data frequently, and thus, may frequently need to broadcast/re-broadcast their bit filters. This can consume significant communication resources in the database system.
Therefore, it would be desirable to provide methods and systems that efficiently uses processing and communications resources when performing database operations. It would also be desirable, among other things, to provide methods and systems that efficiently use memory and data communications resources to perform various types of join operations across database partitions.