Embodiments of the inventive subject matter generally relate to the field of databases, and, more particularly, to dynamically adjusting SMT (Simultaneous Multi-Threading) modes in parallel database systems.
Many applications access one or more databases as part of their operation. In general, database management systems are designed to provide a quick response to requests to store or retrieve information. To that end, most databases allow for parallelization of query execution, especially for applications that require access to large amounts of data. Generally speaking, parallelism can be achieved using pipelined parallelism or partitioned parallelism. In pipelined parallelism, the output of one operation is streamed into the input of the subsequent operation, so the two operations can achieve some degree of overlap (parallelism). In partitioned parallelism, the input data is partitioned among multiple processors, so an operation can be split into parallel independent operators, each working on a part of the data.
However, the benefits of pipelined parallelism often cannot be fully realized in conventional systems. First, relational pipelines are rarely very long; a chain of length ten is unusual. Second, some relational operators, such as aggregate and sort operators, do not emit their first output until they have consumed all their inputs. As a result, such operators cannot be pipelined. Third, it is often the case that the execution cost of one operator is much greater than that of the others (this is an example of skew). In such cases, the performance improvement obtained by pipelining can be very limited.
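The distinction between the two forms of parallelism can be illustrated with a minimal Python sketch. It is illustrative only and does not describe any particular database engine. The function names (`scan`, `select_even`, `blocking_sort`, `partitioned_sum`) and the sample data are hypothetical. Python generators are used here only to show streaming between operators; they do not by themselves run the operators concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

def scan(rows):
    # Source operator: emits rows one at a time.
    for row in rows:
        yield row

def select_even(rows):
    # Streaming operator: can emit a row as soon as it receives one,
    # so it can overlap with the operator that feeds it.
    for row in rows:
        if row % 2 == 0:
            yield row

def blocking_sort(rows):
    # Blocking operator: sorted() must consume every input row
    # before the first output row can be emitted.
    yield from sorted(rows)

def partitioned_sum(rows, workers):
    # Partitioned parallelism: split the input into one chunk per worker,
    # aggregate each chunk independently, then combine the partial results.
    chunks = [rows[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)

data = [5, 2, 8, 1, 4, 7]  # hypothetical input rows

pipelined = list(select_even(scan(data)))
sorted_rows = list(blocking_sort(scan(data)))
total = partitioned_sum(data, workers=3)
```

In this sketch, `select_even` represents an operator that can be pipelined, `blocking_sort` represents an operator that cannot emit output until its input is exhausted, and `partitioned_sum` splits one operation into independent operators that each work on a part of the data.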
Partitioned parallelism provides further opportunities to execute queries in parallel. In partitioned parallelism, the query is partitioned into units of work that may be worked on by multiple processors in parallel. However, current implementations fail to account for the variance in parallelism at different stages of query execution. Query execution speed suffers because a query stage can complete only when the slowest thread working on that stage has completed.
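The effect of the slowest thread can be shown with a small arithmetic sketch in Python. The thread timings below are hypothetical values chosen for illustration, and the model assumes that the threads of a stage start together and that the stage ends when its last thread finishes.

```python
def stage_time(thread_times):
    # A stage completes only when its slowest thread completes.
    return max(thread_times)

def idle_time(thread_times):
    # Total time the faster threads spend waiting for the slowest one.
    slowest = max(thread_times)
    return sum(slowest - t for t in thread_times)

# Two hypothetical stages with the same total work (16 time units)
# spread across four threads.
balanced = [4.0, 4.0, 4.0, 4.0]
skewed = [1.0, 1.0, 1.0, 13.0]

balanced_elapsed = stage_time(balanced)
skewed_elapsed = stage_time(skewed)
skewed_idle = idle_time(skewed)
```

Under these assumed timings, both stages perform the same total work, but the balanced stage completes in 4 time units while the skewed stage takes 13, with the three faster threads idle for a combined 36 time units.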