1. Technical Field
The present invention relates to database querying, and more particularly to predicting query execution time.
2. Description of the Related Art
Predicting query execution time is an increasingly important problem, motivated by the current surge in offerings of databases as a service (DaaS) in the cloud. To a DaaS provider, query time prediction is crucial to many database management tasks, such as admission control, query scheduling, progress monitoring, and system sizing. However, given a database server hosting a relational database, it is difficult to predict the execution time of a Structured Query Language (SQL) query before the query is executed.
Most existing solutions to this problem adopt statistical machine learning approaches to build predictive models. In these machine-learning-based solutions, first a family of models (e.g., SVM regression, multiple linear regression, and KCCA) and a set of features (e.g., query plan structure and estimated cardinality) are handpicked. Second, a set of training data is collected by executing some sample queries. Then, the candidate models are tuned to fit the training data. Finally, the best tuned model (usually selected by using a separate set of validation data) is chosen as the solution. However, these solutions are not without deficiency. While they achieve some level of success, they suffer from several fundamental issues, including poor performance on queries not seen during the training stage.
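The four-step workflow described above (pick candidate models and features, collect training data, fit each candidate, select by validation error) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two candidate models (a constant baseline and a one-feature least-squares line), the single feature (estimated cardinality), the function names, and the made-up timing data. It does not reproduce any particular prior-art system.

```python
# Hypothetical sketch of the train / validate / select workflow.
# Models, feature choice, and data are illustrative assumptions only.
from statistics import mean


def fit_constant(train):
    """Baseline candidate: always predict the mean observed execution time."""
    avg = mean(t for _, t in train)
    return lambda card: avg


def fit_linear(train):
    """One-feature least-squares fit: time ~ slope * cardinality + intercept."""
    xs = [c for c, _ in train]
    ys = [t for _, t in train]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = y_bar - slope * x_bar
    return lambda card: slope * card + intercept


def mean_abs_error(model, data):
    """Average absolute gap between predicted and observed times."""
    return mean(abs(model(c) - t) for c, t in data)


def select_model(candidates, train, validation):
    """Fit every candidate on train; keep the one with lowest validation error."""
    fitted = {name: fit(train) for name, fit in candidates.items()}
    best = min(fitted, key=lambda name: mean_abs_error(fitted[name], validation))
    return best, fitted[best]


# Made-up (estimated cardinality, observed seconds) pairs from "sample queries".
TRAIN = [(1000, 0.12), (5000, 0.52), (10000, 1.02), (20000, 2.02)]
VALIDATION = [(2000, 0.22), (15000, 1.52)]

CANDIDATES = {"constant": fit_constant, "linear": fit_linear}
best_name, best_model = select_model(CANDIDATES, TRAIN, VALIDATION)
print(best_name)
```

In this sketch the selected model is only as good as the sample queries it was fitted on, which is the weakness noted above: a query whose plan or cardinality falls outside the training range may be predicted poorly.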