The present invention relates to portable workload performance prediction for the cloud.
Offerings for DBMS users now come in a variety of hardware and pricing schemes. Users can purchase traditional data centers, subdivide hardware into virtual machines, or outsource all of their work to one of many cloud providers. Each of these options is attractive for different use cases. Infrastructure-as-a-service (IaaS), in which users rent virtual machines, usually by the hour, has grown in popularity. Major cloud providers in this space include Amazon Web Services and Rackspace. It is well known that by deploying in the cloud, users can save significantly in terms of upfront infrastructure and maintenance costs. They also benefit from elasticity in resource availability by scaling dynamically to meet demand.
Past work in performance prediction revolved around working with a diverse set of queries, which typically originate from the same schema and database. These studies relied either on parsing query execution plans to draw comparisons to other queries or on learning models that compared the hardware usage patterns of new queries to those of known queries. These techniques are not designed to perform predictions across platforms; they do not include predictive features characterizing the execution environment. Thus, they require extensive re-training for each new hardware configuration.