Data that are operated upon by computing devices may be stored in a database system. In a relational database system, data are logically stored as records in rows of relational tables. Each relational table typically contains one or more columns. For each record, the values of the fields of that record are stored in separate columns of one or more relational tables. For example, a database might contain, among other tables, an “employee” database table that contained columns for “name,” “address,” “SSN,” “birthdate,” “department,” and “salary” fields. A particular row in the “employee” database table might contain values such as “Bob Smith,” “123 Sycamore Street,” “132-69-8475,” “12/08/1941,” “accounting,” and “$115,000”—each such value being stored in and corresponding to a separate one of the columns mentioned.
In order to retrieve selected data from one or more relational tables, a user may formulate and submit a query to a database server. The query specifies some criteria by which the database server should select data from among all of the data contained in the database system. For example, the query may specify certain tables and columns. The database server receives and executes the query. In executing the query, the database server forms a result set and returns the result set to the user from whom the query was received. The result set is a selected subset of all of the data in the database system. The result set consists only of the data that satisfied the criteria specified in the query. Often, queries and the criteria indicated therein are specified via a formal query language such as Structured Query Language (SQL).
Some database systems provide a feature called a “materialized view.” A result set produced by a query may be thought of as a “view” of selected data in the database system. A result set produced by a query may be stored in a persistent manner for quick access later. A result set that is stored in such a persistent manner, or that is defined by a query that produces the result set, is called a “materialized view.” When a result set is stored in a persistent manner, it may be possible, under some circumstances, to avoid re-executing the query that produced that result set every time that the result set needs to be accessed. For example, if the underlying data from which the result set was generated have not changed since the result set was last calculated, or if the only parts of the underlying data that have changed are parts that are irrelevant to the query, then the result set itself will not have changed either. Thus, a “materialized view” is a persistently stored result set that is defined by a query. A materialized view persists even after the data contained therein are displayed or otherwise produced, such that the query defining the materialized view does not need to be re-executed in order to make the data available for later display or other production. In some database systems, materialized views themselves can be queried in the same manner that regular relational tables can be queried.
By defining materialized views in a database system, the re-execution of queries that define those materialized views can often be avoided, thereby conserving processing resources. When the execution of a query can be avoided by making reference to that query's materialized view instead, data access times can be improved as well. However, materialized views consume memory and storage resources. Every computing system's memory and storage resources are finite. Therefore, in every computing system, no matter how massive, there is a limit to the number and size of materialized views that can be concurrently maintained. Although it might be desirable to maintain a very large set of materialized views in order to conserve processing resources and reduce data access times, practical considerations force database system administrators to select, manually, from among a large pool of desired materialized views, a smaller set of materialized views that will be maintained in a database system.
Unfortunately, it can be difficult for a human database system administrator to make such a decision, especially when the quantity of materialized views from which the administrator must choose is very large. Additionally, because the state of the data in a database is often in constant flux, a human database administrator might be tasked with repetitively and frequently choosing new materialized views to maintain and previously maintained materialized view to abandon to make storage space for the newly chosen materialized views. Such a task can be overwhelming. Even if the state of the data in a database is relatively static, the workload sent to the database might shift over time. A change in workload (comprising queries sent to the database over time) often figures into a human database administrator's decision regarding which materialized views should be maintained and which materialized view should be abandoned.
In “Automated selection of materialized views and indexes for SQL databases,” by S. Agrawal, S. Chaudhuri, and V. Narasayya (in Proceedings of the International Conference on Very Large Data Bases, 2000), an automated materialized view selection approach is discussed. This approach seeks to build a predictive model of various relevant factors and perform a combinatorial search over the model to identify a set of materialized views that does well according to some mathematical objective. Unfortunately, the models may be inaccurate, leading to poor predictions. Additionally, sufficient information about the data underlying the materialized views, and/or about the workload that will be performed relative to the materialized views, is often unavailable, making it difficult to build a good predictive model.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.