In database management systems (DBMS), including relational database systems, the retrievable data is typically stored in the form of tables. Typically each column of a given table-represents a particular data attribute while each row represents a specific record. The row/column combination, e.g., each field of a row, represents a particular data item. A user of a database accesses data from one or more tables by submitting a query to the respective DBMS. The DBMS typically responds to the query by constructing a particular view of the data stored in the tables and returning a responsive data set to the user.
In known large database systems, users often issue queries which require aggregation of data from large tables and/or which require aggregation of data from two or more tables. Responding to such queries can be time consuming. As a result, it is not uncommon for a database to include one or more summary tables, which contain selected subsets of data items, and/or aggregations of data items from one or more tables. A particular advantage of summary tables is that they contain pre-generated values, which allow the DBMS to more quickly access and retrieve queried data, a task that could otherwise be costly and time consuming.
One of the issues with database management systems which utilize summary tables is knowing which summary tables to create to most benefit system performance. Summary tables may be large and there may be many of them associated with a single database. As such, summary tables can take up considerable space in the database and large amounts of server CPU time may be required to generate and populate the summary tables on a regular basis. Thus, the convenience of using summary tables is usually balanced against the resource time expended in creating and refreshing the summary tables, as well as the additional storage space required to maintain them.
Consequently, database system administrators, those individuals tasked with administering a database management system, need to know precisely what summary table they should create and with what combinations of information. Ideally a system administrator would like to create only those summary tables that would make the best use of the system resources. Known current systems traditionally depend on someone to analyze user requirements and, based upon this analysis, decide upon a set of summary tables to satisfy the users' requirements. Thus, the system administrators spend considerable time analyzing user queries to determine which summary tables will provide the most benefit. Alternatively, system administrators may use ad hoc methods to guess which set of summary tables will best serve the system. While the latter approach results in less time to create the summary tables, it can result in summary tables being built and maintained that are not particularly useful or optimum. In any event, once those queries for which a summary table is to be built are identified, the DBA must then execute the requisite system instructions to build the summary tables. Finally, the use of tables in a database may vary over time, either because of new data or because of different query patterns. It is difficult for a system administrator to know when these changes have occurred, rendering a particular summary table unnecessary.
Thus, it would be advantageous to provide a system that will automatically tune itself according to system demands to achieve faster searching and retrieval of data while keeping system resource usage at a minimum. It would be advantageous to have an improved methodology for identifying data sets for which summary tables should be created and maintained for a given database and for determining when a summary table is no longer beneficial due to changed data or query patterns. It would be further advantageous to provide a mechanism to automatically create and remove summary tables from the database.
The present inventions solve these problems and more.