1. Field of the Invention
This invention relates in general to computer-implemented database systems, and, in particular, to estimating an amount of change in a data store.
2. Description of Related Art
Databases are computerized information storage and retrieval systems. A Relational Database Management System (RDBMS) is a database management system (DBMS) that uses relational techniques for storing and retrieving data. Relational databases are organized into tables, which consist of rows and columns of data. The rows are formally called tuples or records. A database will typically have many tables, and each table will typically have multiple tuples and multiple columns. The tables are typically stored on direct access storage devices (DASD), such as magnetic or optical disk drives, for semi-permanent storage.
Some systems have very large databases, storing data on the order of terabytes of information. With the growing use of computers and the increasing variety of data that is stored on storage devices (e.g., images and audio, as well as large amounts of text), such large databases are becoming more and more common.
As data in a database changes over time through updates, deletes, and inserts of new data, it is usually necessary to perform maintenance operations on the database (e.g., to reclaim space, restore optimal clustering, make a full copy of the data, or collect statistics about the data that can be used to optimize access paths). Since these operations are time-consuming, it is useful to perform them only when the amount of change has exceeded some threshold value.
Some conventional programs run the maintenance operations at particular intervals, without regard to the amount of change in the database. For example, certain maintenance operations are run once every 24 hours or once every 30 days. This conventional technique may unnecessarily perform maintenance operations on a database that has had little or no change.
Other conventional programs estimate change in a database by counting all record insertions and deletions and comparing the total count to a threshold value to determine whether to perform maintenance operations. This conventional technique is inefficient, especially in large databases.
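For illustration only, the conventional counting technique described above might be sketched as follows. The class name `ChangeCounter`, its methods, and the threshold value are hypothetical and are not taken from any particular database product; the sketch simply shows that every record insertion and deletion must update the count.

```python
from dataclasses import dataclass


@dataclass
class ChangeCounter:
    """Hypothetical per-record change counter (conventional technique)."""

    threshold: int  # assumed threshold, expressed as a number of changed records
    count: int = 0  # total insertions and deletions seen so far

    def record_insert(self) -> None:
        # Every single insertion updates the running count.
        self.count += 1

    def record_delete(self) -> None:
        # Every single deletion updates the running count as well.
        self.count += 1

    def maintenance_due(self) -> bool:
        # Maintenance is indicated once the count has exceeded the threshold.
        return self.count > self.threshold


counter = ChangeCounter(threshold=2)
counter.record_insert()
counter.record_delete()
due_before = counter.maintenance_due()  # two changes: threshold not yet exceeded
counter.record_insert()
due_after = counter.maintenance_due()   # three changes: threshold exceeded
```

Because the count is updated once per changed record, the bookkeeping grows with the volume of activity in the database.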
Thus, there is a need in the art for an improved technique for determining an amount of change in a database.
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method, apparatus, and article of manufacture for a computer-implemented technique for estimating an amount of change in a data store.
In accordance with the present invention, changes are identified in a data store connected to a computer. Initially, one or more interval changes are measured. Each interval change indicates an amount of change in the data store at an interval. Next, using each interval change, a data store change is estimated that indicates an amount of change in the data store across all of the intervals.
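As a minimal sketch of the two steps just summarized, the following example takes a list of already measured interval changes and derives a single data store change from them. The passage above does not state how the interval changes are combined, so the simple summation used here is an assumption made purely for illustration, as are the function names, the sample values, and the threshold comparison.

```python
from typing import Sequence


def estimate_data_store_change(interval_changes: Sequence[int]) -> int:
    """Estimate the change across all intervals from the per-interval changes.

    Assumption: the interval changes are combined by simple addition. The
    summary above does not specify the combination rule.
    """
    return sum(interval_changes)


def maintenance_due(interval_changes: Sequence[int], threshold: int) -> bool:
    # Hypothetical helper: compare the estimate with an assumed threshold.
    return estimate_data_store_change(interval_changes) > threshold


# Hypothetical measurements: amount of change observed at each of three intervals.
interval_changes = [120, 45, 300]
total = estimate_data_store_change(interval_changes)
```

With these sample values, `maintenance_due(interval_changes, 400)` would indicate that maintenance is warranted, while a threshold of 500 would not.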