This invention relates to the field of databases and data analysis and, more specifically, to a technique for incrementally determining a histogram for a table as new data values are added or existing data values are modified.
Organizations and companies continually gather a great deal of information and data about their operations. Such data includes customer information, marketing information, financial numbers, engineering data, and much more. This data is often stored in databases, which are continually updated with new data as it is received.
Typically, organizations analyze the gathered data in order to determine how the organization is performing, its efficiency, trends in the data, and many other statistics. One statistic used to analyze data is the histogram. For example, a cost-based optimizer can use histograms to estimate the selectivity of predicates in a query. As the amount of data grows, it generally takes more time to determine or derive statistics on the data.
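By way of illustration only, the following minimal Python sketch shows one way a histogram might be used to estimate the selectivity of a range predicate such as "value < threshold". The equi-width bucketing, the assumption that values are spread uniformly within each bucket, and the function names build_histogram and estimate_selectivity are hypothetical choices made for this example; they are not taken from any particular optimizer or from the technique described herein.

```python
def build_histogram(values, num_buckets):
    """Build an equi-width histogram over a non-empty list of numbers.

    Returns (bounds, counts), where bounds has num_buckets + 1 entries
    and counts[i] is the number of values falling in bucket i.
    """
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / num_buckets or 1.0
    bounds = [lo + i * width for i in range(num_buckets + 1)]
    counts = [0] * num_buckets
    for v in values:
        # Clamp so the maximum value lands in the last bucket.
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return bounds, counts


def estimate_selectivity(bounds, counts, threshold):
    """Estimate the fraction of rows satisfying `value < threshold`.

    Buckets entirely below the threshold count in full; the bucket that
    straddles the threshold contributes a proportional share, assuming
    values are uniformly distributed inside it.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    matched = 0.0
    for i, count in enumerate(counts):
        left, right = bounds[i], bounds[i + 1]
        if threshold >= right:
            matched += count
        elif threshold > left:
            matched += count * (threshold - left) / (right - left)
    return matched / total
```

In this sketch, the estimate is computed from the bucket counts alone, without scanning the underlying rows, which is why such a statistic can be useful to an optimizer. The histogram itself, however, must still be built from the data, and that cost grows with the data volume.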
Further, analyzing the data is a continual process and takes a great deal of computing time, since the analysis and statistics need to be updated as new data is received. In addition, analysis may be desired on only a portion of the data, so that it can be compared against other portions of the data.
Therefore, there is a need for more efficient techniques of analyzing data and forming or deriving statistics on the data. Such techniques would reduce the amount of time needed to derive the statistics.