1. Field of the Invention
The present invention relates generally to systems and methods for analyzing and querying data. More particularly, the present invention relates to systems and methods for incrementally approximating a query result, optionally until the query result is produced exactly upon completion.
2. Description of the Related Art
Modern businesses increasingly rely on analyses of massive amounts of data. However, complex analyses and queries of large sets of data can be time-consuming and expensive. Accordingly, many solutions have been devised for performing complex data analyses and queries more quickly and at lower cost.
One way to provide a faster analysis of massive sets of data is to decrease query processing times by using more capable computer systems. Of course, computing resource capacity often comes at a steep price, which many organizations cannot afford.
Another solution is to apply statistical sampling techniques when processing and querying large sets of data. By creating and then querying a statistical sample of the data, a much smaller amount of data is actually processed and queried, thereby reducing the resources required of the related computer system. Co-owned U.S. Published Patent Application No. 20030144868, the entire contents of which are incorporated herein by reference, describes a data processing, querying and analysis system that includes a statistical sampling function that decreases data processing and query times using statistical sampling techniques. That system performs complex processing, querying and analyses of massive amounts of data. However, when a set of data is larger than the system's dataset size limits, only a portion (i.e., a statistical sample) of that set is delivered to the portion of the computing system responsible for data query and analysis. This arrangement provides the advantage that fewer computing resources are required for querying and analyzing the set of data than if the entire set of data were processed and queried. Thus, that statistical sampling method saves computing resources, money and time. Of course, since the entire dataset is not made available for querying and analysis, the result provided, although accurate to a level that is sometimes acceptable, is not 100% accurate, unless the set of data being queried is smaller than the system's dataset size limits.
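By way of illustration only, the sampling arrangement described above can be sketched in a few lines of Python. The sketch assumes simple random sampling without replacement and a sum query scaled by the inverse sampling fraction. The function names, the size_limit parameter, and the fixed seed are hypothetical and are not taken from the referenced application, which may use different sampling techniques.

```python
import random


def sample_if_over_limit(rows, size_limit, seed=0):
    """Return (subset, scale) for a list of rows.

    If the data fits within size_limit, the full data is returned with a
    scale of 1.0. Otherwise a simple random sample of size_limit rows is
    returned together with the factor needed to extrapolate sums or counts.
    """
    if len(rows) <= size_limit:
        return list(rows), 1.0
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    subset = rng.sample(rows, size_limit)  # sampling without replacement
    return subset, len(rows) / size_limit


def estimate_sum(rows, size_limit, seed=0):
    """Estimate sum(rows) while querying at most size_limit rows."""
    subset, scale = sample_if_over_limit(rows, size_limit, seed)
    return sum(subset) * scale


if __name__ == "__main__":
    data = list(range(1, 10001))
    print("exact sum:    ", sum(data))
    print("estimated sum:", estimate_sum(data, size_limit=1000))
```

In this sketch, a set of data no larger than size_limit is queried in full and the sum is exact, whereas a larger set yields only an estimate computed from the sample, which mirrors the accuracy trade-off noted above.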
Therefore, there exists a continued need for new and improved systems and methods for processing, querying and analyzing data to save computing resources, money and time.