1. Field of the Invention
The present invention generally relates to data processing and more particularly to processing queries configured to access data in a data repository.
2. Description of the Related Art
Databases are computerized information storage and retrieval systems. A relational database management system (RDBMS) is a database management system (DBMS) that uses relational techniques for storing and retrieving data. The most prevalent type of database is the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways.
A DBMS is structured to accept commands to store, retrieve and delete data using, for example, high-level query languages such as the Structured Query Language (SQL). The term “query” denotes a set of commands for retrieving data from a stored database. Queries may come from users, application programs, or remote systems (clients or peers). The query language requires the return of a particular data set in response to a particular query, but the query does not specify the method of query execution employed by the DBMS. This method is typically called a query execution plan, an execution plan, an access plan, or just a “plan”. There are typically many different useful execution plans for any particular query, each of which returns the required data set. For large databases, the execution plan selected by the DBMS to execute a query must provide the required data return at a reasonable cost in time and hardware resources. In general, the overall optimization process includes four broad stages: (1) casting the user query into some internal representation, (2) converting that representation to canonical form, (3) choosing prospective implementation procedures, and (4) generating executable plans and choosing the cheapest of the plans.
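By way of illustration only, the following minimal sketch shows that a query states which data set is required while the DBMS selects the execution plan. It assumes Python's standard sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement. The table employees, the index idx_employees_dept, and the helper plan_for are hypothetical names chosen for this example and do not come from any particular system.

```python
import sqlite3


def plan_for(conn, sql):
    """Return the plan description strings SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable description is the fourth column of each row.
    return [row[3] for row in rows]


# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employees (name, dept) VALUES (?, ?)",
    [("Ann", "sales"), ("Bob", "research"), ("Cho", "sales")],
)

# The query names the required data set, not the method of retrieval.
query = "SELECT name FROM employees WHERE dept = 'sales' ORDER BY name"

# With no index on dept available, the DBMS must pick a plan without one.
plan_before = plan_for(conn, query)
result_before = conn.execute(query).fetchall()

# Adding an index gives the optimizer another candidate plan to choose from.
conn.execute("CREATE INDEX idx_employees_dept ON employees (dept)")
plan_after = plan_for(conn, query)
result_after = conn.execute(query).fetchall()

print("plan without index:", plan_before)
print("plan with index:   ", plan_after)
print("same data set returned:", result_before == result_after)
```

The query text is identical in both runs and the same data set is returned, yet the reported plan differs once the index exists. The exact wording of the plan descriptions varies between SQLite versions, so none is reproduced here.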
Optimization, and query execution generally, can be a resource-intensive and time-consuming process. Further, the larger the database, the longer the time needed to execute the query. From the end user's standpoint, the undesirable impact of query execution overhead increases when a plurality of queries is executed. In many data mining and data query scenarios, the end user does not know, at the outset, the precise data they are after. In such scenarios, the user typically issues a query, examines the results, modifies the query based on analysis of the results and then runs the modified query. In cases where the data being queried is very extensive and complex, this can be a very time-consuming and resource-intensive process, given the duplicative processing that takes place each time the user submits a new query.
Therefore, there is a need for a more efficient query execution method.