This invention relates to the field of database searching, and in particular to a system and method that provides a distributed cache for managing database query plans.
As technologies advance, the amount of information that is stored in electronic form continues to increase. Correspondingly, the search for particular information becomes increasingly time consuming, while, at the same time, the expectation for rapid response increases.
Database management systems (DBMS) are designed to organize data in a form that facilitates efficient search and retrieval of selected information. Typical database management systems allow a user to submit a ‘query’ for finding and retrieving information that satisfies the query. Although a natural language interpreter may be the goal for developers of database management systems, users are generally required to use a ‘query language’ to submit such queries. Often, the user's query is a sequence of queries that are sequentially applied, with each query providing an increasingly finer filter for finding the desired information.
In a typical database management system, a query language interpreter processes each query, creates computer executable code, executes the code, then proceeds to process the next query. Such interpretation, however, may consume a significant amount of time, and the code produced may not be efficient in execution time or memory usage. In a distributed database management system, wherein queries are processed on multiple servers, this potential inefficiency occurs on each of the servers.
Most queries are unique, in that they are typically generated to solve a particular problem, to locate a particular information item, to create a particular grouping of information, and so on. Accordingly, each query is generally processed independently of prior queries.
Some database management systems allow a user to identify queries that may be used frequently, so that the generated computer executable code can be stored for reuse as required. The user may, for example, assign a name or other identifier to a particular query, then refer to that query using this identifier in subsequent queries. When the interpreter recognizes the identifier, it retrieves the code that had been previously created, avoiding the time and resources required to re-generate this code.
In some embodiments of user-definable reusable queries, the user is also provided the option of ‘parameterizing’ the query so that it can be executed using different arguments as the situation demands. For example, if the user typically queries a database for records having a ‘cost’ parameter with a value above a particular threshold value, the user may identify the query as “CostThreshold”, and identify the threshold value as an argument to this query. Thereafter, the user may submit a query such as “CostThreshold(100)” to find records having a cost value greater than 100. The identification and creation of such parameterized queries, however, typically require a level of expertise that may not be within the skill set of every user, or may not be considered by the user to be worth the time or trouble of creating such parameterized queries.
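By way of illustration only, a user-defined parameterized query of this kind can be sketched as follows. The registry `named_queries`, the function `run_named`, and the record layout are hypothetical names introduced for this example and are not drawn from any particular database management system; the sketch assumes a single numeric argument.

```python
import re
from typing import Callable, Dict, List

Record = Dict[str, float]

def cost_threshold(records: List[Record], threshold: float) -> List[Record]:
    """Reusable code for the named query: records whose cost exceeds the threshold."""
    return [r for r in records if r["cost"] > threshold]

# Hypothetical registry mapping a user-assigned identifier to previously created code.
named_queries: Dict[str, Callable[[List[Record], float], List[Record]]] = {
    "CostThreshold": cost_threshold,
}

def run_named(call: str, records: List[Record]) -> List[Record]:
    """Parse a call such as 'CostThreshold(100)' and run the stored code with its argument."""
    m = re.fullmatch(r"(\w+)\((\d+(?:\.\d+)?)\)", call.strip())
    if m is None:
        raise ValueError(f"not a named-query call: {call!r}")
    name, arg = m.group(1), float(m.group(2))
    return named_queries[name](records, arg)

records = [{"id": 1, "cost": 50}, {"id": 2, "cost": 150}]
result = run_named("CostThreshold(100)", records)
```

In this sketch the code for "CostThreshold" is written once, and each subsequent call supplies only the argument, so nothing is regenerated per query.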
Even if the time savings provided by reusable code do not have a sufficient payback to warrant an individual user's time and effort to create the reusable code, the cumulative effect of having to regenerate the executable code for each query may be substantial, resulting in poor performance for all users, as the system spends more time regenerating code than actually executing the code to satisfy each query.
It would be advantageous to reduce the time consumed in the execution of a user's query, or sequence of queries. It would also be advantageous to optimize the use of resources in the execution of such queries.
These advantages, and others, can be realized by a distributed collection of compiled programs. As each client submits a query, a parameterized skeletal query is identified, together with the parameters associated with the particular query; the skeletal query captures the general form of the query, independent of those parameters. If a compiled form of the skeletal query is available within the distributed system, it is executed with the parameters of the query. If the compiled form of the skeletal query is not available within the distributed system, a compiled form is created, and the location of this compiled skeletal query is stored for subsequent access by this client, or other clients. The executable compiled skeletal queries may be stored at each client system, in a commonly available server storage system, or within one or more database servers. A routing system may be provided to efficiently route parameterized queries to the appropriate location(s) of the compiled skeletal query.
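The lookup-or-compile flow described above may be sketched, for illustration only, as follows. The names `to_skeleton`, `PlanCache`, and `fake_compile` are hypothetical; the literal-matching pattern is a deliberate simplification that handles only numbers and single-quoted strings; and an in-process dictionary stands in for the distributed storage and routing, which are not modeled here.

```python
import re
from typing import Callable, Dict, List, Tuple

# Simplified assumption: the only parameters are numeric literals and
# single-quoted string literals.
_LITERAL = re.compile(r"'[^']*'|\b\d+(?:\.\d+)?\b")

def to_skeleton(query: str) -> Tuple[str, List[str]]:
    """Split a query into its skeletal form and the list of extracted parameters."""
    params: List[str] = []

    def _sub(match: "re.Match[str]") -> str:
        params.append(match.group(0))
        return "?"

    return _LITERAL.sub(_sub, query), params

CompiledPlan = Callable[[List[str]], object]

class PlanCache:
    """Maps each skeletal query to its compiled form, compiling only on a miss."""

    def __init__(self, compile_fn: Callable[[str], CompiledPlan]) -> None:
        self._plans: Dict[str, CompiledPlan] = {}
        self._compile = compile_fn
        self.compilations = 0  # counts how often compilation was actually needed

    def execute(self, query: str) -> object:
        skeleton, params = to_skeleton(query)
        plan = self._plans.get(skeleton)
        if plan is None:
            # Compiled form not available: create it and record it for reuse.
            plan = self._compile(skeleton)
            self._plans[skeleton] = plan
            self.compilations += 1
        return plan(params)

def fake_compile(skeleton: str) -> CompiledPlan:
    """Hypothetical stand-in for a compiler: the 'plan' just echoes its inputs."""
    return lambda params: (skeleton, tuple(params))

cache = PlanCache(fake_compile)
first = cache.execute("SELECT * FROM items WHERE cost > 100")
second = cache.execute("SELECT * FROM items WHERE cost > 250")
```

Under these assumptions, the two queries differ only in their parameter, so they share one skeletal query and the compile step runs once; the second query reuses the stored compiled form.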
Throughout the drawings, the same reference numerals indicate similar or corresponding features or functions. The drawings are included for illustrative purposes and are not intended to limit the scope of the invention.