Field of the Invention
The present invention relates to data caching for database queries and more particularly to query processing in an in-memory data grid (IMDG).
Description of the Related Art
Database query processing refers to the receipt and execution of data queries against a database. Flat file databases generally process queries in accordance with a key used to locate matching records and to return the matching records to the requestor. To the extent that data is to be culled from different related records, a series of queries is required to locate different keys in different database tables so as to ultimately return the desired set of data. Relational databases improve upon flat file databases by permitting the logical joining together of different tables so as to execute a single query against the joined set of tables in order to produce a desired set of data.
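By way of illustration only, the contrast between a series of keyed queries and a single query against joined tables can be sketched as follows. The sketch uses Python's standard sqlite3 module; the table names, column names, sample rows, and function names are hypothetical and are not drawn from any particular database described here.

```python
import sqlite3

# Hypothetical two-table data set held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 'disk'), (11, 1, 'cable'), (12, 2, 'fan');
""")


def items_by_keyed_lookups(conn, name):
    """Series of keyed queries: the caller locates a key in one table,
    then uses it to query the next table."""
    row = conn.execute(
        "SELECT id FROM customers WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return []
    rows = conn.execute(
        "SELECT item FROM orders WHERE customer_id = ? ORDER BY id", (row[0],)
    ).fetchall()
    return [r[0] for r in rows]


def items_by_join(conn, name):
    """Single query executed against the logically joined tables."""
    rows = conn.execute(
        "SELECT o.item FROM customers c "
        "JOIN orders o ON o.customer_id = c.id "
        "WHERE c.name = ? ORDER BY o.id",
        (name,),
    ).fetchall()
    return [r[0] for r in rows]
```

Both functions return the same set of data; the first issues two queries and relates the results in the caller, while the second leaves that work to the database.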
An in-memory data grid (IMDG) is a highly distributable form of a database that permits parallel processing across a set of disparately located computing devices. The use of an IMDG permits substantial parallelization of database operations and, in consequence, efficient utilization of unused processing resources in each host computing device supporting the IMDG. To the extent that data in the IMDG is highly distributed, relational database concepts cannot be effectively applied. Thus, though highly scalable, database operations in an IMDG are substantially granular and numerous in comparison to those of a traditional relational database.
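As a minimal sketch of why operations against distributed data tend to be granular and numerous, the following models a grid as a list of per-node dictionaries with keys assigned to nodes by hash. This is an assumption made for illustration, not a description of any particular IMDG product; the Grid class, the key naming scheme, and the sample records are all hypothetical.

```python
import zlib


class Grid:
    """Toy stand-in for a data grid: each node is a plain dict (assumption)."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]
        self.lookups = 0  # counts individual keyed operations issued

    def _node_for(self, key):
        # Deterministic hash so that a key always maps to the same node.
        return self.nodes[zlib.crc32(key.encode("utf-8")) % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key):
        self.lookups += 1
        return self._node_for(key).get(key)


grid = Grid(node_count=3)
grid.put("customer:1", {"name": "Ada", "order_ids": [10, 11]})
grid.put("order:10", {"item": "disk"})
grid.put("order:11", {"item": "cable"})


def items_for_customer(grid, customer_key):
    """One logical, relational-style query expressed as several keyed lookups."""
    customer = grid.get(customer_key)
    if customer is None:
        return []
    return [grid.get(f"order:{oid}")["item"] for oid in customer["order_ids"]]


items = items_for_customer(grid, "customer:1")
```

In this sketch a single logical request for a customer's items results in one lookup for the customer record plus one lookup per order, each potentially served by a different node.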
Traditional database technologies, including flat file and relational database technologies, make extensive use of caching to enhance the performance of database queries. As is well known, caching is a predictive concept in which previously retrieved data resulting from one or more queries can be stored in local memory and returned, when applicable, to a requestor without requiring the re-execution of the queries against the database. Typical algorithms for determining when data is to be cached (and also evicted from the cache) include the most frequently used algorithm and the most recently used algorithm, to name two examples.
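The caching concept described above can be sketched as a small query result cache. The sketch below is illustrative only: the QueryCache class and the fake_db stand-in are hypothetical, and the eviction policy shown (discarding the least recently used entry) is chosen for brevity as one common policy rather than as the policy of any particular database.

```python
from collections import OrderedDict


class QueryCache:
    """Stores query results in local memory, keyed by the query text."""

    def __init__(self, run_query, capacity=2):
        self._run_query = run_query    # callable that executes against the database
        self._capacity = capacity      # maximum number of cached results
        self._results = OrderedDict()  # ordered from least to most recently used

    def get(self, query):
        if query in self._results:
            # Cache hit: mark as most recently used and skip re-execution.
            self._results.move_to_end(query)
            return self._results[query]
        # Cache miss: execute the query and remember its result.
        result = self._run_query(query)
        self._results[query] = result
        if len(self._results) > self._capacity:
            # Evict the least recently used entry (illustrative policy choice).
            self._results.popitem(last=False)
        return result


executed = []  # records each query that actually reaches the "database"


def fake_db(query):
    """Hypothetical stand-in for executing a query against a database."""
    executed.append(query)
    return f"rows for {query}"


cache = QueryCache(fake_db, capacity=2)
first = cache.get("q1")
second = cache.get("q1")  # served from the cache; fake_db is not called again
```

After the two calls above, the executed list contains "q1" only once, reflecting that the second request was answered from local memory.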
The IMDG, however, does not make use of traditional caching, principally because of the unpredictable number of nodes in the grid supporting the IMDG and the fragmented nature of data queries across the different nodes of the IMDG. Further, to the extent that numerous small queries can be required to achieve a single relational query in the IMDG, the process of caching and cache retrieval can become unwieldy. Finally, unlike the circumstance in a relational database where the ultimate data retrieved and returned to the requestor can be readily related to the underlying query, the cascade of requisite small queries to achieve a single query result in an IMDG can cloud the connection between individual, small query results and the ultimate desired query.