Databases are used to store information for a vast range of applications, including commercial, industrial, technical, scientific and educational applications. As the reliance on information increases, both the volume of information stored in most databases and the number of users wishing to access that information likewise increase. As the volume of information in a database and the number of users wishing to access the database increase, the amount of computing resources required to manage such a database increases as well.
Database management systems (DBMSs), which are the computer programs used to access the information stored in databases, therefore often require tremendous resources to handle the heavy workloads placed on such systems. As such, significant resources have been devoted to increasing the performance of database management systems with respect to processing searches, or queries, to databases.
Improvements to both computer hardware and software have increased the capacities of conventional database management systems. For example, in the hardware realm, increases in microprocessor performance, coupled with improved memory management systems, have increased the number of queries that a particular microprocessor can process in a given unit of time. Furthermore, the use of multiple microprocessors and/or multiple networked computers has further increased the capacities of many database management systems.
From a software standpoint, the use of relational databases, which organize information into formally-defined tables, and which are typically accessed using a standardized language such as Structured Query Language (SQL), has substantially improved processing efficiency, as well as substantially simplified the creation, organization, and extension of information within a database. Furthermore, significant development efforts have been directed toward query “optimization”, whereby the execution of particular searches, or queries, is optimized in an automated manner to minimize the amount of resources required to execute each query. In addition, a reduced reliance on runtime interpretation of queries in favor of increased usage of directly-executable program code has improved query engine performance.
Through the incorporation of various hardware and software improvements, many high-performance database management systems are able to handle hundreds or even thousands of queries each second, even on databases containing millions or billions of records. However, further increases in information volume and workload are inevitable, so continued advancements in database management systems are still required.
One area where substantial performance gains may be achieved is query execution, i.e., the actual generation of result sets for optimized or unoptimized representations of queries. In this regard, the manner in which queries are represented and presented to an execution engine for processing, and the manner in which an execution engine processes those queries, can vary substantially, and can have a substantial effect on overall database performance.
In database management systems that incorporate query optimization, a query optimizer typically generates an access plan for a query that specifies one or more instructions to a query engine to enable the query engine to execute a particular query. In some database designs, query access plans are represented using interpretive code, which may require substantial processing overhead to interpret and execute access plan instructions. In other instances, query access plans may be assembled from blocks of executable code, with an interpreter used to select those blocks to be executed. In still other instances, query access plans may be represented using directly executable code, which provides comparatively lower processing overhead.
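The difference between an interpreted access plan and a directly executable one can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the opcode names (`scan`, `filter`, `project`), the `PLAN` and `TABLES` structures, and the functions `interpret` and `compile_plan` are hypothetical and do not correspond to any particular database product. The interpreted form decodes each instruction every time the plan runs, whereas the compiled form decodes the instructions once and thereafter only calls the resulting functions.

```python
# Hypothetical access plan expressed as interpretive code:
# an ordered list of (opcode, argument) instructions.
PLAN = [
    ("scan", "orders"),
    ("filter", ("total", 100)),      # keep rows where total > 100
    ("project", ["id", "total"]),
]

# Hypothetical in-memory table store used only for this sketch.
TABLES = {
    "orders": [
        {"id": 1, "total": 50, "region": "east"},
        {"id": 2, "total": 150, "region": "west"},
        {"id": 3, "total": 300, "region": "east"},
    ]
}


def interpret(plan, tables):
    """Execute the plan by decoding every instruction on every run."""
    rows = []
    for opcode, arg in plan:
        if opcode == "scan":
            rows = list(tables[arg])
        elif opcode == "filter":
            column, minimum = arg
            rows = [r for r in rows if r[column] > minimum]
        elif opcode == "project":
            rows = [{c: r[c] for c in arg} for r in rows]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return rows


def compile_plan(plan):
    """Decode the plan once and return a callable that runs it directly."""
    steps = []
    for opcode, arg in plan:
        if opcode == "scan":
            steps.append(lambda rows, tables, name=arg: list(tables[name]))
        elif opcode == "filter":
            column, minimum = arg
            steps.append(
                lambda rows, tables, c=column, m=minimum:
                    [r for r in rows if r[c] > m]
            )
        elif opcode == "project":
            steps.append(
                lambda rows, tables, cols=arg:
                    [{c: r[c] for c in cols} for r in rows]
            )
        else:
            raise ValueError(f"unknown opcode: {opcode}")

    def run(tables):
        rows = []
        for step in steps:          # no opcode decoding at run time
            rows = step(rows, tables)
        return rows

    return run


interpreted_result = interpret(PLAN, TABLES)
compiled_result = compile_plan(PLAN)(TABLES)
```

Both forms produce the same result set; the sketch only shows where the decoding overhead is paid. Note that in both forms the set of supported instructions is fixed in the engine's own code, so adding a new operation requires modifying `interpret` or `compile_plan`.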
Despite the various manners in which the code used to represent a query access plan may be executed, in many instances, these query access plan representations are all relatively fixed in format, and as a consequence are difficult to adapt and otherwise extend to incorporate new functionality. Modifying and/or improving a query engine design is therefore problematic in many of these instances.
In still other database designs, in particular in some object-oriented database designs, query access plans may be represented as collections of objects serving as nodes in a tree data structure. Often, the representation of query access plans as collections of objects facilitates extension of a query engine architecture due to the ability to modify and extend the various objects used to form a query access plan. However, in many instances, the individual objects retain a substantial amount of fixed program code, which limits the degree of extensibility, and thus hampers the ability of database designers to modify or improve a database engine design.
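A query access plan represented as a tree of objects might look like the following minimal Python sketch. The class names (`PlanNode`, `Scan`, `Filter`, `Limit`), the `rows` method, and the sample data are all hypothetical and chosen only for illustration; they are not drawn from any particular database design. Each node pulls rows from its child, and a new operation is added by defining a new node class rather than by editing a central interpreter.

```python
class PlanNode:
    """Base class for a node in a hypothetical access plan tree."""

    def rows(self):
        raise NotImplementedError


class Scan(PlanNode):
    """Leaf node: yields every row of an in-memory table."""

    def __init__(self, table):
        self.table = table

    def rows(self):
        yield from self.table


class Filter(PlanNode):
    """Yields only the child's rows that satisfy a predicate."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate

    def rows(self):
        for row in self.child.rows():
            if self.predicate(row):
                yield row


class Limit(PlanNode):
    """Extension added later: stops after a fixed number of rows.

    No existing node class had to change to support it.
    """

    def __init__(self, child, count):
        self.child = child
        self.count = count

    def rows(self):
        for index, row in enumerate(self.child.rows()):
            if index >= self.count:
                break
            yield row


ORDERS = [
    {"id": 1, "total": 50},
    {"id": 2, "total": 150},
    {"id": 3, "total": 300},
]

# Tree: Limit -> Filter -> Scan
plan = Limit(Filter(Scan(ORDERS), lambda r: r["total"] > 100), 1)
result = list(plan.rows())
```

Even in this sketch, the behavior of each node is fixed in the body of its `rows` method, which mirrors the limitation described above: the tree structure is extensible, but the logic inside each object is not.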
Given that continual refinements in a query engine design may be necessary to keep pace with the continual advances of others, it is highly desirable to provide an efficient, high-performance query engine architecture that is also readily extensible and adaptable.