With a declarative query language like SQL, a database client generates queries that say ‘what to do’ with a database, but not ‘how to do it’. It is the task of a database management system (DBMS) to translate the client's declarative specification (the query) into an operational procedure. When a client specifies a query like:
select lastName, firstName from emp where dept='software'
the DBMS parses and analyzes the query to form a high-level execution plan. This plan might consist of several steps:
1. Scan the set of records in the “emp” table
2. Restrict the set to just those records whose “dept” column value is ‘software’
3. Project the “lastName” and “firstName” columns from the resulting records
4. Return a set of <lastName, firstName> tuples
This sequence of steps will satisfy the client's SQL query in principle, but it is not executable code by itself. Many details have been left unspecified. The task of turning a query execution plan into executable code is typically accomplished in the interaction between two DBMS components: a code generator and a runtime execution engine.
A code generator takes a high-level execution plan, fills in details, and produces code that the runtime engine can execute. In some DBMSs, the runtime engine is a powerful, high-level interpreter of execution plans. In these systems, the code generator might merely need to provide some annotations of the execution plan, such as field offsets or names. In other DBMSs, the code generator might produce executable machine code that is dynamically linked and called by the runtime engine. Many DBMSs opt for a middle ground, where the code generator produces instructions in an intermediate “p-code”, which is interpreted by the runtime engine.
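The middle-ground design can be illustrated with a minimal sketch. It assumes a toy record layout (tuples with positional fields), a hypothetical three-instruction p-code (`SCAN`, `FILTER_EQ`, `PROJECT`), and made-up sample rows; none of these names come from a real DBMS. The code generator's job here is only to resolve column names to field offsets, and the runtime engine is a small interpreter loop over the resulting instructions.

```python
# Hypothetical record layout: each "emp" record is a tuple, addressed by position.
EMP_SCHEMA = ("lastName", "firstName", "dept")
EMP_TABLE = [  # made-up sample data
    ("Lovelace", "Ada", "software"),
    ("Hopper", "Grace", "software"),
    ("Curie", "Marie", "research"),
]

# High-level execution plan: refers to columns by name only.
PLAN = [
    ("scan", "emp"),
    ("restrict", "dept", "software"),
    ("project", ["lastName", "firstName"]),
]


def generate(plan, schema):
    """Code generator: fill in field offsets and emit p-code instructions."""
    offset = {name: i for i, name in enumerate(schema)}
    pcode = []
    for step in plan:
        op = step[0]
        if op == "scan":
            pcode.append(("SCAN", step[1]))
        elif op == "restrict":
            pcode.append(("FILTER_EQ", offset[step[1]], step[2]))
        elif op == "project":
            pcode.append(("PROJECT", tuple(offset[c] for c in step[1])))
        else:
            raise ValueError(f"unknown plan step: {op}")
    return pcode


def execute(pcode, tables):
    """Runtime engine: interpret p-code one instruction at a time."""
    rows = []
    for instr in pcode:
        op = instr[0]
        if op == "SCAN":
            rows = list(tables[instr[1]])
        elif op == "FILTER_EQ":
            _, field, value = instr
            rows = [r for r in rows if r[field] == value]
        elif op == "PROJECT":
            fields = instr[1]
            rows = [tuple(r[f] for f in fields) for r in rows]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return rows


pcode = generate(PLAN, EMP_SCHEMA)
result = execute(pcode, {"emp": EMP_TABLE})
```

In this sketch the interpreter never looks up a column name at run time; that work was done once by the generator. A real engine would typically stream records through the instructions rather than materialize a full list at each step.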
Systems where a code generator produces executable machine code for ad-hoc queries are very rare. One reason is that it is much easier to write and debug an interpreter. It is also true that DBMS source code is less “portable” to different computer platforms when it generates code targeted at a specific chip set. But a more compelling reason is that the performance advantages of executing compiled machine code in the runtime engine must be balanced against the time required to produce this code in the code generator. In environments where there are many small, short queries, the cost of producing executable code may be greater than the runtime benefit. In environments with I/O-bound queries (where the cost of evaluating the query is dominated by the time it takes to read records from disk), the advantage of saving a few CPU cycles by running machine code may be irrelevant, or too small to be worth the bother.
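The trade-off can be made concrete with a simple cost model. This is a sketch under stated assumptions: costs are linear in the number of rows, and every figure used below (compile time, per-row CPU cost, per-row I/O cost) is an invented illustrative value, not a measurement from any system.

```python
def interpreted_cost(rows, per_row_interp, io_per_row=0.0):
    """Total time to run a query in the interpreter (no compile step)."""
    return rows * (per_row_interp + io_per_row)


def compiled_cost(rows, compile_time, per_row_compiled, io_per_row=0.0):
    """Total time to generate machine code and then run it."""
    return compile_time + rows * (per_row_compiled + io_per_row)


def break_even_rows(compile_time, per_row_interp, per_row_compiled):
    """Row count at which compiling starts to pay for itself."""
    saving = per_row_interp - per_row_compiled
    if saving <= 0:
        return float("inf")  # compiled code is no faster per row
    return compile_time / saving


# Illustrative numbers only, all in microseconds.
COMPILE = 50_000  # assumed one-time code generation cost
INTERP = 2        # assumed CPU cost per row when interpreting
NATIVE = 1        # assumed CPU cost per row when running machine code
IO = 100          # assumed disk cost per row for an I/O-bound query

threshold = break_even_rows(COMPILE, INTERP, NATIVE)

# Small query: compilation overhead dominates.
small_interp = interpreted_cost(1_000, INTERP)
small_native = compiled_cost(1_000, COMPILE, NATIVE)

# Large I/O-bound query: compiling wins, but only by a sliver.
big_interp = interpreted_cost(1_000_000, INTERP, IO)
big_native = compiled_cost(1_000_000, COMPILE, NATIVE, IO)
relative_saving = (big_interp - big_native) / big_interp
```

Under these assumed figures, a query must touch about 50,000 rows before compilation breaks even, so a 1,000-row query is far cheaper to interpret. For the million-row I/O-bound query, compiled code does finish sooner, but the saving is under one percent of total time because disk reads dominate.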