The central processing unit (CPU) cost of query execution is becoming more critical in modern database systems, particularly as slow disk accesses are largely avoided through the adoption of solid-state drive (SSD) devices. Just-in-time (JIT) compilation is one approach used to improve CPU performance in a database system. JIT compilation refers to a compilation scheme in which code is compiled during execution of a program, at runtime, rather than prior to execution. By producing query-specific machine code at runtime, the overhead of traditional interpretation can be avoided.
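The contrast between interpretation and query-specific code generation can be illustrated with a minimal sketch. It is an analogy only: Python's built-in `compile` and `exec` produce bytecode rather than machine code, and the expression tree, the column names (`price`, `qty`), and the helper functions are hypothetical names chosen for illustration, not part of any system named in this document.

```python
# Hypothetical filter predicate (price * qty) > 100, held as an expression tree.
EXPR = (">", ("*", ("col", "price"), ("col", "qty")), ("const", 100))

OPS = {"*": lambda a, b: a * b, ">": lambda a, b: a > b}


def interpret(node, row):
    """Generic interpretation: the tree is walked again for every row."""
    kind = node[0]
    if kind == "col":
        return row[node[1]]
    if kind == "const":
        return node[1]
    return OPS[kind](interpret(node[1], row), interpret(node[2], row))


def to_source(node):
    """Translate the tree into a Python expression specific to this query."""
    kind = node[0]
    if kind == "col":
        return f"row[{node[1]!r}]"
    if kind == "const":
        return repr(node[1])
    return f"({to_source(node[1])} {kind} {to_source(node[2])})"


def compile_predicate(node):
    """Generate and compile a query-specific function once, at runtime."""
    src = f"def predicate(row):\n    return {to_source(node)}\n"
    namespace = {}
    exec(compile(src, "<generated>", "exec"), namespace)
    return namespace["predicate"]


rows = [{"price": 10, "qty": 5}, {"price": 30, "qty": 4}]
predicate = compile_predicate(EXPR)            # one-time compilation cost
interpreted = [interpret(EXPR, r) for r in rows]
compiled = [predicate(r) for r in rows]        # no per-row tree walk
```

Both paths return the same results. The generated function pays a one-time compilation cost and then avoids the per-row dispatch on node kinds that the interpreter performs.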
The effectiveness of JIT compiled query execution depends on the cost of the JIT compilation and the quality of the compiled code. Analytic tools such as Netezza and ParAccel dynamically generate C code for JIT compiled query execution, while tools such as Cloudera Impala and VitesseDB use a low level virtual machine (LLVM) intermediate representation (IR) builder to generate LLVM IR for JIT compiled query execution. In each case, online analytical processing (OLAP) workloads are targeted, because such workloads typically involve large data sizes that benefit from JIT compiled query execution. Workloads with a smaller data size, however, often suffer a performance degradation under JIT compiled execution, because the compilation cost is not amortized over enough data. Thus, the best approach often depends on the data size of the workload weighed against the JIT compilation cost. Accordingly, a challenge with JIT compiled query execution is to generate efficient code while also reducing the JIT compilation cost for a specific query.
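The trade-off between data size and compilation cost can be sketched as a simple break-even calculation. The linear cost model, the function names, and the numeric costs below are illustrative assumptions, not measurements from any of the systems named above.

```python
def break_even_rows(compile_cost_us, interp_cost_per_row_us, jit_cost_per_row_us):
    """Row count at which JIT compiled execution starts to pay off.

    Assumed linear model (hypothetical):
      interpreted total = rows * interp_cost_per_row_us
      JIT total         = compile_cost_us + rows * jit_cost_per_row_us
    """
    saving_per_row = interp_cost_per_row_us - jit_cost_per_row_us
    if saving_per_row <= 0:
        # Compiled code is no faster per row, so compilation never pays off.
        return float("inf")
    return compile_cost_us / saving_per_row


def should_jit(num_rows, compile_cost_us, interp_cost_per_row_us, jit_cost_per_row_us):
    """Choose JIT only when the workload is larger than the break-even point."""
    threshold = break_even_rows(
        compile_cost_us, interp_cost_per_row_us, jit_cost_per_row_us
    )
    return num_rows > threshold


# Hypothetical costs in microseconds: 60 ms to compile, 1.0 per row
# interpreted, 0.25 per row compiled.
COSTS = (60_000, 1.0, 0.25)

threshold = break_even_rows(*COSTS)           # 80000.0 rows
small_query = should_jit(1_000, *COSTS)       # False: compilation dominates
large_query = should_jit(10_000_000, *COSTS)  # True: cost is amortized
```

Under these assumed costs, a query touching fewer than 80,000 rows runs faster interpreted, which is consistent with the degradation described above for smaller workloads. Lowering the compilation cost moves the break-even point toward smaller workloads.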