Superscalar central processing units (CPUs), which can execute more than one instruction per clock cycle, are increasingly common in computing systems. Unlike purely pipelined architectures, superscalar architectures include multiple, redundant functional units that can operate on many instructions in parallel. Superscalar architecture and pipelining can be used together to provide even greater CPU efficiency. Superscalar processing depends in part on the processor being provided with (or detecting on its own) instruction streams that are intrinsically parallel, meaning that the stream contains operations that operate on independent sets of data, or that are arranged in such a way that the order of execution between the operations will not lead to different results. This allows the processor to perform multiple operations at the same time.
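The notion of an intrinsically parallel instruction stream can be illustrated with a minimal sketch in C. The function names and the choice of four accumulators are hypothetical and are used only for illustration. The first function forms a single dependency chain, because each addition needs the result of the previous one. The second function spreads the same additions across four accumulators that do not depend on one another, so a superscalar processor may be able to issue them to separate functional units. The sketch assumes that the sums do not overflow. Whether the second form actually runs faster depends on the compiler and the hardware, and a compiler may perform a similar transformation on its own.

```c
#include <stddef.h>

/* Hypothetical example: every addition depends on the previous value of
   `total`, so the additions form one long dependency chain. */
long sum_single_chain(const long *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Hypothetical example: four accumulators that do not depend on each
   other. The four additions in the loop body are independent operations
   that a superscalar CPU may be able to execute in the same cycle. */
long sum_four_chains(const long *values, size_t count)
{
    long t0 = 0, t1 = 0, t2 = 0, t3 = 0;
    size_t i = 0;

    /* Main loop: process four elements per iteration. */
    for (; i + 4 <= count; i += 4) {
        t0 += values[i];
        t1 += values[i + 1];
        t2 += values[i + 2];
        t3 += values[i + 3];
    }

    /* Tail loop: handle the remaining zero to three elements. */
    for (; i < count; i++) {
        t0 += values[i];
    }

    /* Integer addition is associative (absent overflow), so combining
       the partial sums in any order gives the same result. */
    return t0 + t1 + t2 + t3;
}
```

Both functions return the same value for the same input; only the dependency structure of the instruction stream differs.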
Most database systems were implemented before superscalar CPUs started to dominate the market. Superscalar CPUs process data faster, provided that there are enough independent instructions inside small instruction windows (e.g., on the order of up to ~100 instructions). In such cases, superscalar processors can detect enough independent operations to utilize the multiple available CPU execution units. Independent operations are those with no data or control flow dependencies between them. Database systems often rely on optimizations that are not efficient on superscalar architectures. For example, a database implementation may include long functions with many conditional branches, which introduce control flow dependencies between operations.
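As a minimal sketch of how a conditional branch introduces a control flow dependency, consider counting the values in an array that exceed a threshold. The function names and the filter itself are hypothetical and are not taken from any particular database system. In the first version, whether the increment executes depends on the outcome of a branch for every element. In the second version, the comparison result (0 or 1 in C) is added directly, so the loop body contains no data-dependent branch. A compiler may generate similar machine code for both forms, so this is an illustration of the idea rather than a guaranteed optimization.

```c
#include <stddef.h>

/* Hypothetical example: one conditional branch per element. The
   increment is control-dependent on the comparison. */
size_t count_above_branching(const int *values, size_t count, int threshold)
{
    size_t matches = 0;
    for (size_t i = 0; i < count; i++) {
        if (values[i] > threshold) {
            matches++;
        }
    }
    return matches;
}

/* Hypothetical example: the comparison yields 0 or 1, which is added
   unconditionally, so the loop body has no data-dependent branch. */
size_t count_above_branch_free(const int *values, size_t count, int threshold)
{
    size_t matches = 0;
    for (size_t i = 0; i < count; i++) {
        matches += (size_t)(values[i] > threshold);
    }
    return matches;
}
```

Both functions compute the same count; the second simply expresses the work as data flow rather than control flow.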
To take advantage of superscalar CPUs, database systems need to process the majority of data-warehouse-specific data values more efficiently. Current internal data representations do not lend themselves to efficient superscalar processing. Database systems provide many data types, such as integers, strings, floats, binary blobs, and so forth, each of which may have a different internal data structure or other representation. Some of these are more appropriate for superscalar processing than others. Code paths have to be constructed with care to ensure very efficient processing and a high density of independent CPU instructions, which is often not the case for database systems running on superscalar processors. Various specialized data warehouse engines have been built to take advantage of superscalar CPUs, such as MonetDB/X100 and Microsoft Analysis Services. However, these engines are not generic relational database management system (RDBMS) engines, and they provide advantages only in limited situations that do not address superscalar issues in the database system core.
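The effect of internal data representation on code paths can be sketched as follows. The types and function names below are hypothetical and do not describe the representation used by any particular database system. In the first function, each value carries a type tag, so the loop must inspect the tag and branch before every addition. In the second function, the column is assumed to hold only 64-bit integers in a contiguous array, so the loop body is a single uniform operation with no per-value type check.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tagged representation: each value records its own type. */
typedef enum { VALUE_INT, VALUE_FLOAT } value_kind;

typedef struct {
    value_kind kind;
    union {
        int64_t as_int;
        double  as_float;
    } data;
} tagged_value;

/* Summing tagged values requires a type check and branch per value. */
double sum_tagged(const tagged_value *values, size_t count)
{
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        if (values[i].kind == VALUE_INT) {
            total += (double)values[i].data.as_int;
        } else {
            total += values[i].data.as_float;
        }
    }
    return total;
}

/* Hypothetical flat representation: the column is assumed to contain
   only 64-bit integers, so no per-value type dispatch is needed. */
int64_t sum_int_column(const int64_t *column, size_t count)
{
    int64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += column[i];
    }
    return total;
}
```

The flat form gives the compiler and the processor a short, regular loop body, which is the kind of code path with a high density of independent instructions that the paragraph above calls for.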