Computers are powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a business may have a database of employees. The database of employees may have a record for each employee, where each record includes fields designating specific properties of or information about that employee, such as, but not limited to, the employee's name, contact information, and salary.
Between the actual physical database (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about the underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added to or removed from data files, or retrieved from or updated in such files, all without user knowledge of the underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of database management systems is well known in the art.
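By way of illustration only, the following minimal sketch shows a user adding, updating, removing, and retrieving records through a DBMS interface without reference to how the data is physically stored. It uses the SQLite engine bundled with Python's standard `sqlite3` module rather than the system described herein; the `employees` table, its columns, and the sample values are hypothetical and are assumed only for the example.

```python
import sqlite3

# Hypothetical employee database; table name, columns, and values are
# illustrative assumptions, not taken from any real system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, contact TEXT, salary REAL)")

# Add records: one row per employee, one column per field.
conn.executemany(
    "INSERT INTO employees (name, contact, salary) VALUES (?, ?, ?)",
    [
        ("Ada", "ada@example.com", 100000.0),
        ("Bob", "bob@example.com", 85000.0),
    ],
)

# Update a field in an existing record.
conn.execute("UPDATE employees SET salary = ? WHERE name = ?", (90000.0, "Bob"))

# Remove a record.
conn.execute("DELETE FROM employees WHERE name = ?", ("Ada",))

# Retrieve the remaining records. The caller never sees file or page layout.
rows = conn.execute(
    "SELECT name, contact, salary FROM employees ORDER BY name"
).fetchall()
conn.close()
```

In this sketch every request passes through the DBMS interface (`conn.execute`), so the storage layout could change without changing any of the statements shown.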
Increasingly, businesses base their operations on mission-critical systems that store information on server-based database systems, such as SAP® Sybase® IQ (available from SAP AG of Delaware). SAP Sybase IQ is an analytics engine based on a columnar database architecture, which stores data in columns. IQ includes a parallel bulk load engine. For non-partitioned tables, the load engine generates “final pages”, which are defined as normal data pages containing contiguous row IDs. Row ID (RID) contiguity is crucial for fast data retrieval, because the column data projection algorithm used by the system performs best when a page includes only consecutive row IDs, as opposed to row IDs with gaps. For hash or hash-range partitioned tables, however, a typical page generated by the parallel bulk load engine includes regions of row IDs from different partitions, so the row IDs on the page are not contiguous. These pages are referred to as “intermediate pages” (IPs). Intermediate pages result in degraded query performance, because the column data projection algorithm has to constantly jump between different pages.
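The distinction between final pages and intermediate pages can be illustrated with a toy model. The sketch below is not the column data projection algorithm of any actual product; it assumes that a page is simply a list of row IDs and that a projection visits row IDs in ascending order. The names `is_final_page` and `page_switches`, the page sizes, and the row ID values are hypothetical.

```python
from typing import Dict, List, Optional


def is_final_page(row_ids: List[int]) -> bool:
    """Return True when the page holds consecutive row IDs with no gaps."""
    return all(b - a == 1 for a, b in zip(row_ids, row_ids[1:]))


def page_switches(pages: List[List[int]], scan: List[int]) -> int:
    """Count how often a scan over `scan` (in order) moves to another page."""
    # Map each row ID to the index of the page that stores it.
    owner: Dict[int, int] = {
        rid: index for index, page in enumerate(pages) for rid in page
    }
    switches = 0
    current: Optional[int] = None
    for rid in scan:
        if current is not None and owner[rid] != current:
            switches += 1
        current = owner[rid]
    return switches


# Two partitions with assumed row ID ranges 1-4 and 101-104.
scan = [1, 2, 3, 4, 101, 102, 103, 104]

# Final pages: each page holds only contiguous row IDs.
final_pages = [[1, 2, 3, 4], [101, 102, 103, 104]]

# Intermediate pages: each page mixes regions from both partitions.
intermediate_pages = [[1, 2, 101, 102], [3, 4, 103, 104]]

final_cost = page_switches(final_pages, scan)                 # 1 switch
intermediate_cost = page_switches(intermediate_pages, scan)   # 3 switches
```

Under these assumptions the same scan changes pages once with final pages and three times with intermediate pages, which is the kind of page jumping that degrades projection performance.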
The features and advantages of embodiments of the invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings. In the drawings, like reference numbers generally indicate identical, functionally similar, and/or structurally similar elements. Generally, the drawing in which an element first appears is indicated by the leftmost digit(s) in the corresponding reference number.