The present invention relates to computerized database systems and in particular to a system that denormalizes relational data tables for faster query processing.
Database systems combine computer hardware and software to efficiently manage access to large amounts of data held in a database structure. An example database structure may hold one or more logical tables (relations), each with data elements organized in logical rows and columns. Each column normally defines a common category (type or attribute) for its data elements, and each row (tuple) links related data elements of different types.
For example, a database of customers might have multiple rows each associated with a different customer and multiple columns holding different types of information about the customers such as: customer name, customer gender, customer address and the like.
Access to a database structure is commonly done by means of one or more “queries”, each of which may define a query condition (for example, as in an SQL WHERE clause). The query is applied to multiple data elements of one or more columns across multiple database rows to identify the data elements relevant to the query. This application of the query to the data elements is termed a “scan” and provides a query result identifying data elements (for example, customer names) of the selected rows meeting the query condition. In the above example, a query might seek the customer names of all customers having a particular gender and would return a query result listing the customer names from the matching rows.
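By way of illustration only, a scan of this kind can be sketched in Python as follows. The table contents, the column names and the helper function `scan` are hypothetical and are chosen solely to mirror the customer example above; they are not taken from any particular database system.

```python
# Hypothetical customer table: each dict is one row (tuple), each key one column.
customers = [
    {"name": "Alice", "gender": "F", "address": "12 Oak St"},
    {"name": "Bob", "gender": "M", "address": "34 Elm St"},
    {"name": "Carol", "gender": "F", "address": "56 Pine St"},
]


def scan(rows, predicate, column):
    """Visit every row, keep the rows meeting the query condition,
    and return the requested column of each kept row."""
    return [row[column] for row in rows if predicate(row)]


# Roughly equivalent to: SELECT name FROM customers WHERE gender = 'F'
result = scan(customers, lambda row: row["gender"] == "F", "name")
print(result)
```

The cost of such a scan grows with the number of rows and the amount of data examined per row, which is why the amount of stored data bears directly on query speed.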
Complex databases often use a relational database structure, which distributes data elements among multiple tables linked by keys. This division reduces the amount of data that needs to be stored in a given table by moving data repeatedly referenced by table rows to a second table. For example, in the above customer database, information about products purchased by customers, such as the country of origin or the price, may be stored in a separate table to be referenced whenever different customers purchase the same product. In this way the customer database table does not need to repeat detailed information about each product. Relational database structures, by substantially decreasing the amount of data that must be stored, can increase the speed of a query scan by reducing the amount of data that must be reviewed in the scan.
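As a minimal sketch of such a keyed, two-table arrangement, the following Python example uses the standard `sqlite3` module with an in-memory database. The table names `products` and `purchases`, their columns and the sample values are assumptions made for illustration and do not appear in the description above.

```python
import sqlite3

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Product details are stored once, identified by a key.
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    country    TEXT,
    price      REAL
);
-- Each purchase row holds only the key, not the product details.
CREATE TABLE purchases (
    customer_name TEXT,
    product_id    INTEGER REFERENCES products(product_id)
);
INSERT INTO products VALUES (1, 'Japan', 19.99), (2, 'Germany', 5.50);
INSERT INTO purchases VALUES ('Alice', 1), ('Bob', 1), ('Carol', 2);
""")

# The key links the two tables when the query is evaluated.
rows = conn.execute("""
    SELECT p.customer_name, pr.country, pr.price
    FROM purchases AS p
    JOIN products AS pr ON p.product_id = pr.product_id
    WHERE pr.country = 'Japan'
    ORDER BY p.customer_name
""").fetchall()
print(rows)
conn.close()
```

In this sketch the country of origin and price of product 1 are stored once even though two customers purchased it; the purchase rows carry only the key `product_id`.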
It is generally known that considerable speed gains in database scanning can be obtained by placing the database structure entirely or in part within high-speed random access memory. Relational database systems, by reducing the total amount of stored data, can make such “main memory” scans possible.