Response time and throughput are common measures used to evaluate the performance of a relational database system. In general, response time is the amount of time that the system takes to respond to a user instruction such as a query, and throughput is the volume of user activity that the system is able to handle over a period of time. For example, a relational database system that takes a relatively long time to retrieve and output data in response to a user query may be considered to have relatively poor response time, while a system that responds quickly may be considered to have relatively good response time. Similarly, a relational database system that is able to complete only a relatively small number of user queries over a period of time may be considered to have relatively poor throughput, while a system that completes a large number of queries in the same period may be considered to have relatively good throughput.
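The two measures can be illustrated with a minimal sketch. It assumes an in-memory SQLite database accessed through Python's standard `sqlite3` module; the `orders` table, the `measure` helper, and the repeated query are hypothetical and chosen only for illustration. Response time is taken here as the mean time per query, and throughput as queries completed per second of wall-clock time.

```python
import sqlite3
import time


def measure(conn, queries):
    """Run each query once and return (mean response time in s, queries per s)."""
    latencies = []
    start = time.perf_counter()
    for sql in queries:
        t0 = time.perf_counter()
        # fetchall() is inside the timed region so that retrieving and
        # outputting the rows counts toward the response time.
        conn.execute(sql).fetchall()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    mean_response = sum(latencies) / len(latencies)
    throughput = len(queries) / elapsed
    return mean_response, throughput


# Hypothetical workload: 50 repetitions of one aggregate query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(float(i),) for i in range(1000)],
)
mean_response, throughput = measure(conn, ["SELECT SUM(amount) FROM orders"] * 50)
print(f"mean response time: {mean_response:.6f} s")
print(f"throughput: {throughput:.1f} queries/s")
```

Because this sketch issues its queries one at a time from a single client, throughput is roughly the reciprocal of the mean response time. In a system serving many concurrent users the two measures can diverge, which is why they are treated as separate measures.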
Because modern relational database systems often store massive and complex datasets, the response time and throughput of these systems can vary dramatically based on the way that data is stored, organized, and accessed. For instance, the response time of a relational database system may be unacceptably long where data is organized such that each query requires the system to scan through vast amounts of non-requested data in order to retrieve requested data, or where data is organized such that each query requires the system to perform several logical operations to join, aggregate, or otherwise manipulate data that is dispersed throughout the system. Similarly, the throughput of a relational database system may be unacceptably low where data is stored and accessed such that successive queries often cause resource conflicts such as memory bus congestion.
In an effort to improve the performance of relational database systems, researchers have proposed a variety of techniques for storing and organizing data such that common queries can be completed efficiently without scanning through volumes of non-requested data or performing costly logical operations such as joins and aggregations. Examples of these techniques include storing related data in independent tables to limit the amount of data that has to be scanned in response to each query, and maintaining frequently accessed data in readily accessible materialized views or join indexes established by a database administrator.
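The idea of maintaining frequently accessed data in precomputed form can be sketched as follows. This is an assumption-laden illustration, not any particular system's mechanism: SQLite has no native materialized views, so an ordinary table built with `CREATE TABLE ... AS SELECT` stands in for one, and the `customers`, `orders`, and `sales_by_region` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'east'), (2, 'west'), (3, 'east');
INSERT INTO orders (customer_id, amount)
    VALUES (1, 10.0), (2, 5.0), (3, 2.5), (1, 1.5);
""")

# The "costly" query: a join followed by an aggregation.
JOIN_QUERY = """
SELECT c.region AS region, SUM(o.amount) AS total
FROM orders o JOIN customers c ON c.id = o.customer_id
GROUP BY c.region
ORDER BY c.region
"""

# Stand-in for a materialized view: store the query result as a table.
conn.execute("CREATE TABLE sales_by_region AS " + JOIN_QUERY)

# Later reads avoid the join and aggregation entirely.
READ_PRECOMPUTED = "SELECT region, total FROM sales_by_region ORDER BY region"
live = conn.execute(JOIN_QUERY).fetchall()
precomputed = conn.execute(READ_PRECOMPUTED).fetchall()
print(live == precomputed)

# The stored copy is redundant data: after a new order arrives it no
# longer matches the base tables until someone refreshes it.
conn.execute("INSERT INTO orders (customer_id, amount) VALUES (2, 4.0)")
live_after = conn.execute(JOIN_QUERY).fetchall()
stale = conn.execute(READ_PRECOMPUTED).fetchall()
print(live_after == stale)
```

The first comparison succeeds and the second does not, which shows both sides of the approach: reads become simple lookups, but the precomputed copy duplicates stored data and has to be kept in step with the base tables.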
Although many of the proposed techniques provide relatively efficient database performance, these techniques still tend to suffer from a variety of shortcomings such as increased administrative overhead, excessive redundancy of data storage, and some remaining requirement to perform costly logical operations when accessing data.