Relational databases have become the dominant database technology for providing online transactional processing (OLTP) capabilities that are essential for businesses to keep track of their sales, transactions, and other affairs. Designed for efficient selection, storage, and retrieval of data, relational databases are ideal for housing gigabytes of detailed data in data warehouses. As a result, data warehouses contain a wealth of data that can yield, after analysis, critical information about a business, such as sales trends and product line profitability. This information can provide a crucial edge in an increasingly competitive marketplace.
The challenge is in deriving answers to business questions from the available data, so that decision makers at all levels in the business can respond quickly to changes in the business climate. While a standard transactional query might ask, “When did order 84305 ship?”, an analytical query might ask, “How do sales in the Southwestern region for this month compare with plan or with sales a year ago?” The first question involves simple data selection and retrieval easily performed by relational database systems; however, the second question involves inter-row calculations, time series analysis, and access to aggregated historical and current data.
The data processing required to answer analytical questions is fundamentally different from the data processing required to answer transactional questions. This data processing is known as online analytical processing (OLAP). An OLAP application is a computer program designed to access the company's raw data, process the data, and present the data to the company's data analysts. Because of the sophisticated nature of this analysis, OLAP applications can be quite difficult to develop and implement.
For example, end users, such as the company's analysts, think in the business terms, but the relational databases that store information for the data warehouses deal with such low-level technical concepts as tables and columns. Current data access methods for OLAP applications provide a concrete view on the data in the data warehouse, forcing users to learn too much about the internal design of the database system that houses the data to be analyzed.
As another example, the process of analyzing data is iterative. Users begin with a simple query, examine the result, modify the query slightly to highlight an element of interest, and then examine the new results of the modified query. This process of incremental modification is often repeated for many cycles. However, data access methods in common use today employ textual query languages such as SQL (Structured Query Language), which requires the entire query to be resubmitted each time even though the bulk of the query is unchanged but with minor variations.
Furthermore, a logical starting point for analysis very often requires a large amount of data, e.g. millions of rows, which makes retrieval of the entire result set impractical, especially over a computer network. As a result, present OLAP applications are forced to constrain each query submitted to the data warehouse to limit the number of possible rows that can be returned to a manageable level. This approach also requires the submission of many similar queries, differing only in minor variations, to the data warehouse.
Another manner in which the business model differs from the relational database model is in the representation of the result sets obtained from the data warehouse in response to queries. Relational databases typically present a tabular representation of the result set, but business analysts prefer to see the data in a cross-tab representation, in which the contrasts and relationships between the factors of the data are more easily seen.
To the extent that current software addresses these issues, they are dealt with in the application tier. Accordingly, the OLAP application itself is programmed to model the business and translate between the business model and the physical database concepts as necessary. In this approach, the OLAP application has to be able to generate hundreds of similar queries and cache the results. Furthermore, the results have to be formatted into a visual representation desired by the business analyst.
Programming these tasks results in very sophisticated and bulky OLAP applications, which are consequently unsuitable for any platform but the most powerful and expensive. Furthermore, bulky application logic is difficult and expensive to maintain and upgrade, making it difficult for business to adapt quickly to changing market conditions. As a result, there is an acute need for an OLAP solution that permits inexpensive, lightweight OLAP applications that can efficiently handle the incremental refinement of queries, management of large result sets, and lessen the mismatch between the business model and the relational database model.