The present invention relates to database systems and more particularly to online analytical processing (OLAP) systems.
Relational databases have become the dominant database technology for providing online transactional processing (OLTP) capabilities that are essential for businesses to keep track of their sales, transactions, and other affairs. Designed for efficient selection, storage, and retrieval of data, relational databases are ideal for housing gigabytes of detailed data in data warehouses. As a result, data warehouses contain a wealth of data that can yield, after analysis, critical information about a business, such as sales trends and product line profitability. This information can provide a crucial edge in an increasingly competitive marketplace.
The challenge is in deriving answers to business questions from the available data, so that decision makers at all levels in the business can respond quickly to changes in the business climate. While a standard transactional query might ask, xe2x80x9cWhen did order 84305 ship?xe2x80x9d, an analytical query might ask, xe2x80x9cHow do sales in the Southwestern region for this month compare with plan or with sales a year ago?xe2x80x9d The first question involves simple data selection and retrieval easily performed by relational database systems; however, the second question involves inter-row calculations, time series analysis, and access to aggregated historical and current data.
The data processing required to answer analytical questions is fundamentally different from the data processing required to answer transactional questions. This data processing is known as online analytical processing (OLAP). An OLAP application is a computer program designed to access the company""s raw data, process the data, and present the data to the company""s data analysts. Because of the sophisticated nature of this analysis, OLAP applications can be quite difficult to develop and implement.
For example, end users, such as the company""s analysts, think in the business terms, but the relational databases that store information for the data warehouses deal with such low-level technical concepts as tables and columns. Current data access methods for OLAP applications provide on a concrete view on the data in the data warehouse, forcing users to learn too much about the internal design of the database system that houses the data to analyzed.
As another example, the process of analyzing data is iterative. Users begin with a simply query, examine the result, modify the query slightly to highlight an element of interest, and then examine the new results of the modified query. This process of incremental modification is often repeated for many cycles. However, data access method in common use today employ textual query languages such as SQL (Structured Query Language), which requires the entire query to be resubmitted each time even though the bulk of the query is unchanged but with minor variations.
Furthermore, a logical starting point for analysis very often requires a large amount of data, e.g. millions of rows, which makes retrieval of the entire result set impractical, especially over a computer network. As a result, present OLAP applications are forced to constrain each query submitted to the data warehouse to limit the number of possible rows that can be returned to a manageable level. This approach also requires the submission of many similar queries, differing only in minor variations, to the data warehouse.
Another manner in which the business model differs from the relational database model is in the representation of the result sets obtained from the data warehouse in response to queries. Relational databases typically present a tabular representation of the result set, but business analysts prefer to see the data in a cross-tab representation, in which the contrasts and relationships between the factors of the data are more easily seen.
To the extent that current software addresses these issues, they are dealt with in the application tier. According, the OLAP application itself is programmed to model the business and translate between the business model and the physical database concepts as necessary. In this approach, the OLAP application has to be able to generate hundreds of similar queries and cache results. Furthermore, the results have to be formatted into a visual representation desired by the business analyst.
Programming these tasks results in very sophisticated and bulky OLAP applications, which are consequently unsuitable for any platform but the most powerful and expensive. Furthermore, bulky application logic is difficult and expensive to maintain and upgrade, making it difficult for business to adapt quickly to changing market conditions. As a result, there is an acute need for an OLAP solution that permits inexpensive, lightweight OLAP applications that can efficiently handle the incremental refinement of queries, management of large result sets, and lessen the mismatch between the business model and the relational database model.
The present invention addresses these and other vital needs by employing several features singly and in combination, including an object-oriented query representation, deployment of an OLAP data server apart from the OLAP application, and a multidimensional virtual cursor.
One aspect of the invention involves the specification and construction of queries in the OLAP application in an object-oriented representation rather than by a textual query such as SQL. The query objects resemble the business model of the OLAP application rather than the relational database model of the data warehouse. The execution of the query objects can be performed by a separate data server for providing the OLAP services, with the OLAP application holding remote references to the query objects. More specifically, the objects represent the query state, and the OLAP application refines the queries by invoking methods on the query objects. These actions cause corresponding methods to be invoked on the data server objects remotely. In this configuration, the data server can easily determine how the query has been altered and thus perform any conversions necessary between the object representation and the relational database""s own query representation without encumbering the OLAP application itself.
Another aspect of the invention relates to a multidimensional, virtual cursor, which presents a cross tab representation of the result set from executing query instead of a flat table. The OLAP application specifies the cursor size and then navigates to the cells that are to be displayed by specifying positions (such as Cartesian coordinates) on the edges of the multidimensional result set. Furthermore, the OLAP application is provided only with the cells that are really needed, which can be a tiny subset of the entire logical result set embodied by the query. In yet another aspect, subtransactions are employed to facilitate a what-if analysis and other kinds of query refinement, by allowing portions of queries to be selectively rolled back.
Accordingly, a system and method for analysis of data is described in which the complex task of translated between a business model and the relational database schema is performed by a data server. This approach greatly simplifies the responsibilities of the OLAP application, enabling OLAP application developers to focus on the business problem itself, thereby to deliver better applications faster. Because this translation is common to all application that use the analytic capabilities of the data server, the effort going into implementing the data server is leveraged effectively across many OLAP applications. Query refinement methods that execute in the data server reflect explicitly how a query is refined. Thus, the data server, rather than the application, chooses how to efficiently manage the query refinements. Furthermore, the implementation in the data sever is close to the data. Calculations are performed within the data server, and only the requested summary-level data need be returned to the OLAP application.
Still other aspects, features, and advantages of the present invention are readily apparent from the following detailed description, simply by illustrating a number of particular embodiments and implementations, including the best mode contemplated for carrying out the present invention. The present invention is also capable of other and different embodiments, and its several details can be modified in various obvious respects, all without departing from the spirit and scope of the present invention. Accordingly, the drawing and description are to be regarded as illustrative in nature, and not as restrictive.