1. Technical Field
The present disclosure relates to data processing, and in particular, to analytic data processing.
2. Description of the Related Art
Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Data processing is a diverse field. Two types of data processing are transactional data processing and analytical data processing. Transactional data processing concerns transactional data, which represents the individual transactions that the transactional data processing system is managing; examples include order processing, invoice processing, etc. Transactional data processing is often performed using a database management system (DBMS)—more specifically, a row-oriented database system—which is often referred to as a “traditional” DBMS (in order to distinguish it from a non-traditional DBMS, which is not a row-oriented database system). Transactional data processing may be referred to by the acronym OLTP (online transaction processing).
Analytical data processing concerns analytical data, which represents collections of transactional data; examples include total value of sales over a time period, average daily balance of accounts receivable, etc. Analytical data processing may be performed using a traditional DBMS in which the data is periodically loaded (often as summary data) from a transactional data processing system in a process referred to as extraction, transformation and loading (ETL). Often the ETL process is performed once per day, e.g., at night when the transactional processing system is lightly loaded. Analytical data processing systems may be referred to as data warehouses (DWs), business warehouses (BWs, for instances in which “data” is implicit or understood), business intelligence (BI) systems, data marts (DMs), etc. Analytical data processing may be referred to by the acronym OLAP (online analytical processing).
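As an illustration only, the periodic loading of summary data described above can be sketched as follows. The record layout, the function names (extract, transform, load) and the sample values are hypothetical and are not taken from any particular system; a dictionary stands in for the analytical data store.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactional records (one per order); all values are made up.
transactions = [
    {"order_id": 1, "day": date(2024, 1, 1), "amount": 120.0},
    {"order_id": 2, "day": date(2024, 1, 1), "amount": 80.0},
    {"order_id": 3, "day": date(2024, 1, 2), "amount": 50.0},
]


def extract(source):
    """Extract: take a snapshot of the transactional records."""
    return list(source)


def transform(rows):
    """Transform: summarize individual transactions into total sales per day."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["day"]] += row["amount"]
    return dict(totals)


def load(summary, warehouse):
    """Load: write the summary data into the analytical store (a dict here)."""
    warehouse.update(summary)
    return warehouse


# One run of the periodic process, e.g. a nightly job.
warehouse = load(transform(extract(transactions)), {})
```

In this sketch the analytical store holds only the per-day totals, so any transaction recorded after the run is not reflected until the process is run again.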
One reason why data processing is split across two systems is that the analytical queries used for analytical data processing often “lock” all the data that they are operating upon, to prevent that data from changing mid-query; such locks may slow down the system when it performs additional transactions. One downside of splitting data processing across two systems is that, due to the periodic nature of the ETL process, the analytical data may be out of date until the next load is performed.
In an analytical data processing system, a core analysis concept is the analysis cube (also called a “multidimensional cube” or a “hypercube”). An analysis cube consists of numeric facts, called measures, which are categorized by dimensions. The cube metadata is typically created from a star schema or snowflake schema of tables in a relational database. Measures are derived from the records in the fact table and dimensions are derived from the dimension tables. Each measure can be thought of as having a set of labels, or metadata, associated with it. A dimension is what describes these labels; it provides information about the measure.
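As an illustration only, the relationship between a fact table, dimension tables and the resulting cube can be sketched as follows. The table names, keys and values are hypothetical, and the cube is represented as a plain dictionary keyed by dimension labels rather than by any particular OLAP engine's data structure.

```python
from collections import defaultdict

# Hypothetical dimension tables, keyed by identifier.
product_dim = {1: {"category": "Books"}, 2: {"category": "Games"}}
region_dim = {10: {"country": "DE"}, 20: {"country": "US"}}

# Hypothetical fact table: each record references the dimension tables
# and carries one numeric fact (the measure "revenue").
fact_sales = [
    {"product_id": 1, "region_id": 10, "revenue": 100.0},
    {"product_id": 1, "region_id": 20, "revenue": 40.0},
    {"product_id": 2, "region_id": 10, "revenue": 70.0},
    {"product_id": 1, "region_id": 10, "revenue": 30.0},
]


def build_cube(facts):
    """Aggregate the measure into cells labeled by (category, country)."""
    cube = defaultdict(float)
    for fact in facts:
        category = product_dim[fact["product_id"]]["category"]
        country = region_dim[fact["region_id"]]["country"]
        cube[(category, country)] += fact["revenue"]
    return dict(cube)


def roll_up(cube, axis):
    """Sum the measure over every dimension except the one at position `axis`."""
    totals = defaultdict(float)
    for labels, value in cube.items():
        totals[labels[axis]] += value
    return dict(totals)


cube = build_cube(fact_sales)
revenue_by_category = roll_up(cube, 0)
revenue_by_country = roll_up(cube, 1)
```

Each cell of the sketched cube is a measure value, and the tuple of labels that identifies the cell is supplied by the dimensions.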
The analysis cube is one of a number of data structures that are used to create data models (also referred to as just “models”). Models are generally used to analyze data, including ways to view data, ways to select data, etc. In general, a number of models are created in a particular analytical data processing system; these models correspond to frequently-performed data analysis operations on the particular data stored by the analytical data processing system. For example, the models used by Company X will differ from the models used by Company Y due to the differences in the structures of the underlying data of both companies.
Recent developments in in-memory technology have led to the use of in-memory databases in analytic data processing systems in place of the traditional DBMS. An example of an in-memory data processing system is the HANA™ system from SAP AG. An in-memory data processing system may perform both transactional and analytic data processing due to the speed available from storing the data in memory (as opposed to the disk storage of non-in-memory database systems).