Data warehouses typically store substantial quantities of information, and further strive to facilitate expeditious querying and efficient retrieval of such stored data. In general, a data warehouse is a nonvolatile repository that can store a substantial amount of data, such as historical and/or current information—wherein the historical data can correspond to past transactional or operational information.
Accordingly, data warehousing and associated processing mechanisms (e.g., On-Line Analytical Processing (OLAP), Relational OLAP (ROLAP), Multidimensional OLAP (MOLAP), and Hybrid OLAP (HOLAP)) are widespread technologies employed to support business decisions and data analysis. Data warehouses are populated at regular intervals with data from one or more heterogeneous data sources, for example from multiple transactional or enterprise resource planning (ERP) systems. The aggregation of data can provide a consolidated view of an organization from which valuable information are derived.
Moreover, databases associated with such systems can organize and store data in a plurality of multidimensional tables, wherein the tables contain a series of rows (e.g., also referred to as records.) Each row provides particular information about a particular item such as a customer. Rows can further be divided into sections called columns. The intersection between a row and column is typically referred to as a field. Moreover, each field provides data or information related to a particular item. The columns specify the particular type of data provided in each field. For example, a table can be established for purchases of a particular product. Such table can include a plurality of rows corresponding to individual customers, and several columns for first name, last name, address, state, zip code, number of products purchased, price, date, and the like.
Database engines or management systems provide systems and mechanisms to retrieve and manipulate data from database tables upon specification of a query by a user. For example, OLAP describes category of technologies or tools utilized to retrieve data from a data warehouse. These tools can extract and present multidimensional data from different points of view to assist and support managers and other individuals examining and analyzing data. The multidimensional data model is advantageous with respect to OLAP as it allows users to easily formulate complex queries, and filter or slice data into meaningful subsets, among other things.
Multidimensional databases and other databases modeled as multidimensional employ a multidimensional query language such as MDX (MultiDimensional expressions) to retrieve and interact with data. More specifically, data can be retrieved utilizing the query language to define or describe specifically the data to be retrieved, for example employing select, where and from clauses. Subsequently, an execution engine receives the query definition and returns the requested data.
Typically, queries are processed in an operator tree, wherein data flows through such operator tree to calculate an answer for the query. Accordingly, queries against data warehouse and/or decision support systems commonly become complex, and hence require prolonged processing time for a vast amount of data. Moreover, the earlier that non-qualifying data are eliminated in such processes, the more efficient the complex queries can be executed.
In such schemas, a plurality of large fact tables (e.g., sales details) can be connected with multiple dimension tables (e.g., product information, sales region, and the like). Typical queries against such schemas consist of multiple joins of the fact table with several dimension tables. Moreover, database administrators typically employ such tables (and cubes in OLAP environments) to generate a report explaining shifts in total sales for different periods. As such, identifying data slices that can facilitate generation of accurate reports for tracking of transactions can be challenging.