The present invention relates to data processing field, more specifically relates to a method and apparatus for generating cube metadata and generating query statements based on an enhanced star schema.
In the data processing field, online analytical processing (OLAP) is an approach to swiftly answer a multi-dimension analytical (MDA) query. OLAP is a broader classification of business intelligence, which also encompasses relational reporting and data mining. Databases configured for OLAP use a multidimensional data model, allowing for complex analytical and ad-hoc queries with a rapid execution time.
The core of any OLAP system is metadata schemas of OLAP cubes. With reference now to FIG. 13, a cube metadata schema of an OLAP cube is generally created according to star schema or snowflake schema of tables in the relational database, wherein measures are extracted according to records in fact tables and dimensions are extracted according to dimension tables.
FIG. 14 gives a specific example of the star schema. The star schema is a common multi-dimension data schema for presenting information and data in data warehouse. With reference now to FIG. 14, the star schema often consists of a fact table and some dimension tables around it.
Dimensions are literary descriptions of some point of view to business activities, and generally answers questions such as when, what product, where it is sold to and the like. Dimension data are stored in dimension tables. A dimension table may be divided into two parts: one part functions as ID of a primary key; the other is a set of characteristics fields of the dimension.
In addition, facts focus on data analysis in business activities and generally answers questions such as the unit price of a product, how many it is sold and the like. Fact data is stored in fact tables. A fact table may be divided into two parts: one part is foreign key fields pointing to primary keys of all dimension tables; the other part is measure fields.
With the ever-increasing complexity of the business, the star schema gradually faces a very tough challenge—factless fact table. A factless fact table is a fact table comprising no fact; that is, comprising no measure, and can be used for modeling many-to-many relationship among dimensions or capturing events.
Factless fact tables become more and more important, but its existence prevents independent data slice and dice from executing because a factless fact table is not taken into consideration in the star schema. At present, some methods, for example, Boolean Column, Multiple Columns, Bridge Table and the like can be adopted to convert factless fact tables when modeling for the purpose of being used in star schema indirectly. However, the conversion of a factless fact table using these methods causes some problems, such as more manual work needs to be introduced in conversion, dependence on knowledge of IT/business modeling staff, meanwhile additional resources are needed for storage and processing time of a query is increased at the database end.