1. Field of the Invention
The present invention relates generally to electronic databases. More particularly, the present invention relates to storing and retrieving data from electronic databases.
2. Description of the Related Art
One general category of application software is often referred to as a database management program or, simply, a database application. Encompassed within this general category are relational database and multidimensional database systems. Relational databases are typically transactional models optimized for transactional processing, e.g., write operations. Multidimensional databases or, in technical discussions, Online Analytical Processing (OLAP) data stores are better suited for problem analysis, e.g., reading and understanding data relationships. The OLAP paradigm is described in the white paper entitled “Providing OLAP (On-line Analytical Processing) to User-Analysts: An IT Mandate” by E. F. Codd, S. B. Codd, and C. T. Salley, published by Codd and Date, Inc., and incorporated by reference herein for all purposes.
Typically, a multidimensional database stores and organizes data in a way that better reflects how a user would want to view the data than is possible in a spreadsheet or relational database file. Multidimensional databases are generally better suited to applications that involve large volumes of numeric data and require calculations on that data, such as business analysis and forecasting.
A dimension within multidimensional data is typically a basic categorical definition of data in a database outline (discussed in greater detail below). A multidimensional database can contain several dimensions, thereby allowing analysis of a large volume of data from multiple viewpoints or perspectives. Thus, a dimension can also be described as a perspective or view of a specific dataset. A different view of the same data is referred to as an alternative dimension. A data management system that supports simultaneous, alternative views of datasets is said to be multidimensional. Using a business application as an example, dimensions are items such as TIME, ACCOUNTS, PRODUCT LINES, MARKETS, DIVISIONS, and so on. Within each dimension, there is typically a consolidation or other relationship between items.
To elaborate, FIG. 1 illustrates transaction tables in a relational database. A transaction table may represent the weekly sales of a product in a regional store. Table 100 contains data regarding the weekly sales of different size widgets in regional store A. For example, in week 1, the sales of 5 ounce (oz.) widgets were $400, and in week 3, the sales of 10 oz. widgets were $500, and so forth. Similarly, table 110 may contain data about the sales of widgets in another regional store, regional store B.
FIG. 2 illustrates how the information available in a relational database may be presented as multidimensional output using a multidimensional database. The information available in FIG. 1 is represented across dimensions of TIME and PRODUCTS in a multidimensional output 200 illustrated in FIG. 2. For example, information can be analyzed and presented across the TIME dimension to determine the total sales of all sizes of widgets sold in all regions for each week, e.g., week 1, week 2, etc. Similarly, information can be analyzed and presented across the PRODUCTS dimension, e.g., to determine the total sales of each size of product in all the regions.
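The roll-up described above can be sketched in a few lines of Python. This is only an illustration of aggregating transaction rows along the TIME and PRODUCTS dimensions, not the mechanism of any particular multidimensional database. The row layout and the helper name `total_by` are assumptions, and only the two store A figures quoted in the text ($400 and $500) are taken from the description; all other rows are invented.

```python
from collections import defaultdict

# Hypothetical rows in the style of FIG. 1: (store, week, product, sales in dollars).
# Only the store A rows for week 1 / 5 oz. ($400) and week 3 / 10 oz. ($500)
# come from the text; the store B rows are invented for illustration.
ROWS = [
    ("A", 1, "5 oz.", 400),
    ("A", 3, "10 oz.", 500),
    ("B", 1, "5 oz.", 250),
    ("B", 3, "10 oz.", 300),
    ("B", 1, "10 oz.", 150),
]


def total_by(rows, key_index):
    """Sum the sales column, grouped by the column at key_index."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[3]
    return dict(totals)


sales_by_week = total_by(ROWS, 1)     # view across the TIME dimension
sales_by_product = total_by(ROWS, 2)  # view across the PRODUCTS dimension
```

The same rows yield two different views depending only on which column is chosen as the grouping key, which is the sense in which each dimension is a perspective on one dataset.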
Given that relational databases are optimized for transactional processing and multidimensional databases are more suitable for analyzing and presenting data, it is desirable to provide for methods that allow generation of multidimensional output from relational databases. In other words, it is desirable to provide a bridge between the relational and multidimensional database models, whereby data available in a relational database can be accessed and analyzed to produce multidimensional output.
Unfortunately, conventional attempts to generate multidimensional output from a relational database have been inadequate. These attempts have, by and large, relied on ad hoc, implementation-specific approaches that require a significant amount of time and resources. For example, implementation-specific programming scripts, rule files, etc. have been developed to enable a conventional multidimensional database to access data in a relational database. Thus, accessing a particular relational database requires a significant amount of work, and accessing a different relational database requires developing still more implementation-specific programming scripts, rule files, etc.
Furthermore, with conventional approaches, every time there is a change in the relational database, more work is required to implement the change so that data can be accessed and used to generate multidimensional output. For example, if a new product, e.g., a 12 oz. widget, is added to the relational database, there is a need to implement more changes, e.g., write new or modify existing programming scripts, rule files, etc. This is, in part, attributed to the fact that in multidimensional models data can become part of the structure that represents the database. By way of example, the structure of PRODUCTS represented in FIG. 2 needs to be modified to reflect the change, e.g., the addition of a new product to the relational database. The modified structure is illustrated as structure 210 in FIG. 2.
In view of the foregoing, there is a need for an abstract modeling of data to provide improved methods for generating multidimensional output from relational databases.
Broadly speaking, the invention relates to methods, apparatus, and data structures suitable for storing and retrieving data from databases. In one aspect, the invention pertains to providing methods for accessing a source database to create an outline that can be used to generate multidimensional output suitable for presenting several aspects (dimensions) of an analytical problem. The information used to solve the analytical problem is typically maintained in the source database. The outlines created from data in the source database can be represented as Extensible Markup Language (XML) files.
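As a rough illustration of representing an outline as an XML file, the following Python sketch serializes a set of dimensions and their members with the standard `xml.etree.ElementTree` module. The element and attribute names (`outline`, `dimension`, `member`, `name`) and the function `outline_to_xml` are assumptions made for this example; the description does not specify an XML schema.

```python
import xml.etree.ElementTree as ET


def outline_to_xml(dimensions):
    """Serialize {dimension name: [member names]} into an XML string.

    The tag and attribute names used here are hypothetical.
    """
    root = ET.Element("outline")
    for name, members in dimensions.items():
        dim = ET.SubElement(root, "dimension", name=name)
        for member in members:
            ET.SubElement(dim, "member", name=member)
    return ET.tostring(root, encoding="unicode")


xml_text = outline_to_xml({
    "TIME": ["week 1", "week 2", "week 3"],
    "PRODUCTS": ["5 oz.", "10 oz."],
})
```

Because the result is plain XML text, it can be written to a file and parsed again on another system with any standard XML parser.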
In accordance with one embodiment of the present invention, a multidimensional accessing system suitable for accessing a source database is disclosed. The multidimensional accessing system includes a Meta-data manager, a Meta-data modeler, and a Meta-data outliner. The source database can be a relational database that is accessed by a variety of conceptual accessing techniques (e.g., SQL). The Meta-data manager can access the source database, as well as interact with the Meta-data modeler and the Meta-data outliner. The Meta-data modeler can be used to define one or more Meta-models. The Meta-data outliner can be used to create the Meta-outlines. As will be appreciated by those skilled in the art, the Meta-model and the Meta-outline, together, provide a mechanism for describing the semantics of relational and OLAP models in such a way that bi-directional access to physical structures in databases can be accomplished without having to rely on “hard-coded” data values.
The invention can be implemented in numerous ways, including a system, an apparatus, a method, or a computer readable medium. Several embodiments of the invention are discussed below.
As a method of accessing data from a source database to create a Meta-outline, an embodiment of the invention includes the acts of: defining an application related to one or more dimensions of data associated with the source database; and defining a Meta-model for the application, the Meta-model relating to the one or more dimensions of data associated with the source database. The Meta-outline can be used to generate multidimensional output providing a solution to a problem relating to one or more dimensions of data associated with the source database.
As a data accessing system capable of accessing a source database to create a Meta-outline which can be used to produce multidimensional output, one embodiment of the invention includes: a Meta-data modeler suitable for defining a Meta-model for the source database; a Meta-data outliner for creating a Meta-outline for the Meta-data, the Meta-outline suitable for generation of multidimensional output relating to one or more dimensions of data associated with the source database; and a Meta-data manager suitable for interacting with the Meta-data modeler and the Meta-data outliner, the Meta-data manager providing an interface which can be used to access the Meta-data modeler and the Meta-data outliner.
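One way to picture how the three components might cooperate is the minimal Python sketch below, which uses an in-memory SQLite database as a stand-in for the relational source. It is a sketch under stated assumptions rather than the disclosed implementation: the class and method names, the dictionary used as a Meta-model, the list used as a Meta-outline, and the `sales` table with its sample rows are all hypothetical.

```python
import sqlite3


class MetaDataModeler:
    """Defines a Meta-model: here, a mapping from dimension name to source column."""

    def define_model(self, table, measure, dimensions):
        return {"table": table, "measure": measure, "dimensions": dict(dimensions)}


class MetaDataOutliner:
    """Creates a Meta-outline: here, the ordered dimensions to present."""

    def create_outline(self, model, dimension_names):
        return [name for name in dimension_names if name in model["dimensions"]]


class MetaDataManager:
    """Single interface that owns the database connection and both helpers."""

    def __init__(self, connection):
        self.connection = connection
        self.modeler = MetaDataModeler()
        self.outliner = MetaDataOutliner()

    def totals(self, model, outline):
        """Generate and run one GROUP BY query per outlined dimension."""
        result = {}
        for name in outline:
            column = model["dimensions"][name]
            # Identifiers come from the trusted model, never from user input.
            query = (
                f"SELECT {column}, SUM({model['measure']}) "
                f"FROM {model['table']} GROUP BY {column}"
            )
            result[name] = dict(self.connection.execute(query).fetchall())
        return result


# Hypothetical source database with invented sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, week INTEGER, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [("A", 1, "5 oz.", 400), ("A", 3, "10 oz.", 500), ("B", 1, "5 oz.", 250)],
)

manager = MetaDataManager(conn)
model = manager.modeler.define_model(
    "sales", "amount", {"TIME": "week", "PRODUCTS": "product"}
)
outline = manager.outliner.create_outline(model, ["TIME", "PRODUCTS"])
report = manager.totals(model, outline)
```

In this sketch the model names columns rather than data values, so a row for a new product inserted into `sales` appears in the next call to `totals` without any change to the model or the outline.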
This invention has numerous advantages. One advantage is that multidimensional output can be generated efficiently from data available in a source database with a relational format. Another advantage is that this invention allows for dynamic generation of the instructions necessary to generate multidimensional output, without requiring those instructions to be pre-programmed and pre-stored. Yet another advantage is that relational-to-OLAP mapping can be accomplished by defining abstract models. Accordingly, relational-to-OLAP mapping can be accomplished without relying on data patterns, thereby allowing for more dynamic database models. Still another advantage is that the risks associated with translation errors are significantly reduced. The invention also allows for the representation of multidimensional output in the form of XML files, thereby greatly facilitating the transfer of data from one system or platform to another.
Other aspects and advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.