The analysis of structured relational data is well addressed, with a variety of vendors offering an assortment of online analytical processing (OLAP) products. However, the new generation of data formats differs significantly from relational data, making traditional OLAP solutions ill suited to analyzing them. FIG. 1 illustrates the differences between relational data and extensible data. Typically, these new extensible data types support the interchange of information both inside and outside of a corporation. As such, they are complex, dynamic, subject to change, and highly heterogeneous. By contrast, relational data is designed to be used in a transactional environment and so is typically stable, schema-driven, unlikely to change, and highly homogeneous.
Growth in business data volume exceeds our capability to analyze this data. An explosion in data types and data formats such as HTML, XML, and others has made the situation even worse by effectively preventing the traditional business intelligence application stack from operating on these new types of data in new formats. While traditional relational storage represents less than 20% of corporate data, the remaining 80% is becoming increasingly important, and businesses are trying to find ways to expand data analysis beyond relational data. There is a need for a new-generation analytical platform that can facilitate analysis of new data types and formats in an easy-to-use product featuring better and faster analysis, higher performance, and lower cost.
Corporations have discovered that when they try to apply traditional analytical solutions to these new data types, a laborious and expensive step is required to “shred” them into a typical relational schema, losing the structure, flexibility, and information that made them valuable in the first place. Any change is difficult and expensive. As these new extensible data types accumulate in corporations, there will be a demand to analyze them in the style and manner for which they were designed.
The industry has come full circle. In a way, the situation is reminiscent of the early 1990s, when large volumes of relational data accumulated with the wide adoption of RDBMSs, and vendors such as Arbor Software (Hyperion Solutions) and Business Objects introduced the concept of OLAP as an easy-to-understand way to analyze that data. An analogous demand for an analytic platform will arise out of the appearance and ubiquity of these new extensible data types.
The characteristics of relational data shown in FIG. 1 are such that a traditional business intelligence (BI) application stack such as that shown in FIG. 2 works very well with relational data. However, the characteristics of the new extensible data, especially its complexity, flexibility, and dynamic nature, make it impossible to use that same BI stack and require a new analytical platform. Thus, it is desirable to provide a new platform that handles the extensible data, and it is to this end that the present invention is directed.