1. Field
This invention relates to methods and systems for analyzing data, and more particularly to methods and systems for aggregating, projecting, and releasing data.
2. Description of Related Art
Currently, there exists a large variety of data sources, such as census data or movement data received from point-of-sale terminals, sample data received from manual surveys, panel data obtained from the inputs of consumers who are members of panels, fact data relating to products, sales, and many other facts associated with the sales and marketing efforts of an enterprise, and dimension data relating to dimensions along which an enterprise wishes to understand data, such as in order to analyze consumer behaviors, to predict likely outcomes of decisions relating to an enterprise's activities, and to project from sample sets of data to a larger universe. Conventional methods of synthesizing, aggregating, and exploring such a universe of data comprise techniques such as OLAP, which fix aggregation points along the dimensions of the universe in order to reduce the size and complexity of unified information sets such as OLAP stars. Exploration of the unified information sets can involve run-time queries and query-time projections, both of which are constrained in current methods by a priori decisions that must be made to project and aggregate the universe of data. In practice, going back and changing the a priori decisions can lift these constraints, but this requires an arduous and computationally complex restructuring and reprocessing of data.
According to current business practices, unified information sets and results drawn from such information sets can be released to third parties according to so-called “releasability” rules. Theses rules might apply to any and all of the data from which the unified information sets are drawn, the dimensions (or points or ranges along the dimensions), the third party (or members or sub-organizations of the third party), and so on. Given this, there can be a complex interaction between the data, the dimensions, the third party, the releasability rules, the levels along the dimensions at which aggregations are performed, the information that is drawn from the unified information sets, and so on. In practice, configuring a system to apply the releasability rules is an error-prone process that requires extensive manual set up and results in a brittle mechanism that cannot adapt to on-the-fly changes in data, dimensions, third parties, rules, aggregations, projections, user queries, and so on.
Various projection methodologies are known in the art. Still other projection methodologies are subjects of the present invention. In any case, different projection methodologies provide outputs that have different statistical qualities. Analysts are interested in specifying the statistical qualities of the outputs at query-time. In practice, however, the universe of data and the projection methodologies that are applied to it are what drive the statistical qualities. Existing methods allow an analyst to choose a projection methodology and thereby affect the statistical qualities of the output, but this does not satisfy the analyst's desire to directly dictate the statistical qualities.
Information systems are a significant bottle neck for market analysis activities. The architecture of information systems is often not designed to provide on-demand flexible access, integration at a very granular level, or many other critical capabilities necessary to support growth. Thus, information systems are counter-productive to growth. Hundreds of market and consumer databases make it very difficult to manage or integrate data. For example, there may be a separate database for each data source, hierarchy, and other data characteristics relevant to market analysis. Different market views and product hierarchies proliferate among manufacturers and retailers. Restatements of data hierarchies waste precious time and are very expensive. Navigation from among views of data, such as from global views to regional to neighborhood to store views is virtually impossible, because there are different hierarchies used to store data from global to region to neighborhood to store-level data. Analyses and insights often take weeks or months, or they are never produced. Insights are often sub-optimal because of silo-driven, narrowly defined, ad hoc analysis projects. Reflecting the ad hoc nature of these analytic projects are the analytic tools and infrastructure developed to support them. Currently, market analysis, business intelligence, and the like often use rigid data cubes that may include hundreds of databases that are impossible to integrate. These systems may include hundreds of views, hierarchies, clusters, and so forth, each of which is associated with its own rigid data cube. This may make it almost impossible to navigate from global uses that are used, for example, to develop overall company strategy, down to specific program implementation or customer-driven uses. These ad hoc analytic tools and infrastructure are fragmented and disconnected.
In sum, there are many problems associated with the data used for market analysis, and there is a need for a flexible, extendable analytic platform, the architecture for which is designed to support a broad array of evolving market analysis needs. Furthermore, there is a need for better business intelligence in order to accelerate revenue growth, make business intelligence more customer-driven, to gain insights about markets in a more timely fashion, and a need for data projection and release methods and systems that provide improved dimensional flexibility, reduced query-time computational complexity, automatic selection and blending of projection methodologies, and flexibly applied releasability rules.