The present invention relates generally to providing summarized data, and in particular to selectively exposing data to a data consumer using appropriate and/or different methods.
A common business need is to summarize data that exists in a system. This can be accomplished using a summarization program. Once a summarization program has completed a summarization run, the program can provide the summarized data to a data consumer (e.g., a tool for manipulating or reporting the data). Typically, a tool that consumes the data produced by a summarization program may consume data from tables such as fact tables (containing cost data, revenue data, etc.), dimension tables (containing task hierarchies, parent/child tasks, etc.), resource hierarchies (containing manager, employee, and coworker information), and so on. A data consumer receives or pulls this information and may perform roll-ups on the information.
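By way of illustration only, a roll-up over a task hierarchy might be sketched as follows. The table contents, the names `TASK_PARENT`, `FACT_ROWS`, and `roll_up`, and the choice of cost as the measure are hypothetical and are not taken from any particular summarization program.

```python
from collections import defaultdict

# Hypothetical dimension table: child task -> parent task (None marks the root).
TASK_PARENT = {
    "project": None,
    "design": "project",
    "build": "project",
    "wiring": "build",
}

# Hypothetical fact table rows: (task, cost).
FACT_ROWS = [
    ("design", 100.0),
    ("build", 250.0),
    ("wiring", 40.0),
    ("wiring", 10.0),
]


def roll_up(fact_rows, parent_of):
    """Add each fact row's cost to its task and to every ancestor task."""
    totals = defaultdict(float)
    for task, cost in fact_rows:
        node = task
        while node is not None:
            totals[node] += cost
            node = parent_of.get(node)
    return dict(totals)


totals = roll_up(FACT_ROWS, TASK_PARENT)
```

In this sketch the cost recorded against "wiring" also contributes to "build" and to "project", which is the kind of aggregation a data consumer may perform after receiving fact and dimension data.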
Presently, several problems arise when providing data to a data consumer. For example, a data consumer may need to know the location of the data to be consumed ahead of time. In addition, a data consumer may not be able to pull data from multiple sources when those sources represent the same functional data set but are differentiated by a technical concept such as the data volume they can contain. Also, a data consumer may have different methods for consuming data, each with its own limitations. For instance, one consumption method may perform well with a small volume of data but poorly with a large volume of data. Conversely, a second consumption method may perform well with a large volume of data but poorly with a small volume of data. Additional examples of previous solutions are discussed below.
In one previous solution, the data consumer does not utilize advance information from the summarization program, so it must consume data from a data source in whatever manner it chooses and attempt to make its own optimizations. For example, the data consumer may impose requirements on the data source such as logging, events, flagging, etc. However, when the data consumer is designed to perform well on particular data volumes, rather than on particular data volumes extracted from a large data source, a more specialized method as described herein can be introduced that can perform better than existing methods.
In another previous solution, a metadata-based data consumer relies on a single table that is responsible for both bulk and incremental data. When refreshing an incremental amount of data, the data consumer is forced either to refresh all of the data or to attempt to determine the incremental context on its own.
Generally, initial implementations of a data consumer begin with a bulk method and use incremental methods thereafter. Often, if a large volume of data were subsequently to be consumed, the data consumer would have to be reset and an initial bulk summarization would have to be performed again.
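The bulk-then-incremental pattern, including the reset on a large subsequent volume, can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the class `Consumer`, its method names, and the `reset_threshold` row count are all hypothetical, and the sketch does not describe any specific prior implementation.

```python
class Consumer:
    """Toy data consumer that keeps summarized totals keyed by task."""

    def __init__(self, reset_threshold):
        # Hypothetical cutoff: a delta of this many rows or more forces a reset.
        self.reset_threshold = reset_threshold
        self.totals = {}
        self.log = []  # records which method handled each refresh

    def bulk_load(self, source_rows):
        """Discard current totals and rebuild them from the full source."""
        self.totals = {}
        for key, amount in source_rows:
            self.totals[key] = self.totals.get(key, 0.0) + amount
        self.log.append("bulk")

    def apply_increment(self, delta_rows):
        """Fold only the changed rows into the existing totals."""
        for key, amount in delta_rows:
            self.totals[key] = self.totals.get(key, 0.0) + amount
        self.log.append("incremental")

    def refresh(self, source_rows, delta_rows):
        """Use bulk on the first run or for a large delta, else incremental."""
        if not self.log or len(delta_rows) >= self.reset_threshold:
            self.bulk_load(source_rows)
        else:
            self.apply_increment(delta_rows)


source = [("a", 1.0), ("b", 2.0)]
consumer = Consumer(reset_threshold=3)
consumer.refresh(source, [])        # initial run: bulk

small = [("a", 0.5)]
source += small
consumer.refresh(source, small)     # small delta: incremental

large = [("a", 1.0), ("b", 1.0), ("c", 1.0)]
source += large
consumer.refresh(source, large)     # large delta: reset and bulk again
```

Here the third refresh discards the accumulated totals and rebuilds them from the full source, mirroring the reset described above.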