Embodiments of the present invention relate to analysis of data stored in databases, and in particular to generation of ranked insight for statistically valid combinations of measures and dimensions.
Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
A business problem may call for identifying relevant data within a large volume of data stored in a data warehouse, in order to perform analysis and create a solution strategy. Automated analysis tools may afford a user statistical insights regarding such large data volumes.
Conventional analysis tools may provide analytical insight, but typically require frequent manual intervention by the user. Such intervention slows the process and requires the user to exercise statistical knowledge in order to use the tool and interpret the insights provided.
To achieve favorable performance on these large datasets, many tools employ supporting intermediate physical tables. Such tables must be maintained in order to preserve the durability, confidentiality, and integrity of the data.
Sequential generation of insights for each combination of data in each user dataset increases the response time as the number of datasets and users increases. This situation may become more difficult with large datasets having many measures and dimensions.
Moreover, the overall ranking of insights may become invalid unless terms in the request inputs are considered. These terms are available only at runtime; hence, pre-generated insights provided by conventional analysis tools may not be relevant to the request.
Accordingly, the present disclosure addresses these and other issues with systems and methods for generating ranked insight for statistically valid combinations of measures and dimensions.