1. Field of the Invention
The present disclosure relates to the field of databases, and specifically, to optimizing retrieval of data.
2. Description of the Related Art
Business Intelligence (BI) is a category of software that facilitates business enterprise decision-making and governance. BI provides reporting/analysis tools to analyze, forecast, and present information; it also provides database management systems to organize, store, retrieve, and manage data in databases, such as Online Analytical Processing (“OLAP”) databases.
OLAP data sources and tools are a subset of BI tools. OLAP tools are report generation tools and are a category of database software. OLAP provides an interface for the user to limit raw data according to user-predefined functions and to interactively examine results in various dimensions of data. OLAP systems provide a multidimensional conceptual view of data, including full support for hierarchies and multiple hierarchies. OLAP tools are typically implemented in a multi-user client/server mode to offer consistently rapid responses to queries, regardless of database size and complexity. OLAP allows the user to synthesize information using an OLAP server that is specifically designed to support and operate on multidimensional data sources. The design of the OLAP server and the structure of the data are optimized for rapid ad hoc information retrieval in any orientation.
In terms of structure and volume, typical OLAP systems are organized as follows, with the upper structural elements acting as containers for the lower elements:
    Cubes: tens per application
    Dimensions: tens per cube, hundreds per application
    Hierarchies: ones per dimension, hundreds per application
    Levels: ones per hierarchy, hundreds to thousands per application
    Members: hundreds to thousands per level, tens of thousands to millions per application
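The containment relationship listed above can be pictured with a small data-structure sketch. This is only an illustration and is not taken from any particular OLAP product: the class names, the `count_members` helper, and the sample "Sales" cube are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Level:
    name: str
    members: List[str] = field(default_factory=list)  # e.g. "Canada"


@dataclass
class Hierarchy:
    name: str
    levels: List[Level] = field(default_factory=list)


@dataclass
class Dimension:
    name: str
    hierarchies: List[Hierarchy] = field(default_factory=list)


@dataclass
class Cube:
    name: str
    dimensions: List[Dimension] = field(default_factory=list)


def count_members(cube: Cube) -> int:
    """Walk cube -> dimension -> hierarchy -> level and count the members."""
    return sum(
        len(level.members)
        for dimension in cube.dimensions
        for hierarchy in dimension.hierarchies
        for level in hierarchy.levels
    )


# Hypothetical sample data, far smaller than the volumes quoted above.
time_dim = Dimension(
    "Time", [Hierarchy("Calendar", [Level("Year", ["2004", "2005", "2006"])])]
)
geography_dim = Dimension(
    "Geography", [Hierarchy("By Country", [Level("Country", ["Canada", "France"])])]
)
sales = Cube("Sales", [time_dim, geography_dim])

print(count_members(sales))  # 5
```

Because members sit at the bottom of the containment chain, they dominate the total volume, which is why member storage is the main concern for any cache of this data.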
In a BI Reporting System, the response to each user gesture must be as fast as possible. A cache memory, or cache, is commonly used to speed up query processing. Specifically, a data cache may be used to store commonly used items of data (including metadata) so that the fetching of data from the data source is minimized. FIG. 1 shows a conventional BI system that, as is typical, includes a data cache. This figure is described in further detail below.
The data cache may store various types of data, including metadata, which is understood to be “data about data.” For example, any instance of the above-mentioned OLAP structural elements, such as a “Geography” dimension or a “Canada” member, may be stored in a data cache. In fact, given that an OLAP data source may be a relational database organized as a star schema, with dimension tables on the perimeter and every dimension member stored as a row in a dimension table, members are often considered to be data rather than metadata. Accordingly, it is not necessary to distinguish a different type of cache for different types of data.
Different users of the BI Reporting System may have different security rights such that access to particular data (including metadata) may be restricted. These security rights are defined within the security subsystem of each OLAP data source that the BI Reporting System interfaces with. An OLAP data source's security subsystem is typically organized as a set of security profiles. Users are assigned to specific security groups, which may be nested and which remain fixed even after users authenticate (log in) to the BI Reporting System. Users may also be assigned permitted security roles, and have the opportunity to select one or more of these roles when authenticating (logging in) to the BI Reporting System. Each security profile associates one named security group or role with a specific set of access rights. OLAP systems may restrict access to data for various security profiles, with different rules specified against different types of data.
When a user authenticates (logs in) to the BI Reporting System and queries data from a specific OLAP cube, the combination of effective security profiles for that authenticated user and OLAP data source defines this user's security context. For example, a user authoring a Time by Geography sales report (cross-tab) may not be entitled to see the results of all potentially available years and every geographic location (for example, countries); certain year and country members are therefore filtered out based on this user's security context (effective security profiles defining access to the Time and Geography dimensions).
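As a rough illustration of the Time by Geography example, the sketch below models a security context as the set of effective profile names for a user and uses it to filter members. Everything here is hypothetical: the profile names, the `PROFILES` table, and both helper functions. In particular, combining multiple profiles by taking the union of their access rights is an assumption made for the sketch; the text does not say how a given data source combines them.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Hypothetical security profiles: each named group or role is associated with
# the members it may see, per dimension.
PROFILES: Dict[str, Dict[str, Set[str]]] = {
    "group:Sales-NA": {
        "Time": {"2005", "2006"},
        "Geography": {"Canada", "USA"},
    },
    "role:Auditor": {
        "Time": {"2004", "2005", "2006"},
        "Geography": {"Canada"},
    },
}


def security_context(groups: Iterable[str], roles: Iterable[str]) -> FrozenSet[str]:
    """Effective profiles: the user's fixed groups plus the roles chosen at login."""
    return frozenset(groups) | frozenset(roles)


def visible_members(
    context: FrozenSet[str], dimension: str, members: Iterable[str]
) -> List[str]:
    """Keep only the members that some effective profile grants (assumed union)."""
    allowed: Set[str] = set()
    for profile in context:
        allowed |= PROFILES.get(profile, {}).get(dimension, set())
    return [m for m in members if m in allowed]


years = ["2004", "2005", "2006"]
countries = ["Canada", "USA", "France"]

context = security_context(groups=["group:Sales-NA"], roles=[])
print(visible_members(context, "Time", years))           # ['2005', '2006']
print(visible_members(context, "Geography", countries))  # ['Canada', 'USA']
```

A frozen set is used for the context so that it is hashable and two users with the same effective profiles compare equal.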
A. State of the Art
When caching data, the BI system needs to know what data is common among users and can therefore be safely shared (seen by different users), and what data is not common, i.e., secure (not to be seen by certain types of users). To address this requirement, most BI systems currently implement one of two typical approaches. The first approach is to have system administrators copy and redefine the security rules (access rights) of the underlying OLAP data source directly within the BI system, and have the BI system log into the data source as a single super user. This implies that filtering based on the security context of each user needs to be performed prior and/or subsequent to any access to the cache, so that only data each user is entitled to view is returned. The second approach is to have individual BI system users connect to the underlying OLAP data source using their own credentials, and to not share any cached data between users. In general, the first approach is preferred for performance (it uses less cache memory for the typically large volume of members), whereas the second approach is preferred for ease of maintenance.
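The trade-off between the two approaches can be made concrete with a toy sketch. It is not modeled on any real BI product; `ALL_MEMBERS`, `SOURCE_RULES`, the two fetch functions, and both cache classes are hypothetical stand-ins, and the data source is simulated with in-memory dictionaries.

```python
from typing import Dict, List, Set, Tuple

# Simulated OLAP data source: its members and its own security rules.
ALL_MEMBERS: Dict[str, List[str]] = {"Geography": ["Canada", "France", "Japan"]}
SOURCE_RULES: Dict[str, Set[str]] = {"alice": {"Canada", "France"}, "bob": {"Canada"}}


def fetch_as_super_user(dimension: str) -> List[str]:
    """Stand-in for a query issued under a single super-user login."""
    return list(ALL_MEMBERS[dimension])


def fetch_as_user(user: str, dimension: str) -> List[str]:
    """Stand-in for a query issued with the user's own credentials;
    the data source applies its own rules before returning anything."""
    allowed = SOURCE_RULES.get(user, set())
    return [m for m in ALL_MEMBERS[dimension] if m in allowed]


class SharedFilteredCache:
    """First approach: one shared cache, with rules copied into the BI system."""

    def __init__(self, copied_rules: Dict[str, Set[str]]) -> None:
        self.copied_rules = copied_rules  # second copy that must be kept in sync
        self.store: Dict[str, List[str]] = {}

    def get(self, user: str, dimension: str) -> List[str]:
        if dimension not in self.store:
            self.store[dimension] = fetch_as_super_user(dimension)
        allowed = self.copied_rules.get(user, set())
        # Filtering happens after the cache lookup, on every access.
        return [m for m in self.store[dimension] if m in allowed]


class PerUserCache:
    """Second approach: nothing is shared, so entries are keyed by user."""

    def __init__(self) -> None:
        self.store: Dict[Tuple[str, str], List[str]] = {}

    def get(self, user: str, dimension: str) -> List[str]:
        key = (user, dimension)
        if key not in self.store:
            self.store[key] = fetch_as_user(user, dimension)
        return self.store[key]


shared = SharedFilteredCache({u: set(r) for u, r in SOURCE_RULES.items()})
per_user = PerUserCache()
for name in ("alice", "bob"):
    shared.get(name, "Geography")
    per_user.get(name, "Geography")

print(len(shared.store), len(per_user.store))  # 1 2
```

Both caches return the same filtered members to each user. The shared cache holds one entry per dimension but depends on a duplicated rule set, while the per-user cache needs no duplicated rules but holds one entry per user and dimension.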
B. Deficiencies in State of the Art
With the first approach, where security rules are redefined in the BI system, every security profile needs to be maintained in two places: the BI system and the OLAP data source. In addition to being cumbersome, such security maintenance overhead is error-prone. With the second approach, on the other hand, the BI system creates multiple data caches (one cache per user); this makes for very inefficient use of memory and, given the typically large volume of members, can prove quite costly in terms of performance.