Within large corporate data warehouses, it is common practice to assign unique identifiers, or business keys, to logical entities to facilitate the sharing and integration of data across the entire corporation. Different data source technologies provide their own mechanisms that codify the concept of business keys within their products.
A business key is unique within a given context. For example, subsidiaries have identifiers unique within a corporation. In addition, contexts can be hierarchical in nature: Offices are unique within the context of cities, which in turn are unique within their state and country.
Different data source technologies provide their own mechanisms that assign business keys to data entities (referred to simply as entities in this document). Entities are uniquely identified in different ways, depending on the technology used to store or report on the data:
- Relational databases use the unique combination of the values of one or more columns in one or more relational tables to identify entities. This concept also applies to dimensionally modeled star or snowflake schemas, such as those typically used for data warehousing applications.
- Online analytic processing (OLAP) technologies identify entities (known as members in OLAP terminology) by member unique names (MUNs). Each entity in an OLAP database is assigned a MUN that is unique within a specific scope, typically that of the entire database, referred to as a cube in OLAP technology. The format of MUNs changes from one OLAP technology to another.
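As an illustration only, the two identification schemes described above might be modeled as follows. The class names, the table and column names, and the bracketed MUN format are all hypothetical; actual MUN formats differ from one OLAP technology to another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationalKey:
    """Identifies an entity by the values of one or more key columns."""
    table: str
    key_columns: tuple  # pairs of (column name, value)


@dataclass(frozen=True)
class MemberUniqueName:
    """Identifies an OLAP member by a MUN that is unique within one cube."""
    cube: str
    mun: str


# Hypothetical example: the same office as seen by each technology.
office_relational = RelationalKey(
    table="office_dim",
    key_columns=(
        ("country_code", "US"),
        ("state_code", "CA"),
        ("city", "San Jose"),
        ("office_no", 12),
    ),
)
office_olap = MemberUniqueName(
    cube="Sales",
    mun="[Offices].[US].[CA].[San Jose].[12]",  # assumed format
)

# Both values denote the same business entity, yet nothing in the values
# themselves lets a program establish that by simple comparison.
same_by_comparison = office_relational == office_olap
```

In this sketch `same_by_comparison` is false, which is the root of the problem discussed in the remainder of this section: the correspondence between the two identifiers has to be established by some other means.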
Business intelligence (BI) applications that author reports or analyses (list reports, cross tabs, charts) must be able to refer to entities as part of their report/analysis specifications, notably in the case of calculations and filter expressions.
To provide the expected inter-report behavior (e.g., drill-through), we must be able to convey entity values from one report/analysis (the source report) to another (the target report) while retaining the context (the business keys) of the values. In a business environment that uses a single, static data source storage technology, this is a simple task, because the storage technology applies a single convention to assign keys.
Similarly, in master/detail reports, in which each master record is associated with zero or more detail records, the master and detail records are typically returned by separate queries, with the context for the detail query provided by one or more values from the master query. Once again, if all data is stored in a single, static data source, the identity of each entity is easy to maintain.
Finally, the context provided by business keys is critical to reports that may be run more than once. For example, consider the common situation where a BI application involves authoring a report, saving it, and executing it at a later date. In a business environment, it is possible for the data source upon which saved reports were authored to change over time, but the saved reports must remain unaffected by changes to the underlying data source.
Challenges to the Current Paradigm
In a single-data-source environment, passing entities among reports, or even across the same report executed multiple times, is usually a simple proposition, because each entity typically has one and only one identifier or MUN. However, most business environments use multiple, dynamic data source technologies to contain their corporate data. When data is stored in multiple data source technologies, and when that data may change over time, passing entities among reports presents several major challenges:
1. A single entity may have multiple identifiers or MUNs, most likely one per data source technology.
2. Data within the databases typically changes over time as entities are added, deleted, and modified. Similarly, within a dimensional data source (relational or OLAP), an entity's position within a hierarchical ordering of entities may change over time. These changes can (and often do) cause changes to the entities' business keys and identifiers. In the presence of changing data, a saved report may contain references to entities whose identifiers have since changed.
3. Disparate technologies (e.g., relational vs. OLAP) typically do not share data with one another very easily, if at all.
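The second challenge can be made concrete with a small sketch. It assumes, purely for illustration, a path-style MUN built from an entity's position in a hierarchy; the `build_mun` helper and the member names are hypothetical and do not correspond to any particular OLAP product.

```python
def build_mun(dimension, path):
    """Build a path-style MUN (assumed format) from a hierarchy path."""
    return "[" + dimension + "]." + ".".join("[" + part + "]" for part in path)


# Reference captured when the report was authored and saved.
saved_reference = build_mun("Offices", ["US", "CA", "San Jose", "Office 12"])

# Later, the office is moved under a different city in the hierarchy,
# so the data source now exposes the same entity under a new MUN.
current_members = {
    build_mun("Offices", ["US", "CA", "Santa Clara", "Office 12"]),
}

# The saved report still holds the old MUN, which no longer resolves.
reference_still_resolves = saved_reference in current_members
```

Here `reference_still_resolves` ends up false even though the office itself still exists: only its position, and therefore its identifier, has changed.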
Executing a report or analysis in the context of another report or analysis (e.g., drill-through or master/detail) requires that an application convert an entity's business key between its representations in the different data source technologies. This is the only way to ensure that each report references the correct entity, and it is an unmanageable burden for BI applications.
In the presence of changing data, as in an active business, a saved report may contain references to entities that have changed over time and as a consequence, their business key may also have changed. Requiring BI applications to resolve these entity references is again an unmanageable burden.
Finally, requiring applications authored against one data source technology to accommodate changes to another referenced data source technology is exceptionally difficult, and is also not a reasonable expectation of most businesses.
Upgrade Challenges
In an environment of multiple data source technologies, it is common for an entire data source technology to be replaced by another technology. When this occurs, all BI applications authored against the original data source technology are expected to continue to return the correct data. In practice, this means that all existing reports and analyses must be modified to accommodate the new data source, which is not a simple task and is prohibitively expensive for most businesses.
An alternative to this approach is to employ a piece of middleware that maintains a map of an entity's single, corporate identifier and all of its representations in all of the supported data source technologies. Though this does address the problem, it introduces its own set of problems:
- The map must be constantly maintained and updated.
- The map will grow over time as it may never be possible to remove old/deleted entity references.
- Resolving an entity reference, or returning an entity reference in a result set, which occurs with a high frequency, requires a map lookup and imposes sometimes severe performance penalties:
    - Even if each lookup in the map incurs only a small performance penalty, this overhead quickly becomes unacceptable with the large volumes associated with business applications.
    - If the map is stored in memory, it often consumes a very large portion of available memory.
    - If the map is stored even partially on disk, the impact on performance is increased.
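A minimal sketch of such a mapping layer, under stated assumptions, is shown below. The `EntityKeyMap` class, its method names, the corporate identifier "OFF-12", and the key formats are hypothetical and are not taken from any particular product; the sketch is meant only to show why every translation costs a lookup and why the map tends to grow.

```python
class EntityKeyMap:
    """Maps one corporate identifier to its key in each data source technology."""

    def __init__(self):
        self._by_corporate_id = {}  # corporate id -> {technology: current native key}
        self._by_native_key = {}    # (technology, native key) -> corporate id

    def register(self, corporate_id, technology, native_key):
        # Record the current representation for this technology. Earlier
        # native keys stay in the reverse index so that references held by
        # saved reports can still be resolved, which is why the map grows.
        self._by_corporate_id.setdefault(corporate_id, {})[technology] = native_key
        self._by_native_key[(technology, native_key)] = corporate_id

    def translate(self, source_technology, native_key, target_technology):
        # Two dictionary lookups per entity reference, repeated for every
        # reference resolved or returned in a result set.
        corporate_id = self._by_native_key[(source_technology, native_key)]
        return self._by_corporate_id[corporate_id][target_technology]

    def entry_count(self):
        return len(self._by_native_key)


key_map = EntityKeyMap()
key_map.register("OFF-12", "relational", ("US", "CA", "San Jose", 12))
key_map.register("OFF-12", "olap", "[Offices].[US].[CA].[San Jose].[12]")

# The OLAP hierarchy is reorganized and the member receives a new MUN.
key_map.register("OFF-12", "olap", "[Offices].[US].[CA].[Santa Clara].[12]")

# A saved report that still carries the old MUN can be resolved, but only
# because the stale entry was never removed from the map.
relational_key = key_map.translate(
    "olap", "[Offices].[US].[CA].[San Jose].[12]", "relational"
)
```

After these calls the map holds three native-key entries for a single entity, and each further change to the entity's identifier in any technology adds another.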