For the modern enterprise, maintaining consistency across data originating from a variety of sources is strategically important. This requirement may be met by implementing a data warehouse. To that end, SAP's Business Warehouse (BW) system consolidates data (e.g., from external and internal sources) into a single repository. Moreover, the BW provides preconfigured data and methods to aid a business enterprise in data management and archiving.
One aspect of the BW is the infocube (physically represented as a “star” schema or a “snowflake” schema). Infocubes are multidimensional data storage containers used for reporting and analyzing data.
FIG. 4 depicts an example of an infocube. The infocube is a database framework (or architecture) including a central database table (referred to as a “fact table”). The fact table may include so-called “key figures” representative of data of interest. The fact table may be surrounded by associated “dimension” tables. The dimension tables include references pointing to master data tables, which include so-called “characteristics” assigned to the key figures. A dimension table may be used as a simple grouping of characteristics that do not necessarily have hierarchical dependencies. For example, characteristics that logically belong together (district and area, for example, belong to a regional dimension) may be grouped together in a dimension. By adhering to this design criterion, dimensions are largely independent of each other, and dimension tables remain small with regard to data volume, which may be desirable for reasons of performance.
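By way of illustration only, the relationship among a fact table, a dimension table, and master data tables may be sketched in Python as follows. The sketch is a simplified in-memory model, not SAP's physical table layout; the table names, identifiers, and values (e.g., REGION_DIM, revenue, "North District") are hypothetical and are chosen only to mirror the district and area example above.

```python
from dataclasses import dataclass

# Master data tables: characteristic values keyed by an ID.
# All names and values here are hypothetical, for illustration only.
DISTRICT_MASTER = {1: "North District", 2: "South District"}
AREA_MASTER = {10: "Area A", 20: "Area B"}


@dataclass(frozen=True)
class RegionDimRow:
    """One row of a regional dimension table.

    It groups characteristics that logically belong together
    (district and area) by referencing the master data tables.
    """
    district_id: int
    area_id: int


# Dimension table: dimension ID -> references into master data.
REGION_DIM = {
    100: RegionDimRow(district_id=1, area_id=10),
    101: RegionDimRow(district_id=2, area_id=20),
}


@dataclass(frozen=True)
class FactRow:
    """One row of the central fact table: a dimension reference plus key figures."""
    region_dim_id: int
    revenue: float   # key figure
    quantity: int    # key figure


FACT_TABLE = [
    FactRow(region_dim_id=100, revenue=1200.0, quantity=3),
    FactRow(region_dim_id=101, revenue=800.0, quantity=2),
    FactRow(region_dim_id=100, revenue=300.0, quantity=1),
]


def revenue_by_district(facts, region_dim, district_master):
    """Sum the revenue key figure per district by following
    fact row -> dimension row -> master data."""
    totals = {}
    for row in facts:
        dim_row = region_dim[row.region_dim_id]
        district = district_master[dim_row.district_id]
        totals[district] = totals.get(district, 0.0) + row.revenue
    return totals


if __name__ == "__main__":
    print(revenue_by_district(FACT_TABLE, REGION_DIM, DISTRICT_MASTER))
```

In this sketch the fact table holds only key figures and a dimension reference, so the descriptive characteristics are stored once in the master data tables and the dimension table stays small.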
Frequently, customers want changes to a BW system (e.g., SAP's NetWeaver BI) to be tested on actual, production data. Testing may, in some cases, be performed by either in-house personnel (e.g., staff employed by the customer) or external personnel (e.g., a hardware or software vendor or a third-party consultant). In either case, using actual, production data in testing may result in sensitive information (e.g., personal medical information, financial information, and the like) being provided to people who should not have access to it.
To prevent the disclosure of sensitive information, one approach is to require everyone handling the actual, production data to sign a confidentiality agreement prohibiting disclosure of the sensitive information. However, even with confidentiality agreements, one cannot guarantee that the actual, production data is kept in confidence. For example, an inadvertent disclosure of information during testing would compromise the sensitive information. Alternatively, instead of using actual, production data, artificial data may be used during testing. But using artificial data may limit the effectiveness of testing. Accordingly, there continues to be a need for mechanisms that provide meaningful test data during testing.