A data warehouse is a database used for generating reports and data analysis. To facilitate reporting and data analysis functions, data is often transformed and organized in star schemas within a data warehouse. Populating the data within the data warehouse is done via ETL (Extract, Transform, Load) operations, which requires that the data warehouse maintain, in addition to the current state of the data warehouse, information about the last incremental data extractions obtained from the source tables. ETL operations propagate incremental changes made at the source tables into the star schemas of the data warehouse. ETL operations may transform the data prior to loading the data into the data warehouse. Examples of such types of transformation include data cleansing, data standardization, surrogate key generation, surrogate key replacement, unit of measure conversion, and currency conversion.
In Business Intelligence (“BI”) environments, custom data warehouse design, generation and population is often performed by specialist ETL developers and based on requirements manually gathered from potential warehouse users in a time-consuming fashion. The users describe the types of reports and dashboards needed for their use case(s), and the ETL developers take the information and perform their task, usually taking a long period of time.
Additionally, the process of creating the data warehouse model, deciding the frequency of ETL jobs, figuring out which warehouse data tables need to be kept in sync, maintenance, etc. is all performed manually. For one customer this is a laborious process, but for a Software as a Service (“SaaS”) provider who intends to service numerous customers for example via automated processes, this individual process is neither cost- nor time-effective.