Organizations that store large amounts of data use database systems to manage that data. One type of database system is a data warehouse. A data warehouse is a collection of data that is structured to support analytical and reporting tasks. Such analytical tasks can provide decision makers with significant information. The structure of data within a data warehouse contrasts with the structure of data within operational databases, which are structured to support the transactional operations of day-to-day business, such as sales, inventory control, and accounting.
An Extract, Transform, and Load (ETL) process is performed to convert data that is formatted for operational tasks into data that is formatted for the analytical tasks associated with a data warehouse. This process involves extracting data from multiple sources. The data from these multiple sources may be formatted differently or may include irrelevant details. Additionally, the data may have errors or inconsistencies that should be corrected. Thus, the data must be transformed for data warehouse operations. Finally, the corrected and transformed data is loaded into the data warehouse.
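By way of illustration only, the three stages of an ETL process may be sketched as follows. The source names, field names, and sample records are hypothetical, and the in-memory list merely stands in for a data warehouse. Discarding a record whose amount cannot be parsed is a simplification made for this sketch; an actual process might instead correct such a record.

```python
def extract():
    """Extract: read records from two hypothetical operational sources."""
    sales_system = [
        {"cust": " Acme Corp ", "total": "250.00", "cashier": "J. Doe"},
        {"cust": "Globex", "total": "n/a", "cashier": "A. Roe"},
    ]
    web_orders = [
        {"customer_name": "acme corp", "amount_cents": 9950},
    ]
    return sales_system, web_orders


def transform(sales_system, web_orders):
    """Transform: unify the formats, drop irrelevant fields, handle errors."""
    rows = []
    for record in sales_system:
        try:
            amount = float(record["total"])
        except ValueError:
            continue  # simplification: skip records with an unparseable amount
        # The "cashier" field is treated as irrelevant and is not carried over.
        rows.append({"customer": record["cust"].strip().title(), "amount": amount})
    for record in web_orders:
        rows.append({
            "customer": record["customer_name"].strip().title(),
            "amount": record["amount_cents"] / 100,  # cents to currency units
        })
    return rows


def load(rows, warehouse):
    """Load: append the transformed records to the warehouse stand-in."""
    warehouse.extend(rows)


warehouse = []
load(transform(*extract()), warehouse)
```

In this sketch, both sources end up with the same record layout and the same spelling of the customer name, which is the condition the analytical tasks rely on.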
One task of the ETL process is surrogate key generation. Objects within the various sources of data, such as customers, are identified by production keys. For example, a particular customer may be identified by a production key such as a customer identification number. Furthermore, different sources of data may represent a single object using different production keys. For example, one source may represent a customer with a customer name while another source may represent that customer with a customer identification number. Thus, part of the ETL process is to replace each production key with a generated surrogate key so that each object is identified by the same key regardless of the source of data in which it appears.
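As a minimal sketch of surrogate key generation, the following example assigns one integer surrogate key per object and replaces the production keys of two sources with it. The class `SurrogateKeyMap`, its methods, the source names, and the sample records are all hypothetical. The sketch also assumes that it is already known which production keys refer to the same customer; how that correspondence is established is outside its scope.

```python
from itertools import count


class SurrogateKeyMap:
    """Maps (source, production_key) pairs to integer surrogate keys."""

    def __init__(self, start=1):
        self._counter = count(start)
        self._keys = {}

    def key_for(self, source, production_key):
        """Return the surrogate key for a production key, generating one if new."""
        ident = (source, production_key)
        if ident not in self._keys:
            self._keys[ident] = next(self._counter)
        return self._keys[ident]

    def alias(self, source, production_key, same_as):
        """Record that this production key names the same object as `same_as`."""
        self._keys[(source, production_key)] = self.key_for(*same_as)


keys = SurrogateKeyMap()
# Assumed to be known in advance: both production keys denote one customer.
keys.alias("billing", "Acme Corp", same_as=("crm", 1042))

crm_rows = [{"customer_id": 1042, "amount": 250.0}]
billing_rows = [{"customer_name": "Acme Corp", "amount": 99.5}]

# Replace each production key with its surrogate key.
fact_rows = [
    {"customer_sk": keys.key_for("crm", r["customer_id"]), "amount": r["amount"]}
    for r in crm_rows
] + [
    {"customer_sk": keys.key_for("billing", r["customer_name"]), "amount": r["amount"]}
    for r in billing_rows
]
```

After the replacement, the records from both sources carry the same `customer_sk` value for the same customer, while a previously unseen production key would receive the next unused surrogate key.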
Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements.