“Extract-transform-load” data integration jobs are known. Roughly speaking, Extract-Transform-Load (ETL) refers to a process in database usage, and especially in data integration, that involves: (i) extracting data from outside source database(s) (see definition, below, in Definitions sub-section of detailed description section); (ii) transforming the extracted data to fit operational requirements (for example, quality levels); and (iii) loading the transformed data into the target database (see definition, below, in Definitions sub-section of detailed description section). During the transform phase, the data being converted to the appropriate form and format for the target database(s) is validated against validation rules. A validation failure may result in rejection of the data, such that an incomplete data set proceeds to the load phase. These validation failures are called exceptions. One example of an exception encountered during validation occurs when a code translation encounters an unknown code in the extracted data. At the time of validation, the range of data values or the data quality in the source and/or target database(s) may fall outside what the designers anticipated. Data profiling of a source database during data analysis can identify the data conditions that the transform rules will need to manage. Data profiling can also lead to revisions of the validation rules implemented in the ETL process.
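By way of illustration only, the validation behavior described above can be sketched in Python. The sketch is not taken from any particular ETL product. The translation table `COUNTRY_CODES`, the record layout, and the function names `extract`, `transform`, and `load` are all hypothetical and chosen only for this example. The sketch shows a code translation rejecting a record with an unknown code, so that an incomplete data set proceeds to the load phase.

```python
from dataclasses import dataclass, field

# Hypothetical code-translation table (source code -> target value).
COUNTRY_CODES = {"US": "United States", "DE": "Germany"}


@dataclass
class TransformResult:
    valid: list = field(default_factory=list)       # rows that passed validation
    exceptions: list = field(default_factory=list)  # (row, reason) for rejected rows


def extract():
    # Stand-in for reading rows from a source database.
    return [
        {"id": 1, "country": "US"},
        {"id": 2, "country": "XX"},  # code unknown to the translation table
        {"id": 3, "country": "DE"},
    ]


def transform(rows):
    # Apply the code translation; reject rows whose code is unknown.
    result = TransformResult()
    for row in rows:
        code = row["country"]
        if code not in COUNTRY_CODES:
            result.exceptions.append((row, f"unknown country code: {code!r}"))
            continue
        result.valid.append({"id": row["id"], "country": COUNTRY_CODES[code]})
    return result


def load(rows, target):
    # Stand-in for writing rows to a target database.
    target.extend(rows)


target_table = []
result = transform(extract())
load(result.valid, target_table)  # only the validated subset is loaded
```

In this sketch the rejected record is retained alongside the reason for its rejection, which is one possible way to make the validation exceptions available for later review.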
Typically, an ETL process (that is, a unit of work) is designed to accomplish the following: (i) extract and cleanse the data from the source database; (ii) transform the data into a desired format that can be consumed in the subsequent load phase; and (iii) load the data to a target database. Typically, transform phase (ii) applies the core business logic to convert data into information. Subsequent to load phase (iii), the data of the target database is used by a reporting engine to derive insights from the transformed data. A job in an ETL process undergoes two complete sets of life cycles: (i) porting/migration/upgrades of jobs from an older version to a newer version of the ETL product; and (ii) movement of jobs from development to quality assurance to production, which is typically movement within the same version.
Exception handling is the process of responding to exceptions that occur during computer processing. Exceptions are anomalous or exceptional events requiring special processing, sometimes changing the flow of program execution. Exception handling is typically provided by specialized programming language constructs or computer hardware mechanisms. In general, an exception is resolved by: (i) saving the current state of execution in a predefined location; and (ii) switching execution to a specific subroutine known as an “exception handler.” If an exception is “continuable,” the handler may later resume execution at the original location using the saved information. Alternative approaches to exception handling in software include: (i) error checking (which maintains normal program flow and later performs explicit checks for contingencies reported using special return values or floating point status flags); and (ii) input validation (which preemptively filters out exceptional cases).
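By way of illustration only, the three approaches described above can be contrasted in a short Python sketch. The translation table `TABLE` and all function names are hypothetical. Python handlers do not resume execution at the point where the exception was raised, so the “continuable” case is not shown.

```python
# Hypothetical translation table used by all three variants.
TABLE = {"A": "active", "I": "inactive"}


def run_with_handler(codes):
    # Exception handling: an unknown code raises KeyError, and control
    # switches to the except block, which acts as the exception handler.
    out = []
    for code in codes:
        try:
            out.append(TABLE[code])
        except KeyError:
            out.append("UNKNOWN")
    return out


def translate_checked(code):
    # Error checking: report the contingency through a special return value.
    if code in TABLE:
        return TABLE[code], True
    return None, False


def run_with_checks(codes):
    # Normal program flow is kept; the status is checked explicitly afterwards.
    out = []
    for code in codes:
        value, ok = translate_checked(code)
        out.append(value if ok else "UNKNOWN")
    return out


def run_with_validation(codes):
    # Input validation: filter out exceptional cases before translating.
    accepted = [code for code in codes if code in TABLE]
    return [TABLE[code] for code in accepted]
```

Under these assumptions, the first two variants keep a placeholder for the unknown code, whereas the input-validation variant drops it before translation, which mirrors the rejection of data during the transform phase.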