ETL (Extraction-Transformation-Loading) refers to data extraction, transformation and loading. ETL extracts data (such as relational data and flat data files) from distributed and heterogeneous data sources into a temporary intermediate layer for cleaning, transformation and integration, and finally loads the data into a data warehouse or data mart used for online analysis. At present, all existing ETL dispatching methods are memoryless and stateless, such as dispatching at a fixed time point (fixed cycle). For example, a program is executed at 11:00 pm every night, and there is no correlation at the dispatching layer between any two dispatches. The judgment of task state and the selection of the time cycle are left entirely to the program logic of the called program, which not only increases the burden on the called program but also prevents it from concentrating on its own business logic.
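The fixed-time, stateless dispatching described above can be illustrated with a minimal Python sketch. The names `stateless_dispatch`, `nightly_extract` and `job_state`, and the choice of a 24-hour window, are hypothetical and used only for illustration; they are not taken from any particular ETL product.

```python
from datetime import datetime, timedelta

def nightly_extract(now, job_state):
    """Called program (hypothetical): it must judge task state and pick the window itself."""
    # Task-state judgment lives inside the called program, not in the dispatcher.
    if job_state.get("running"):
        return None
    # Time-cycle selection also lives here: assume the last 24 hours are extracted.
    window_start = now - timedelta(days=1)
    return (window_start, now)

def stateless_dispatch(now, job, job_state):
    """Dispatcher (hypothetical): fires at 23:00 and keeps no memory between dispatches."""
    if now.hour == 23 and now.minute == 0:
        return job(now, job_state)
    return None
```

In this sketch the dispatcher only checks the clock; everything else falls on the called program, which is the burden the paragraph above describes.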
The existing ETL dispatching methods are characterized by the following limitations:

Cyclic closed-loop extraction: The present ETL dispatching methods are all memoryless and stateless. They can only handle ETL extraction at a fixed time point (fixed cycle) and do not support timestamp-based extraction (cyclic closed-loop extraction) in the ETL system.

Data re-extraction: The present ETL dispatching methods cannot effectively solve the problem of automatic data re-extraction.

Acceleration of tasks whose dispatching time has lagged: If an ETL task is suspended or executed incorrectly for some reason and falls behind the preset plan, then once the task is restored to normal operation it cannot be automatically accelerated according to the characteristics of its time cycle.

Self-assessment: The present ETL dispatching programs cannot carry out self-assessment according to the characteristics of ETL.
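The re-extraction and lagging limitations can be made concrete with a small Python sketch. It assumes a hypothetical fixed cycle of one day and a hypothetical schedule in which one nightly run was suspended; the function names are illustrative only. Because each window is derived from the run time alone, the interval belonging to the missed run is never extracted.

```python
from datetime import datetime, timedelta

CYCLE = timedelta(days=1)  # assumed fixed cycle

def fixed_cycle_window(run_time):
    """Window computed from the run time alone, with no memory of earlier runs."""
    return (run_time - CYCLE, run_time)

def find_gaps(windows):
    """Return the intervals between consecutive windows that no run covered."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(windows, windows[1:]):
        if next_start > prev_end:
            gaps.append((prev_end, next_start))
    return gaps

# Hypothetical schedule: the 23:00 run on 2 January was suspended.
runs = [datetime(2024, 1, 1, 23), datetime(2024, 1, 3, 23)]
windows = [fixed_cycle_window(t) for t in runs]
gaps = find_gaps(windows)
```

Here `gaps` holds the single interval from 23:00 on 1 January to 23:00 on 2 January. A stateless dispatcher has no record that this interval is missing, so it can neither re-extract it nor speed up to catch up.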