A relational database typically includes one or more tables. Each table has one or more records, and each record of a table has one or more fields. The records of a table are referred to as rows, and the fields are referred to as columns. Each column has associated metadata that describes the type, size, or other properties of the data in the field for each record. A schema includes the metadata for each column of each table, as well as other specifications of each table, such as a sort field, keys, or the like.
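As an illustration only, the relationship among tables, columns, column metadata, and a schema may be sketched as a small data structure. The class names `Column` and `TableSchema`, their attribute names, and the sample `customers` table are hypothetical and are not drawn from any particular database product.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Column:
    """Metadata for one column (hypothetical layout, for illustration)."""
    name: str
    data_type: str              # e.g. "integer", "varchar"
    size: Optional[int] = None  # size in bytes or characters, if applicable


@dataclass
class TableSchema:
    """Column metadata plus table-level specifications such as keys."""
    name: str
    columns: List[Column]
    primary_key: Tuple[str, ...] = ()
    sort_field: Optional[str] = None

    def column_names(self) -> List[str]:
        return [c.name for c in self.columns]


# A sample table schema; the table and its columns are made up.
customers = TableSchema(
    name="customers",
    columns=[
        Column("id", "integer", 4),
        Column("first_name", "varchar", 50),
        Column("last_name", "varchar", 50),
    ],
    primary_key=("id",),
    sort_field="last_name",
)
```

In this sketch, a schema for a whole database would simply be a collection of such `TableSchema` objects, one per table.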
An extract, transform, and load (ETL) system is a computer-based system that extracts data from a specified data source, transforms the data to convert it into a desired state, and loads the transformed data to a specified destination. ETL systems may be used in a variety of environments. For example, in a heterogeneous system, some subsystems may store data in a first format, schema, or arrangement, while other subsystems use a different format, schema, or arrangement. An ETL system may be used to integrate such subsystems. Transformations may include operations such as reformatting, sorting, filtering, combining data columns, or other types of modifications. Each of the input data and the output data of an ETL system has a schema. The output schema may be the same as, or differ from, the input schema.
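The three stages described above may be sketched, under simplifying assumptions, as three small functions. The function names, the in-memory lists standing in for the data source and destination, and the sample rows are all hypothetical; a practical ETL system would read from and write to external stores.

```python
def extract(source):
    """Read all rows from the data source (here, a list of dicts)."""
    return list(source)


def transform(rows):
    """Filter, combine columns, and sort.

    Input schema:  id, first_name, last_name, active
    Output schema: id, full_name
    """
    # Filtering: keep only active rows.
    kept = [r for r in rows if r["active"]]
    # Combining data columns: merge the two name columns into one.
    combined = [
        {"id": r["id"], "full_name": f'{r["first_name"]} {r["last_name"]}'}
        for r in kept
    ]
    # Sorting: order the output by the new column.
    return sorted(combined, key=lambda r: r["full_name"])


def load(rows, destination):
    """Append the transformed rows to the destination; return the count."""
    destination.extend(rows)
    return len(rows)


def run_etl(source, destination):
    return load(transform(extract(source)), destination)


source_rows = [
    {"id": 1, "first_name": "Grace", "last_name": "Hopper", "active": True},
    {"id": 2, "first_name": "Alan", "last_name": "Turing", "active": False},
    {"id": 3, "first_name": "Ada", "last_name": "Lovelace", "active": True},
]
destination_rows = []
loaded = run_etl(source_rows, destination_rows)
```

Here the output schema differs from the input schema: two columns are combined into one and the `active` column is dropped. Both schemas are fixed in the code, so a change to either would require editing `transform` by hand.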
The input and output schemas employed by an ETL system are typically fixed. To accommodate changes in the schemas, a developer may modify the schemas as needed. In some systems, portions of a schema may be processed dynamically by the ETL system. However, such dynamic processing may result in inefficient implementations of dataflows. For example, the memory blocks used may not be optimal for the particular dataflow.