Embodiments may generally relate to methods and systems for an extract, transform, load (ETL) process. More particularly, some embodiments are concerned with providing efficient adaptive caching for high-volume ETL processes.
An ETL process may be used to extract data from one or more source databases, transform the data, and load the transformed data into one or more different databases. The ETL process may be performed for a variety of purposes, including, for example, analytics, improving reporting query performance, and data aggregation. A Manufacturing Execution System (MES) is a software system that may be used to control production activities in a factory environment. MES applications may be used to support, for example, real-time production control, data collection, and reporting.
In some instances, one feature of ETL processes in a MES environment may be a requirement to look up a value based on a “key”, including retrieving a natural key given a foreign key. As used herein, a key is a value that uniquely identifies an entity or is otherwise associated with another entity. For example, a person's social security number may serve as a key that identifies the complete record of an individual in a database. In this manner, a query using keys may be executed against the database to retrieve the records specified in the query. Values may be retrieved based on keys many times for each inbound row in an ETL process, resulting in frequent database accesses to look up the key-value pairs.
It is not uncommon for an ETL process to invoke millions of key-value pair lookups. In an ETL process where millions of rows are processed, each lookup may result in a database access. Because source databases may typically be distributed across remote computers, each database access may incur the overhead of a network round trip, which may have a dramatic negative effect on the overall speed of the ETL process.
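By way of illustration only, the following Python sketch shows how a per-row lookup may multiply database accesses. It is a simplified model under stated assumptions, not a description of any particular embodiment: the in-memory SQLite database stands in for a remote source database, and the `product` table, its columns, the `lookup_natural_key` function, and the inbound rows are all hypothetical names introduced for the example.

```python
import sqlite3

# Hypothetical stand-in for a remote source database. In practice, each
# query issued below would also incur a network round trip.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_code TEXT)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [(1, "WIDGET-A"), (2, "WIDGET-B"), (3, "WIDGET-C")],
)

db_accesses = 0  # counts every query sent to the source database


def lookup_natural_key(foreign_key):
    """Resolve a foreign key to its natural key with one database access."""
    global db_accesses
    db_accesses += 1
    row = conn.execute(
        "SELECT product_code FROM product WHERE product_id = ?",
        (foreign_key,),
    ).fetchone()
    return row[0] if row else None


# Hypothetical inbound rows: (row_id, product foreign key).
inbound_rows = [(i, (i % 3) + 1) for i in range(1000)]

# Transform step: every inbound row triggers its own lookup.
transformed = [(row_id, lookup_natural_key(fk)) for row_id, fk in inbound_rows]

distinct_keys = len({fk for _, fk in inbound_rows})
print(db_accesses, distinct_keys)  # prints "1000 3"
```

In this sketch, 1000 inbound rows produce 1000 database accesses even though only three distinct keys are ever requested. With millions of rows and several lookups per row, the same pattern may yield millions of largely repetitive round trips.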
As such, there exists a need for a system, method, and computer-executable program that facilitates efficient ETL processing.