Certain types of data warehouse services are designed to handle large amounts of data for analytical purposes, but to handle only a few queries at a time. In such services, it might be necessary to deploy a multitude of database clusters in order to meet service level agreements (“SLAs”) and provide the response times demanded by users. In these configurations, each of the database clusters can be configured to serve a specific purpose in order to maintain consistent performance and delivery of data. In order to configure each of the database clusters for operation, it might also be necessary to deliver a core set of database tables to the database clusters on a periodic (e.g., daily) basis.
Several technical problems can result from the highly distributed data warehouse architecture described above. First, because each database cluster is designated for a specific purpose, a significant percentage of the database clusters can sit idle for a significant percentage of the time. Moreover, for the same reason, it can be difficult to load balance queries among the database clusters. Additionally, it can be difficult to add new database clusters or to re-size existing database clusters.
Second, when delivering database tables to database clusters such as those described above, a per-cluster pipeline can be utilized to execute various activities that load the database tables from a source system to the database clusters. Despite data quality checks and auditing mechanisms, it is possible that problems with input data are discovered only after all of the data has been loaded to the database clusters. As a result, it might be necessary to re-run a large number of activities to correct the problem. Depending upon the number of corrupted input data files received from the source system, this might result in hundreds or even thousands of jobs that must be re-run. This can be a time-consuming and highly error-prone process.
The disclosure made herein is presented with respect to these and other considerations.