This invention relates to a method and apparatus for integrating data transform testing with a data transform tool, such as one based on an extract, transform, load (ETL) or extract, load, transform (ELT) architecture.
At its simplest level, a data transform job is a process that reads data from a source (such as a database), transforms it (for example, by removing trailing spaces), and finally writes it to a target (such as a file). In a large organization, a given data transform job environment may have thousands of jobs that are relied on to run the organization. Because such jobs are critical, there is understandably a significant investment of both time and resources in ensuring that they produce the correct results.
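As an illustration only, the read, transform, write pattern described above might be sketched as follows. This is a minimal sketch, not the tool's actual implementation: the `customers` table, its columns, and the function names `strip_trailing_spaces`, `run_job` and `demo` are hypothetical, and an in-memory SQLite database and a CSV text stream stand in for the source and target.

```python
import csv
import io
import sqlite3


def strip_trailing_spaces(row):
    """Transform step: remove trailing spaces from every text field."""
    return tuple(v.rstrip(" ") if isinstance(v, str) else v for v in row)


def run_job(conn, target):
    """Read rows from the source table, transform them, write CSV to target.

    Returns the number of rows written.
    """
    writer = csv.writer(target, lineterminator="\n")
    count = 0
    # Source: a hypothetical 'customers' table (assumed for this sketch).
    for row in conn.execute("SELECT id, name FROM customers ORDER BY id"):
        writer.writerow(strip_trailing_spaces(row))
        count += 1
    return count


def demo():
    """Build a small in-memory source, run the job, return the target text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Ada   "), (2, "Grace ")],
    )
    target = io.StringIO()  # stands in for a target file
    run_job(conn, target)
    conn.close()
    return target.getvalue()
```

Keeping the transform in its own function means the job's logic can be exercised with a few sample rows, without a real source or target.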
The typical life-cycle for a set of data transform jobs is to move from a development system where they are first created, to a formal test system where they are verified to be functionally correct, and finally to a production system where they ‘go live’ and are run as part of an automated schedule. Any subsequent modification to these jobs requires that they go back to the development system and then be re-verified on the test system before they can be moved back on to the production system.
A reasonably sized project consisting of a few hundred jobs can require many months in a test phase. Much of this time is spent simulating the production environment and, where necessary, sending jobs back to the development environment to fix defects. Every job that has a defect needs to be fixed and re-tested, often requiring downstream jobs to be re-tested as well. The main downside of such an iterative development and testing cycle is the time it takes to verify that all jobs are functionally correct and can therefore be moved into the production environment. Managing change control can also be a big problem. For example, if jobs that are in production need to be modified to cope with changes to business requirements, then the development and test cycle needs to be restarted. This delays implementation of the business changes and carries the risk that unrelated functionality that is relied upon is inadvertently broken. Job developers will typically perform ad-hoc unit testing of the logic in their jobs before passing them over for a formal testing phase. This can be done by creating temporary copies of the job.