IBM WebSphere Data Integration Suite comprises a job design tool that is used, via a graphical user interface (GUI), to design data flows between stages; these data flows are known within the product as “Jobs”. An example GUI window 10 produced by the job design tool is shown in FIG. 1. GUI window 10 presents stages 12 and links 14 between the stages. Stages 12 represent data sources, data targets, or data processing components, all collectively known as “stages”. The links 14 connect pairs of stages and represent the (one-way) flow of data between them. A link 14 has a stage 12 at each end; each end is known as an “input” or an “output” of its stage, depending on whether the stage is at the target end or the source end of the link 14, respectively. Data is deemed to flow along a link 14 as records, whose structure is declared via a set of “column” definitions attached to the link or, in some cases, to the stage being traversed. Stages 12 and links 14 have user-assigned names, and have types that are indicated by different icons.
Jobs, stages 12, links 14 and columns all have “Properties” that further define their behaviour. The job design tool allows the user to drag and drop stages 12 and links 14 onto a “canvas” that represents the overall job design, then to navigate the canvas, select a stage 12 by pointing at it, and open a properties editor that drills down into the stage's link-level inputs and outputs, and their columns, to edit the various properties. In this way very complicated data flow graphs can be built up, containing several levels and large amounts of metadata.
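By way of illustration only, the hierarchy described above (a job containing stages and links, links carrying columns, and every level carrying properties) can be sketched as a small data model. The class names, field names and the example stage and link names below are assumptions made for this sketch; they are not taken from the product.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """A column definition describing one field of the records on a link."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Link:
    """A one-way data flow from a source stage to a target stage."""
    name: str
    source: str  # name of the stage for which this link is an output
    target: str  # name of the stage for which this link is an input
    columns: list = field(default_factory=list)
    properties: dict = field(default_factory=dict)


@dataclass
class Stage:
    """A data source, data target, or data processing component."""
    name: str
    stage_type: str
    properties: dict = field(default_factory=dict)


@dataclass
class Job:
    """A job design: named stages connected by links."""
    name: str
    stages: dict = field(default_factory=dict)
    links: list = field(default_factory=list)
    properties: dict = field(default_factory=dict)

    def add_stage(self, stage):
        self.stages[stage.name] = stage

    def add_link(self, link):
        # A link must have an existing stage at each end.
        if link.source not in self.stages or link.target not in self.stages:
            raise ValueError(f"link {link.name!r} refers to an unknown stage")
        self.links.append(link)

    def inputs(self, stage_name):
        """Links for which the named stage is at the target end."""
        return [l for l in self.links if l.target == stage_name]

    def outputs(self, stage_name):
        """Links for which the named stage is at the source end."""
        return [l for l in self.links if l.source == stage_name]


# Hypothetical three-stage job: Extract -> Transform -> Load.
job = Job("example_job")
for stage_name, stage_type in [("Extract", "source"),
                               ("Transform", "processing"),
                               ("Load", "target")]:
    job.add_stage(Stage(stage_name, stage_type))
job.add_link(Link("raw_rows", "Extract", "Transform", [Column("id")]))
job.add_link(Link("clean_rows", "Transform", "Load", [Column("id")]))
```

In this sketch the same link appears as an output of one stage and as an input of the next, which mirrors the naming of link ends described above.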
A perennial issue for designers has been how to compare versions of a job design in a genuinely useful way. The current approach is to export a job's overall metadata as an XML representation, and to use a standard XML-oriented diff tool to compare the two XML documents generated from two copies of the job. The problem with this approach is that, except in trivial cases, it gives the designer insufficient context to compare the designs and to distinguish differences in structure from differences in properties. In addition, the output often includes unnecessary detail about parts of the design that have not changed.
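The loss of context can be illustrated with a minimal sketch. The XML layout produced by `export_job` below is hypothetical and is not the product's actual export format, and a line-based diff from Python's `difflib` is used as a stand-in for an XML-oriented diff tool.

```python
import difflib
from xml.sax.saxutils import escape, quoteattr


def export_job(stages):
    """Serialise {stage name: {property: value}} as lines of XML.

    The element and attribute names are assumptions made for this sketch.
    """
    lines = ["<Job>"]
    for name, props in stages.items():
        lines.append('  <Record Type="Stage">')
        lines.append(f'    <Property Name="Name">{escape(name)}</Property>')
        for key, value in props.items():
            lines.append(
                f"    <Property Name={quoteattr(key)}>{escape(value)}</Property>"
            )
        lines.append("  </Record>")
    lines.append("</Job>")
    return lines


# Two copies of a job that differ in a single property of the "Load" stage.
old = export_job({"Extract": {"Table": "ORDERS"},
                  "Load": {"Table": "ORDERS_DW", "Mode": "append"}})
new = export_job({"Extract": {"Table": "ORDERS"},
                  "Load": {"Table": "ORDERS_DW", "Mode": "truncate"}})

# Keep only the changed lines, dropping the diff's file and hunk headers.
changed = [
    line
    for line in difflib.unified_diff(old, new, lineterm="", n=0)
    if line[0] in "+-" and not line.startswith(("+++", "---"))
]

for line in changed:
    print(line)
```

The changed lines report that a `Mode` property went from `append` to `truncate`, but nothing in them identifies the stage that owns the property. Adding surrounding context lines to the diff helps only when the owning element happens to be nearby, and it does so at the cost of showing unchanged detail.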