Enterprise Information Management (EIM) refers to the processes and tools for managing and consolidating data. Data often resides either in a data source that lacks the desired structure or consistency, or in multiple data sources with differing structures and consistency standards. To create a reliable version of this data, the data can be extracted, transformed, and loaded into either a physical or virtual target data source. The transform process may apply mapping logic or more complex logic to modify the data before it is consolidated. This target data source can then be used for business intelligence (BI), reporting, or other purposes.
The process of migrating data from a source (e.g., a database) to a target (e.g., another database, a data store, a data mart, or a data warehouse) is sometimes referred to as Extract, Transform and Load, abbreviated ETL. ETL is a specific data transformation process. Extracting refers to the process of reading the data from a source (e.g., a database). Transforming is the process of converting the extracted data from its previous form into the form it needs to be in, and cleansing it, so that it can be placed in the target (e.g., a new database, data mart, or data warehouse). Transformation may involve applying rules, consulting lookup tables, or combining the data with other data. Loading is the process of writing the data into the target.
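By way of illustration only, the three ETL steps described above might be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation of any particular ETL product: the table names (`customers`, `dim_customer`), the `COUNTRY_LOOKUP` table, and the function names are hypothetical, and in-memory SQLite databases stand in for the source and the target.

```python
import sqlite3

# Hypothetical lookup table consulted during the transform step.
COUNTRY_LOOKUP = {"US": "United States", "DE": "Germany"}


def extract(source):
    """Extract: read the raw rows from the source database."""
    query = "SELECT id, name, country_code FROM customers ORDER BY id"
    return source.execute(query).fetchall()


def transform(rows):
    """Transform: cleanse names and map country codes via the lookup table."""
    return [
        (row_id, name.strip().title(), COUNTRY_LOOKUP.get(code.upper(), "Unknown"))
        for row_id, name, code in rows
    ]


def load(target, rows):
    """Load: write the transformed rows into the target table."""
    target.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)", rows)
    target.commit()


def run_etl():
    # Source: a database whose data lacks the desired consistency.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE customers (id INTEGER, name TEXT, country_code TEXT)"
    )
    source.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "  alice smith ", "us"), (2, "BOB JONES", "DE"), (3, "carol", "xx")],
    )

    # Target: a separate database standing in for a data mart or warehouse.
    target = sqlite3.connect(":memory:")
    target.execute(
        "CREATE TABLE dim_customer (id INTEGER, name TEXT, country TEXT)"
    )

    load(target, transform(extract(source)))
    return target.execute(
        "SELECT id, name, country FROM dim_customer ORDER BY id"
    ).fetchall()
```

In this sketch the transform step combines a cleansing rule (trimming and normalizing the case of names) with a lookup table (country codes), and unmapped codes fall back to a placeholder value rather than being dropped.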
The process of migrating from a source to a “virtual” data warehouse is sometimes referred to as Enterprise Information Integration (EII). EII is the process of selecting and combining data from multiple systems in “real time”, without storing it on disk, which enables “on the fly” transformation in order to create a “virtual” data warehouse.
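The contrast with ETL can be illustrated by a short sketch in which the combined data is produced only when it is requested and is never written to a target. The sketch is offered under assumptions: the two in-memory SQLite databases stand in for separate systems, and the names `make_source`, `virtual_customer_orders`, `customers`, and `orders` are hypothetical.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an in-memory database standing in for one source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


def virtual_customer_orders(crm, sales):
    """Yield combined rows at query time; no target table is ever written."""
    # Read customer names from the first system.
    names = dict(crm.execute("SELECT id, name FROM customers"))
    # Aggregate orders in the second system and combine on the fly.
    query = (
        "SELECT customer_id, SUM(amount) FROM orders "
        "GROUP BY customer_id ORDER BY customer_id"
    )
    for customer_id, total in sales.execute(query):
        yield (names.get(customer_id, "unknown"), total)


def demo():
    crm = make_source(
        "CREATE TABLE customers (id INTEGER, name TEXT)",
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Alice"), (2, "Bob")],
    )
    sales = make_source(
        "CREATE TABLE orders (customer_id INTEGER, amount INTEGER)",
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 10), (1, 5), (2, 7)],
    )
    return list(virtual_customer_orders(crm, sales))
```

Because the generator reads both sources each time it is iterated, a later change in either source is reflected in the next query without any reload step, which is the sense in which the combined view is “virtual”.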
In the cases of both ETL and EII, it can be difficult for a user designing mappings to visualize the relationships between the data sources that supply the data to the target data warehouse or target virtual data warehouse.
Current technologies for visualizing these data operations tend to focus on data flow and transformation of the data within this process rather than on the specific relationships between the data sources. In a situation where multiple data sources supply the data for a target, the relationships between the data sources are not clearly illustrated through Graphical User Interface (GUI) displays. Although the data sources that are combined and the transforms applied to them may be displayed, it is generally not possible to determine the relationships, or absence of relationships, between data sources, particularly when there are multiple data sources. When data sources are combined at different levels or stages in the processing of the source data to construct the target data, it is not possible to easily view the relationships between data sources in prior art visualizations.
Often the tree structure of related data sources makes it difficult to determine dependencies without visual indicators of broken links. Prior art approaches also make it difficult to assess the depth of the links.
In view of the foregoing, it would be desirable to provide improved techniques for relating graphical representations of data tables.