In most enterprises, such as mobile communication operators, information is spread over many different data repositories. It is not unusual that data stored in different repositories is duplicated or at least has the same meaning. When data that is expected to be the same is, for some reason, not the same, problems can occur. In a mobile communication network, for example, the user of a mobile phone may be unable to make a phone call, or the operator may be unable to charge the customer. Thus, inconsistent data can cause a lot of trouble. According to investigations of the applicant, the average revenue leakage of a mobile communication operator is approximately 2%, and a large part of this revenue leakage is a direct or indirect consequence of inconsistent data.
There are tools on the market today that scan data repositories in order to find data inconsistencies, also called data deviations. One common problem for such tools is that they have to be instructed what to look for in the data repositories. Since the data stored in each data repository has its own data structure, the tool must be instructed separately for each combination of data repositories that is to be scanned. That is, the tool needs instructions about the data models of the data repositories and also about how the data models relate to each other. However, different systems or repositories may come from different vendors, and it can be hard to get access to documentation that describes the data models. Another problem is that the data models are often so complex that even if someone knows one data model, or has access to its description, it is hard to tell how it relates to another data model. A further issue is that, in order to find a data deviation, it may also be necessary to understand what is considered to be a deviation and what is not.
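The kind of manual instruction such a tool requires can be illustrated with a minimal sketch. The repositories, field names, values, and the `find_deviations` helper below are all hypothetical and are not taken from any particular product; the sketch only shows that both the correspondence between fields and the rule for when two values count as equal have to be written by hand for each pair of repositories.

```python
# Hypothetical records from two repositories describing the same subscribers.
# All field names and values are invented for illustration only.
provisioning = {
    "46701234567": {"msisdn": "46701234567", "svc_state": "ACTIVE"},
    "46707654321": {"msisdn": "46707654321", "svc_state": "BARRED"},
}
billing = {
    "46701234567": {"phone_no": "+46701234567", "status": "active"},
    "46707654321": {"phone_no": "+46707654321", "status": "active"},
}

# Hand-written instruction for this one repository pair: which fields
# correspond, and how each side is normalised before comparison.
FIELD_MAP = [
    ("msisdn", "phone_no", lambda v: v, lambda v: v.lstrip("+")),
    ("svc_state", "status", str.lower, str.lower),
]


def find_deviations(repo_a, repo_b, field_map):
    """Compare records present in both repositories, field by mapped field."""
    deviations = []
    for key in sorted(repo_a.keys() & repo_b.keys()):
        for field_a, field_b, norm_a, norm_b in field_map:
            value_a = norm_a(repo_a[key][field_a])
            value_b = norm_b(repo_b[key][field_b])
            if value_a != value_b:
                deviations.append((key, field_a, value_a, field_b, value_b))
    return deviations


deviations = find_deviations(provisioning, billing, FIELD_MAP)
for deviation in deviations:
    print(deviation)
```

In this sketch, `FIELD_MAP` encodes knowledge of both data models and of their relationship. A third repository, or a change in either data model, would require a new hand-written mapping, which is the burden described above.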
Furthermore, data in data repositories may change over time. In such cases, what is considered to be a data deviation may change as well.
Consequently, there is a need for a tool for efficiently detecting data deviations between data of different data repositories.