In most enterprises, such as mobile communication operators, information is spread over many different data sources. It is not unusual that data stored in different sources is duplicated, or at least represents the same thing. When data that is expected to be the same is, for some reason, not the same, problems can occur. In a mobile communication network, for example, the user of a mobile phone may be unable to make a phone call, or the operator may be unable to charge the customer. Thus, inconsistent data can cause a lot of trouble. According to investigations of the applicant, the average revenue leakage of a mobile communications operator is approximately 2%, and a large part of this leakage is a direct or indirect consequence of inconsistent data leading to ambiguous registrations of communications usage, which therefore cannot be charged for.
There are tools on the market today that scan data sources in order to find data inconsistencies, or data deviations. One common problem with such tools is that they have to be instructed what to look for in the data sources. Since the data stored in each data source has its own data structure, the tool must be instructed separately for each combination of data sources that is to be scanned. That is, the tool needs instructions about the data models of the data sources and also about how the data models relate to each other. However, different systems or sources may come from different vendors, and it can be hard to get access to documentation that describes the data models. Another problem is that the data models are often so complex that even if someone knows, or has access to a description of, one data model, it is hard to tell how it relates to another data model. A further issue is that, in order to find a data deviation, it may also be necessary to understand what is considered to be a deviation and what is not.
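The kind of instruction such a tool requires can be illustrated with a minimal sketch. The sketch is not taken from any particular product; the two sources, the field names (`msisdn`, `svc_state`, `phone_number`, `status`), the mapping `FIELD_MAP` and the functions `normalize` and `find_deviations` are all hypothetical and assumed only for illustration.

```python
# Hypothetical records from two data sources that describe the same
# subscribers under different data models (all names and values are made up).
network_source = {
    "46700000001": {"msisdn": "46700000001", "svc_state": "ACTIVE"},
    "46700000002": {"msisdn": "46700000002", "svc_state": "BARRED"},
}
billing_source = {
    "46700000001": {"phone_number": "46700000001", "status": "active"},
    "46700000002": {"phone_number": "46700000002", "status": "active"},
}

# The instruction the tool must be given by hand: which field in the first
# data model corresponds to which field in the second data model.
FIELD_MAP = {"msisdn": "phone_number", "svc_state": "status"}


def normalize(value):
    """Assumed comparison rule: ignore case and surrounding whitespace."""
    return str(value).strip().lower()


def find_deviations(source_a, source_b, field_map):
    """Return (key, field, value_a, value_b) for every mapped field that differs."""
    deviations = []
    # Visit every key that occurs in at least one of the two sources.
    for key in sorted(source_a.keys() | source_b.keys()):
        rec_a, rec_b = source_a.get(key), source_b.get(key)
        if rec_a is None or rec_b is None:
            # The record exists in only one of the sources.
            deviations.append((key, "<record>", rec_a, rec_b))
            continue
        for field_a, field_b in field_map.items():
            value_a, value_b = rec_a.get(field_a), rec_b.get(field_b)
            if normalize(value_a) != normalize(value_b):
                deviations.append((key, field_a, value_a, value_b))
    return deviations


print(find_deviations(network_source, billing_source, FIELD_MAP))
```

In this sketch, both the mapping and the comparison rule have to be written by someone who understands both data models, and a new mapping is needed for every further pair of sources. Whether a barred subscriber who is active in billing is a true deviation is likewise a judgment that the sketch simply assumes.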
Also, data in data sources may change over time. In such cases, what is considered to be a data deviation may also change over time.
Consequently, there is a need for a tool for efficiently detecting data deviations between data of different data sources.