It is a well-known problem in the computer domain today that data saved in data repositories may be inconsistent. In theory, all data in all data sources should be normalized. In reality, however, this is not the case. The same data may be saved in many places, either in one and the same repository or in different repositories. If, for example, two data posts should be equal but one of them is wrongly entered, these data posts are inconsistent. In another example, two data posts initially have equal value(s). If one of the data posts is later updated but the other is not, the data in these two data posts has become inconsistent. Inconsistent data could mean that at least two data posts contradict each other. A lack of data could also be considered an inconsistency.
To keep control of data inconsistencies, specific data analyzing tools are used to detect potential inconsistencies. However, searching for and detecting data inconsistencies may take a long time and may consume a large amount of computing resources.
For example, large mobile operators have data about their subscribers stored in a Home Location Register (HLR). If such subscriber data from a large mobile operator is exported to a data sheet such as a Microsoft® Excel sheet, the sheet may be approximately 150 to 200 columns wide and as much as 100,000,000 rows deep. In order to detect all possible inconsistencies, all 100,000,000 rows must be analyzed by a data analyzing tool. Such an analysis may become quite heavy from a processing point of view. In the domain of mobile telephony, a redundancy control is often performed between a primary HLR and a secondary HLR to detect inconsistencies, such as redundancies and possible changes in data that may occur, e.g., when performing a backup between the primary and the secondary HLR. Such inconsistencies may be changes in the primary HLR that have not been replicated to the secondary HLR. In such a redundancy control, all columns of all rows must be verified, which could mean 15,000,000,000 to 20,000,000,000 comparisons (100,000,000 rows times 150 to 200 columns). Such a redundancy control takes a long time and requires a large amount of computing resources. In another, even more interesting example, many criteria have to be fulfilled in order to determine whether there is an inconsistency, for example when data attributes in a data post are related to each other. In such cases, a large number of comparisons also have to be performed, which consumes considerable computing resources. Consequently, there is a need to perform inconsistency analyses faster and with less computing resources.
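The cost of the exhaustive redundancy control described above can be illustrated with a minimal sketch. It assumes that each HLR export is available as an in-memory mapping from a subscriber identifier to a row of column values. The function names, the column names and the sample data are hypothetical and chosen only for illustration; they do not describe any actual HLR interface or data model.

```python
from typing import Dict, List, Optional, Tuple

# One exported row: column name -> value (None models missing data).
Record = Dict[str, Optional[str]]


def full_redundancy_check(
    primary: Dict[str, Record],
    secondary: Dict[str, Record],
    columns: List[str],
) -> Tuple[List[Tuple[str, str]], int]:
    """Compare every column of every primary row with its secondary copy.

    Returns the (subscriber_id, column) pairs that differ, together with
    the total number of comparisons performed.
    """
    mismatches: List[Tuple[str, str]] = []
    comparisons = 0
    for subscriber_id, primary_row in primary.items():
        # A row missing from the secondary is treated as an empty row.
        secondary_row = secondary.get(subscriber_id, {})
        for column in columns:
            comparisons += 1
            # A value missing on one side also counts as an inconsistency.
            if primary_row.get(column) != secondary_row.get(column):
                mismatches.append((subscriber_id, column))
    return mismatches, comparisons


def comparison_count(rows: int, columns: int) -> int:
    """Number of comparisons needed by an exhaustive check."""
    return rows * columns


# Hypothetical sample data: one update has not reached the secondary.
primary = {
    "sub-1": {"msisdn": "46700000001", "barring": "off"},
    "sub-2": {"msisdn": "46700000002", "barring": "on"},
}
secondary = {
    "sub-1": {"msisdn": "46700000001", "barring": "off"},
    "sub-2": {"msisdn": "46700000002", "barring": "off"},
}

mismatches, comparisons = full_redundancy_check(
    primary, secondary, ["msisdn", "barring"]
)
print(mismatches, comparisons)
print(comparison_count(100_000_000, 150), comparison_count(100_000_000, 200))
```

In this sketch the number of comparisons grows with the product of rows and columns, regardless of how few rows actually differ, which is the cost that motivates a faster analysis.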