Migrating data from one application to another is a fundamental and widespread business problem because the underlying formats and structures differ. In some respects, data migration between disparate applications is similar to translating from one natural language to another, e.g., English to Russian. While the context may be the same, e.g., “customer owes $25 per month”, the structure used to represent it, in words, grammar, and sentences, is very different. Simply comparing word-for-word across languages does not work. Moreover, simply describing the tense or structure does not facilitate the process.
Additionally, there is little time to certify that a migration performed in a production environment is correct, even if the strategy has been tested in a test environment. Where large volumes of data are involved and sufficient test resources are unavailable, it becomes extremely problematic and, in most cases, impractical to manually inspect each record to confirm that the “intent” has been conveyed correctly.
Conventional techniques for checking data integrity during transmission include the cyclic redundancy check (CRC), a checksum widely used to ensure that files are copied correctly. Networking protocols also use a number of “redundant codes” to ensure correctness of data transmission. Data sampling for testing integrity is also widely used. However, in all of these cases, the data has the same structure at the source and at the destination, so these techniques do not address migrations in which the data is restructured.
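The limitation of checksum-based checks can be illustrated with a minimal sketch. The record layouts, field names, and the helper `crc_of` below are hypothetical and chosen only for illustration; the sketch assumes records are serialized as JSON before the CRC-32 is computed.

```python
import json
import zlib

# Hypothetical customer record as held by the source application.
source_record = {"customer": "Ann Lee", "monthly_due": "25.00"}

# The same facts after migration into a hypothetical target layout.
target_record = {
    "name": {"first": "Ann", "last": "Lee"},
    "due_cents_per_month": 2500,
}


def crc_of(record):
    """Return the CRC-32 of a canonical JSON serialization of the record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return zlib.crc32(payload)


# An unmodified copy serializes to the same bytes, so the checksums agree.
copied_record = dict(source_record)
copy_matches = crc_of(copied_record) == crc_of(source_record)

# The restructured record serializes to different bytes, so the checksums
# disagree even though the intent ("customer owes $25 per month") is intact.
migration_matches = crc_of(target_record) == crc_of(source_record)

print("copy matches:", copy_matches)
print("migration matches:", migration_matches)
```

In this sketch the checksum confirms the plain copy but reports a mismatch for the restructured record, which says nothing about whether the migration preserved the intent of the data.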
What is needed is an architecture that can provide a high level of confidence that data transitioned from a first location to a second location, or from a first state to a second state, can be reliably certified as having been processed correctly.