1. Field
Embodiments of the invention relate to high throughput, reliable replication of transformed data in information systems.
2. Description of the Related Art
Information often exists in multiple locations within an organization, within an enterprise, or even more globally across enterprises. Often this information is represented in a way that is optimal for local applications, and, usually, not all applications require, use, want, or are allowed to access all of the information that exists. However, it is increasingly common for this information to be shared and utilized in local applications. The sharing, transformation, and local use of that information is a root capability that allows a consistent view of data, enables cross-selling scenarios, and enables significant business optimization and identification of new business opportunities.
Conventional systems result in high implementation costs, which are often incurred on every deployment that attempts to integrate one local source system executing the local applications with some other target system. The local target calls tend to be very specific to the target system and often very specific to the configuration of that target system. Conventional systems used in replication and data transformation stop short of integrating with local target calls using parameter values derived from the data transformation. This results in a high cost to program the local target calls, because they must be implemented and integrated within the conventional systems. Because of this high cost, many of these types of projects cannot be funded in the first place, and those that proceed take significant time to implement, test, and deploy, particularly at an enterprise level.
Conventional systems that require strong data integrity rely on traditional transaction serialization of the entire workload, sometimes even using two-phase commit between the two systems to ensure data integrity. This serialization has serious performance consequences on both the source and target systems due to the latency that is introduced into the source system transactions (which now become distributed transactions). Because of this latency, more locks are held for longer durations, which increases serialization and reduces overall throughput. It also impacts high availability, since operation depends on the entire overall system being available all the time. Conventional systems rely on source transaction serialization to order the execution on the target system and, hence, are unable to be fully parallelized to take advantage of high performance clusters.
There is a need for high throughput replication (due to the multiple locations), for transformation (due to “local” representations of data and the applications that run against that data), and for data integrity (consistent, trusted enterprise-wide data). Thus, there is a need for high throughput, reliable replication of transformed data in information systems.