Data-centric business applications use distributed systems to create, modify, transfer, share, acquire, store and/or verify data held at different locations, hereinafter referred to as nodes. Such data includes, for example, data associated with data-centric business applications such as on-line stores, patient portals, network transactions, and database merging.
Distributed systems share and transmit information amongst multiple nodes. Generally speaking, a node is a location that stores information within the distributed system. Examples of nodes include a computer on a network, a server on a network, or one of multiple data storage locations on a computer or server.
Maintaining data integrity when utilizing a distributed system in data-centric business applications becomes problematic when data is created, modified, transferred, shared, acquired, stored and/or verified at one or more nodes across a distributed system. For example, a server computer on a network may be configured to maintain a backup copy of a document created on a client computer.
However, if the server computer and the client computer are not connected via the network when the document is modified on the client computer, then the backup copy stored on the server computer is not updated to reflect the modification. Data integrity is therefore not maintained across the nodes within the distributed system, because the backup copy of the document stored on the server computer is no longer the same as the original document stored on the client computer.
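The disconnected-backup scenario can be illustrated with a minimal sketch. The class names `BackupServer` and `Client`, the `connected` flag, and the document contents are hypothetical and chosen only for illustration; they do not describe any particular backup product.

```python
class BackupServer:
    """Hypothetical server node that holds a backup copy of one document."""

    def __init__(self):
        self.copy = None

    def receive(self, content):
        self.copy = content


class Client:
    """Hypothetical client node that pushes each save to the server when connected."""

    def __init__(self, server):
        self.server = server
        self.connected = True  # assumption: a simple flag stands in for the network link
        self.document = None

    def save(self, content):
        self.document = content
        if self.connected:
            # The backup is updated only while a connection exists.
            self.server.receive(content)


server = BackupServer()
client = Client(server)

client.save("version 1")   # connected: the backup is updated
client.connected = False   # the network connection is lost
client.save("version 2")   # modified offline: the backup is not updated

in_sync = client.document == server.copy
```

After the offline save, the server still holds "version 1" while the client holds "version 2", so `in_sync` is false and the two nodes disagree.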
Synchronization is a conventional approach to solving such data integrity problems. Conventional synchronization provides a way to transfer data directly, point-to-point, from one node to another within a distributed system. In the document backup example explained above, synchronization ensures that the server computer maintains an exact copy of the original document created and/or modified on the client computer.
Thus, synchronization provides a direct file transfer by comparing data bits and/or copying the data from a first location to a second location in order to provide the same document in both locations. This direct file transfer thereby maintains data integrity across the two locations.
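A minimal sketch of such compare-and-copy synchronization follows. It assumes each node can be modeled as an in-memory mapping from document name to bytes; the function name `synchronize` and the sample document are hypothetical, and a real system would operate on files or network transfers.

```python
def synchronize(source: dict, target: dict) -> list:
    """Copy every document whose bytes differ from source to target.

    Returns the names of the documents that were copied.
    """
    copied = []
    for name, data in source.items():
        # Compare the stored bytes; copy only when the target is missing or differs.
        if target.get(name) != data:
            target[name] = data
            copied.append(name)
    return copied


client = {"report.txt": b"draft v2"}   # first location (original document)
server = {"report.txt": b"draft v1"}   # second location (stale backup)

copied = synchronize(client, server)
```

After the call, both locations hold identical bytes for every document in the source, and a second call copies nothing. Note that the sketch only moves data in one direction and knows nothing about what the data means.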
However, mere point-to-point synchronization does not enforce the higher level policies necessary to maintain data integrity across a more complex distributed system. For example, when two databases, each containing a list of employee names, are merged into a single database, mere file transfer results in several data integrity problems, such as duplicated names. Name duplication within a merged database may then lead to internal processing errors related to employee information. For example, Arnold Johnson might not receive his paycheck because the paycheck was sent to another Arnold Johnson. In this exemplary scenario, the distributed system does not maintain data integrity because of the confusion that results from duplicate names. Conventional point-to-point synchronization therefore does not ensure that the distributed system exchanges data in a manner that eventually establishes data integrity.
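The duplicate-name problem can be sketched as follows. The record layout, the `dept` field, the database names, and every employee record other than the name Arnold Johnson are hypothetical values invented for illustration.

```python
from collections import Counter


def naive_merge(db_a, db_b):
    """Merge by plain copying, as a mere file transfer would; no policy is applied."""
    return db_a + db_b


def duplicated_names(records):
    """Return the names that appear on more than one record."""
    counts = Counter(record["name"] for record in records)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical source databases; the two Arnold Johnsons are different people.
hr_db = [
    {"name": "Arnold Johnson", "dept": "Sales"},
    {"name": "Maria Lopez", "dept": "Legal"},
]
payroll_db = [
    {"name": "Arnold Johnson", "dept": "Engineering"},
]

merged = naive_merge(hr_db, payroll_db)
ambiguous = duplicated_names(merged)
```

Every record is transferred faithfully, yet the merged database now contains two records that a lookup by name cannot tell apart. Resolving that ambiguity requires a policy above the level of copying data, which is the gap described above.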