The invention relates generally to computer systems, and deals more particularly with synchronization of data between distributed computer systems.
The Internet is an example of a distributed computer system comprising computers and computer networks of different types. Networks, such as mobile phone networks, corporate intranets and home networks, can also exist outside the Internet. Within all of these networks, computers and other devices can communicate with one another and share resources despite their geographic separation. Such resources may include printers, disk drives, data files, databases and other data objects.
To allow the sharing of a resource, a computer program executing on a computer communicates with other computers by passing messages. For example, a client using HyperText Transfer Protocol (HTTP) specifies a server's URL in a message to request a resource. Then, the server looks up the path name of the requested resource. If the resource exists, the server accesses the resource and sends back the requested data in a reply message to the client device.
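The request and reply exchange described above can be illustrated with a minimal sketch. It is not taken from any particular server implementation: the in-memory `RESOURCES` table, the `handle_request` function and the example URLs are hypothetical names introduced only for illustration, and a real server would perform the lookup against a file system or database.

```python
from typing import Dict, Tuple
from urllib.parse import urlparse

# Hypothetical in-memory table standing in for the server's shared resources.
RESOURCES: Dict[str, bytes] = {
    "/docs/readme.txt": b"hello",
}


def handle_request(url: str, resources: Dict[str, bytes] = RESOURCES) -> Tuple[int, bytes]:
    """Look up the path name of the requested resource and build a reply."""
    # Extract the path name portion of the URL sent by the client.
    path = urlparse(url).path
    if path in resources:
        # The resource exists: reply with its data.
        return 200, resources[path]
    # The resource does not exist: reply with an error status and no data.
    return 404, b""
```

Under these assumptions, calling `handle_request("http://example.com/docs/readme.txt")` yields a success status together with the stored data, while a URL whose path is not in the table yields an error status.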
One example of a distributed system is a cluster or group of servers which all provide the same service/application using the same data to provide load balancing and/or redundancy. In a distributed system it is likely that more than one client device will occasionally want to access the same shared resource at approximately the same time. For proper operation, access to the resource must be synchronized such that the proper data is read or written. The data must be consistent throughout the distributed system. The problem is compounded when transaction volumes are large, when communication times are extended, and when fast performance and high availability are desired.
Web service providers and suppliers currently have a distributed model in the implementation of a web service. One or more web services are deployed across a number of geographically dispersed physical environments. At the same time, each of these physical environments requires a consistent view of some of the data that is common to the web services. It has proven difficult both to synchronize data among the implemented web services and to allow multiple systems to update the same data.
In order to update data across multiple servers, it was known to employ a “two phase commit” protocol. This protocol allows all of the servers involved in a transaction either to accept an update or to roll back an update, thereby maintaining consistent data. To achieve this, one of the servers takes on a coordinator role to ensure the same outcome on all the servers. In two phase commit, a client device sends a request to commit or roll back a transaction to the coordinator server. The coordinator server forwards the request to all other servers which maintain the same data. If the request from any participating server is to abort, i.e. not enter a transaction, the coordinator informs all other participating servers to roll back the transaction before it is considered entered. If the request from a server is to commit a transaction, the coordinator sends a request to all the other participating servers asking if they are prepared to commit the transaction. If a participating server can commit the transaction, it will commit as soon as the appropriate records have been updated in permanent storage and the participating server is prepared to commit.
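The coordinator's role can be illustrated with a minimal single-process sketch of the textbook form of two phase commit. It is an illustration under stated assumptions rather than any vendor's implementation: the `Participant` class, its `can_commit` flag and the `two_phase_commit` function are hypothetical names, and network messaging, timeouts and permanent storage are omitted.

```python
class Participant:
    """Hypothetical participating server holding a copy of the shared data."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: simulates a server's vote
        self.data = {}
        self._pending = None

    def prepare(self, update):
        # Phase 1: stage the update and vote on whether it can be committed.
        if not self.can_commit:
            return False
        self._pending = dict(update)
        return True

    def commit(self):
        # Phase 2 (commit): apply the staged update to the local data.
        if self._pending is not None:
            self.data.update(self._pending)
            self._pending = None

    def rollback(self):
        # Phase 2 (abort): discard the staged update.
        self._pending = None


def two_phase_commit(participants, update):
    """Coordinator logic: commit everywhere only if every participant is prepared."""
    votes = [p.prepare(update) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

In this sketch a single negative vote causes every participant to discard the staged update, so all copies of the data remain identical whichever outcome is reached.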
A disadvantage of using two phase commit is that it requires all the participating servers to support the two phase commit protocol. Even if the participating servers all support the two phase commit protocol, there may be differences in implementation between different vendor solutions across multiple physical environments.
U.S. patent application publication 2002/0188610 discloses a data storage and access system employing clustering. A data management system comprises a plurality of application servers, web servers and data servers. The data management system also includes a session manager directing users accessing the system to a subset of web servers, application servers and data servers based on the characteristics of the users.
U.S. patent application publication 2002/0188610 discloses that there are two forms of replication strategy: master-slave mode and master-master mode. Each has its own conflict resolution algorithm and work delegation technique. The data exists in two different computer systems at the same time.
An object of the present invention is to effectively synchronize data in a distributed computing environment, such as a load balanced environment.