A distributed application (or service) is an application for which multiple copies are distributed among a plurality of servers. An application could be, for example, a database or a cache used to support client operations. Distributing copies of an application among servers in a data center provides advantages such as fault tolerance, faster service, and scalability. Fault tolerance is possible because, if one copy of the application or one of the servers fails, other copies of the application exist to continue servicing clients. Faster service is possible because multiple copies of the application exist, thus allowing clients to connect to a server experiencing a relatively low load at the time of connection. Scalability is possible because additional servers can be added to accommodate an increase in clients.
Distributed applications pose challenges that do not arise for non-distributed applications existing as a single copy. All copies of a distributed application must be synchronized in a consistent manner so that clients accessing different copies of the application at the same time receive identical responses from their respective copies. For example, if a client modifies a database on one server, copies of that database on other servers must be updated before those servers service further requests, so that no request is serviced from an outdated version of the database. One approach is to provide a local storage unit on each server in which every request to the application is logged, thus tracking modifications to the application. These local log entries are subsequently shared with other servers for synchronization. However, difficulties arise in ascertaining the exact order of writes to various copies of the application on different servers.
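By way of illustration only, the ordering difficulty described above can be sketched as follows. The sketch assumes a key-value application and two hypothetical servers, "A" and "B", each keeping its own local log with its own sequence numbers. The names `LogEntry` and `apply_in_order` are introduced here solely for illustration and do not appear in the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """One logged write, as recorded on a single server's local storage unit."""
    server: str     # hypothetical server name
    local_seq: int  # sequence number meaningful only on that server
    key: str
    value: str


def apply_in_order(entries):
    """Apply writes in the given order; a later write to a key overwrites an earlier one."""
    state = {}
    for entry in entries:
        state[entry.key] = entry.value
    return state


# Each server logged one write to the same key. Both entries carry local
# sequence number 1, so the logs alone do not say which write came first.
log_a = [LogEntry("A", 1, "x", "from-A")]
log_b = [LogEntry("B", 1, "x", "from-B")]

a_then_b = apply_in_order(log_a + log_b)
b_then_a = apply_in_order(log_b + log_a)

# The two merge orders yield different final states.
print(a_then_b)
print(b_then_a)
```

Because the sequence numbers are local to each server, both merge orders are equally consistent with the logs, yet they produce different copies of the application.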
Another challenge is rebuilding an application on a server after the application has failed, for example because of a power failure or an attack by a malicious user. A local storage unit on each server tracks modifications to the application, allowing the application to be restored from a base state. However, providing a storage unit for each server tends to limit scalability because every additional server requires an additional storage unit, whether or not storage space is available on other storage units already connected to servers.
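By way of illustration only, restoring an application from a base state by replaying logged modifications can be sketched as follows. The sketch assumes a key-value application and a log of `("set" | "delete", key, value)` tuples. The function name `rebuild` and the log format are hypothetical and are not drawn from the description above.

```python
def rebuild(base_state, log):
    """Return the state obtained by replaying logged modifications over a base state."""
    state = dict(base_state)  # copy so the base state itself is left unchanged
    for op, key, value in log:
        if op == "set":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)  # deleting a missing key is treated as a no-op
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return state


# Base state captured before the failure, plus the modifications logged since.
base = {"x": 1}
log = [("set", "y", 2), ("set", "x", 3), ("delete", "y", None)]

restored = rebuild(base, log)
print(restored)
```

Replaying the log in its recorded order over the base state recovers the state the application held at the time of the last logged modification.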
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. It is contemplated that elements disclosed in one embodiment may be beneficially utilized in other embodiments without specific recitation.