Cloud computing is a model for enabling on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. Cloud computing providers may offer infrastructure as a service (IaaS) and platform as a service (PaaS). One service that has been challenging to move to the cloud computing model is managed data storage, which is conventionally performed by databases. Data storage is stateful, which makes data as a service (DaaS) more challenging than other categories of cloud computing. Traditional data storage uses databases such as structured query language (SQL) and not only SQL (NoSQL) databases. Databases support the XA (X/Open XA) architecture and XA transactions, which are transactions that consist of multiple operations that access resources. For example, a banking application may conduct an XA transaction that consists of two operations: (1) deduct money from a first bank account and (2) add the deducted money to a second bank account. Typically, either both operations of the XA transaction succeed and become permanent, or neither takes effect, in which case the data in an in-memory data grid relating to the bank accounts can be rolled back to a previous state as if the transaction never occurred.
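The all-or-nothing behavior of the bank transfer example can be illustrated with a minimal sketch. It is not an XA implementation: there is no transaction manager and no separate resource managers, and the names `transfer`, `accounts`, and `InsufficientFunds` are hypothetical, introduced only for illustration. It assumes the account balances live in a single in-process dictionary.

```python
class InsufficientFunds(Exception):
    """Raised when a withdrawal would overdraw an account (hypothetical)."""


def transfer(accounts, source, target, amount):
    """Move `amount` from `source` to `target`; apply both operations or neither.

    Returns True if both operations were applied, False if the
    transaction was rolled back.
    """
    snapshot = dict(accounts)  # previous state, kept for rollback
    try:
        # Operation 1: deduct money from the first account.
        if accounts[source] < amount:
            raise InsufficientFunds(source)
        accounts[source] -= amount
        # Operation 2: add the deducted money to the second account.
        # Raises KeyError if the target account does not exist.
        accounts[target] += amount
    except Exception:
        # Roll back: restore the state from before the transaction,
        # as if the transaction never occurred.
        accounts.clear()
        accounts.update(snapshot)
        return False
    return True
```

If the second operation fails after the first has already been applied, the snapshot restores the first account's balance, so a caller never observes a state in which money was deducted but not credited.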
In traditional data storage systems, such as databases, consistency is usually achieved by logging information to disk. If a process fails after a transaction is prepared but before it is completed, then a recovery log is read in order to move the system into a consistent state. The conventional disk-based recovery-log approach is costly because it requires a disk write.
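A disk-based recovery log of this kind might look roughly like the following sketch. The record layout (JSON lines with `tx`, `state`, and `ops` fields) and the function names `log_record` and `recover` are assumptions made for illustration, not the format of any particular database. The `os.fsync` call is the disk write that makes the approach costly.

```python
import json
import os


def log_record(path, record):
    """Append one record to the log and force it to disk."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())  # the costly step: wait for the disk write


def recover(path):
    """Read the log and return transactions prepared but never completed.

    The result maps each in-doubt transaction id to its logged operations,
    so a recovery procedure can decide to complete or roll back each one.
    """
    in_doubt = {}
    if not os.path.exists(path):
        return in_doubt
    with open(path, encoding="utf-8") as f:
        for line in f:
            rec = json.loads(line)
            if rec["state"] == "prepared":
                in_doubt[rec["tx"]] = rec["ops"]
            else:
                # Any other state (e.g. "committed") marks completion.
                in_doubt.pop(rec["tx"], None)
    return in_doubt
```

After a failure, a restarted process would call `recover` on the log file; any transaction it returns was prepared but has no completion record, and must be resolved before the system is consistent again.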
Traditional database data storage usually does not work for DaaS because databases do not scale well. Databases tend to run on a single machine or on a few machines in a fixed cluster. Therefore, databases typically are not distributed by nature. This becomes a problem in the cloud because there is no guarantee in a cloud environment that a particular server will be available at any given time. The lack of distribution for databases hampers elasticity and high availability, two key requirements for cloud computing services.
Distributed databases, also known as data grids and in-memory data grids, have been recognized as a better alternative to databases in cloud environments. Data grids can scale up to thousands of nodes. Data grid platforms also improve the scalability of non-cloud applications by removing database bottlenecks and single points of failure. Some data grids may support XA transactions, but not recovery of transaction state data if that state data is lost.