In computer science, ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that transactions are processed reliably. Generally, a transaction can be regarded as a logical unit of work that may include any number of data changes, such as file or database updates. In the context of databases, a single logical operation on the data is normally called a transaction. A transaction may involve one or more operation steps. For example, a transfer of funds from one bank account to another is a single transaction, even though it involves multiple changes such as debiting one account (first operation step) and crediting another (second operation step).
Atomicity requires that each transaction is “all or nothing”: if one part (one or more operation steps) of the transaction fails, the entire transaction fails, and the database state is left unchanged. That is, in an atomic transaction, a series of database operations either all occur, or nothing occurs. To the outside world, a committed transaction appears (by its effects on the database) to be indivisible (“atomic”), and an aborted transaction does not happen (it is “rolled back”). In other words, atomicity means indivisibility and irreducibility. The terms “commit” and “rollback” (or “roll back”) will be described below in more detail.
The consistency property ensures that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including but not limited to constraints, cascades, triggers, and any combination thereof. In database systems, a consistent transaction is one that does not violate any integrity constraints during its execution. If a transaction would leave the database in an illegal state, it is aborted and an error is reported. Consistency ensures that any changes to values in an instance are consistent with changes to other values in the same instance.
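As a minimal sketch of this behaviour, the following Python example uses the standard-library sqlite3 module with an in-memory database. The account table, its CHECK constraint, and the withdraw helper are illustrative assumptions rather than part of any particular system; the point is only that a change which would violate a declared rule is rejected and the transaction is aborted.

```python
import sqlite3

# isolation_level=None lets us issue BEGIN/COMMIT/ROLLBACK explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # hypothetical integrity rule
)
conn.execute("INSERT INTO account VALUES ('X', 50)")


def withdraw(conn, name, amount):
    """Try to debit an account; abort the transaction if a constraint fails."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, name),
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # The update would leave the database in an illegal state.
        conn.execute("ROLLBACK")
        return False


ok = withdraw(conn, "X", 80)  # would make the balance negative
balance = conn.execute(
    "SELECT balance FROM account WHERE name = 'X'"
).fetchone()[0]
```

Here the withdrawal is refused (ok is False) and the stored balance is still 50, so the database moves only between valid states.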
The isolation property ensures that the concurrent execution of transactions results in the same system state that would be obtained if the transactions were executed serially, i.e., one after the other.
Durability means that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors. In a relational database, for instance, once a group of SQL statements has executed and the transaction has committed, the results need to be stored permanently (even if the database crashes immediately thereafter).
In computer science and data management, a commit makes a set of tentative changes permanent. Commits are most commonly issued at the end of a transaction. A COMMIT statement in SQL ends a transaction within a relational database management system (RDBMS) and makes all changes visible to other users. In SQL, the general format of a transaction is to issue a BEGIN WORK statement, one or more SQL statements, and then the COMMIT statement. A COMMIT statement will also release any existing savepoints that may be in use.
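The effect of COMMIT on visibility can be sketched with two connections to the same SQLite database file, using Python's standard sqlite3 module. This is an illustration under assumptions: the note table and the file name are made up, SQLite spells the opening statement BEGIN rather than BEGIN WORK, and the behaviour shown is that of SQLite's default journaling mode.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")  # hypothetical database file
    # Two independent connections stand in for two database users.
    writer = sqlite3.connect(path, isolation_level=None)
    reader = sqlite3.connect(path, isolation_level=None)
    writer.execute("CREATE TABLE note (body TEXT)")

    writer.execute("BEGIN")
    writer.execute("INSERT INTO note VALUES ('tentative')")
    # The change is still tentative, so the other user does not see it.
    seen_before = reader.execute("SELECT count(*) FROM note").fetchall()[0][0]

    writer.execute("COMMIT")
    # After COMMIT the change is permanent and visible to other users.
    seen_after = reader.execute("SELECT count(*) FROM note").fetchall()[0][0]

    writer.close()
    reader.close()
```

In this sketch the reader counts zero rows before the commit and one row after it.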
In terms of transactions, the opposite of a commit is a rollback, which discards the tentative changes of a transaction. In database technologies, a rollback is an operation that returns the database to some previous state. Rollbacks are important for database integrity because they allow the database to be restored to a clean copy even after erroneous operations have been performed. By rolling back any transaction that was active at the time of a database crash, the database is restored to a consistent state. In SQL, ROLLBACK is a command that causes all data changes since the last BEGIN WORK or START TRANSACTION to be discarded by the RDBMS, so that the state of the data is “rolled back” to the way it was before those changes were made. That is, if a ROLLBACK statement is issued, all the work performed since BEGIN WORK was issued is undone.
An example that illustrates the roles of commit and rollback is a money transfer between two checking accounts. In certain scenarios a commit is called an atomic commit to underline that it is an operation in which a set of distinct changes is applied as a single operation. In the transfer, 100 dollars are first removed from account X and then 100 dollars are added to account Y. If the entire operation is not completed as one atomic commit, several problems could occur. First, if the system fails in the middle of the operation, after removing the money from X and before adding it to Y, then 100 dollars have simply disappeared. Second, if the balance of Y is checked before the 100 dollars are added, the wrong balance for Y will be reported. With an atomic commit, neither of these cases can happen: in the case of a system failure, the atomic commit would be rolled back and the money returned to X; in the second case, the request for the balance of Y cannot be served until the atomic commit is fully completed. That is, atomic commits in database systems fulfil two of the key properties of ACID, atomicity and consistency. Consistency is only achieved if each change in the atomic commit is consistent. If the system crashes or shuts down when one operation has completed but the other has not, and there is nothing in place to correct this, the system can be said to lack (transaction) consistency. With a money transfer, it is desirable that either the entire transaction completes or none of it completes. Both of these outcomes keep the balances in check.
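The transfer described above can be sketched in Python with the standard-library sqlite3 module. The account table, the starting balances (500 and 200), and the fail_midway flag that simulates a failure between the debit and the credit are all assumptions made for illustration.

```python
import sqlite3


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from `src` to `dst` as one atomic unit of work."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        if fail_midway:
            # Hypothetical failure after the debit and before the credit.
            raise RuntimeError("simulated failure between debit and credit")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # undo the debit so no money disappears
        raise


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 500), ("Y", 200)])

try:
    transfer(conn, "X", "Y", 100, fail_midway=True)
except RuntimeError:
    pass
after_failure = balances(conn)  # {'X': 500, 'Y': 200}

transfer(conn, "X", "Y", 100)
after_success = balances(conn)  # {'X': 400, 'Y': 300}
```

After the simulated failure the balances are unchanged, and after the successful run both changes are applied together; the total across the two accounts is 700 in every case.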
Guaranteeing ACID properties in a distributed transaction across a distributed database, where no single node is responsible for all data affecting a transaction, presents additional complications. A distributed database is a database in which the storage devices are not all attached to a common processing unit such as a central processing unit (CPU). It is controlled by a distributed database management system (the two together are sometimes called a distributed database system). A distributed database may be stored on multiple computers located in the same physical location, or it may be dispersed over a network of interconnected computers. Unlike parallel systems, in which the processors are tightly coupled and constitute a single database system, a distributed database system comprises loosely coupled sites that share no physical components.
The two-phase commit protocol (2PC) is a type of atomic commitment protocol (ACP). It supports the ACID properties by providing atomicity for distributed transactions: it ensures that every participant in the transaction agrees on whether the transaction should be committed or not. Briefly, in the first phase, one node (the coordinator) interrogates the other nodes (the participants, also called cohorts); only when all reply that they are prepared does the coordinator, in the second phase, formalize (complete) the transaction. If any participant replies that it is not prepared, the coordinator sends a rollback message to all the participants.
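The two phases can be illustrated with a small in-process simulation in Python. This is only a sketch under simplifying assumptions: the Participant class, its can_commit flag, and the two_phase_commit function are hypothetical names, messages are plain method calls, and failures, timeouts, and logging to stable storage are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    name: str
    can_commit: bool = True  # hypothetical: whether this node can prepare
    state: str = "init"

    def prepare(self) -> bool:
        # Phase 1: reply to the coordinator's query with a vote.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants) -> bool:
    """Coordinator logic: returns True if the transaction was committed."""
    # Phase 1: interrogate every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if all are prepared, otherwise roll back everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


happy = [Participant("a"), Participant("b")]
committed = two_phase_commit(happy)

unhappy = [Participant("a"), Participant("b", can_commit=False)]
aborted = not two_phase_commit(unhappy)
```

In the first run every participant ends in the committed state; in the second, a single negative vote causes all participants to end in the aborted state.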
One disadvantage of the two-phase commit protocol is that it is a blocking protocol. If the coordinator fails permanently, some cohorts will never resolve their transactions: after a cohort has sent an agreement message to the coordinator, it blocks until a commit or rollback message is received.
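The blocking behaviour can be sketched in Python with a queue standing in for the network link from the coordinator. The Cohort class and its method names are hypothetical, and the timeout exists only so that the sketch terminates; a real cohort that has voted to commit would have to keep waiting, because it cannot safely commit or abort on its own.

```python
import queue


class Cohort:
    def __init__(self):
        self.inbox = queue.Queue()  # messages from the coordinator
        self.state = "init"

    def vote_yes(self):
        # Agreement message sent: from here on the cohort is "in doubt".
        self.state = "prepared"

    def await_decision(self, timeout):
        """Wait for the coordinator's decision; return True once resolved."""
        try:
            decision = self.inbox.get(timeout=timeout)
        except queue.Empty:
            return False  # no decision arrived; the transaction stays unresolved
        self.state = "committed" if decision == "commit" else "aborted"
        return True


cohort = Cohort()
cohort.vote_yes()

# The coordinator has failed and sends nothing: the cohort cannot resolve.
resolved = cohort.await_decision(timeout=0.05)
state_while_blocked = cohort.state

# Only a message from the coordinator unblocks it.
cohort.inbox.put("commit")
resolved_later = cohort.await_decision(timeout=0.05)
```

While no decision arrives, the cohort remains in the prepared state; it reaches a final state only after the coordinator's message is delivered.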