Generally speaking, in a typical database system that maintains both user data and system data, one may classify data problems into two categories. The first category is physical errors caused by hardware, operating systems, or internals of the database system itself. The errors in the first category may affect both user data and system data. For example, a transaction may fail in the middle of processing due to a system error such as power failure; as a result, various repositories of data, including user data kept by the database system relating to the failed transaction, may be left in an inconsistent state. To reduce incidents of physical errors, one may use reliable hardware, redundancy, backups, robust and stable operating systems, or mature database system products. Furthermore, problems such as a database system being left in an inconsistent state after a system error may be corrected, to a certain extent, using redo and undo logs. Because of technology improvements, physical errors in a database system are nowadays rare and, when a physical error does happen, the database system has effective tools to take corrective actions and to prevent partial, inconsistent data from being persisted in the database system at the physical level.
The other category of data problems is logical errors caused by applications. This category affects mostly user data at a logical level. The integrity properties (atomicity, consistency, isolation, and durability) of each transaction in a set of one or more transactions may have been properly maintained from a transaction processing perspective. However, user data created by certain transactions in the set may be logically erroneous because the application logic behind those transactions is erroneous. For example, where an application that moves funds in Euro currency is used to move funds in other currencies without applying appropriate conversion factors, user data manipulated by such an application would be logically erroneous even though the integrity of each such fund-moving transaction might have been properly maintained. The user data at the physical level would appear to be in a consistent state (e.g., no constraint is violated, no transaction integrity is breached, indexes are correctly maintained, logs are correctly created, system tables are correctly updated, etc.). However, since some funds may be over-transferred (e.g., British pounds sterling) while others may be under-transferred (e.g., Japanese yen), the user data are incorrect in a logical sense (i.e., at a logical level).
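The fund-moving example may be illustrated with a minimal sketch in Python using the standard sqlite3 module. The table layout, the account identifiers, and the conversion figures below are assumptions chosen only for illustration; they do not come from any particular system. Each transfer commits atomically and violates no constraint, yet the amounts moved are logically wrong because the requested Euro amount is applied as if it were already in the account's currency.

```python
import sqlite3

# Assumed, illustrative rates: units of each currency worth 100 EUR.
UNITS_PER_100_EUR = {"EUR": 100, "GBP": 80, "JPY": 12500}


def open_db():
    """Create a hypothetical in-memory accounts table with a balance constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id TEXT PRIMARY KEY,"
        " currency TEXT NOT NULL,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?)",
        [("gbp_a", "GBP", 1000), ("gbp_b", "GBP", 0),
         ("jpy_a", "JPY", 100000), ("jpy_b", "JPY", 0)],
    )
    conn.commit()
    return conn


def intended_units(currency, amount_eur):
    """Units of `currency` that correspond to `amount_eur` under the assumed rates."""
    return amount_eur * UNITS_PER_100_EUR[currency] // 100


def buggy_transfer(conn, src, dst, amount_eur):
    """Move funds atomically, but without applying any conversion factor."""
    units = amount_eur  # logical error: a EUR amount is used as account-currency units
    with conn:  # commits both updates together, or rolls both back on an exception
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (units, src))
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (units, dst))
    return units


def balance(conn, account_id):
    """Return the current balance of one account."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = open_db()
    for src, dst, cur in [("gbp_a", "gbp_b", "GBP"), ("jpy_a", "jpy_b", "JPY")]:
        moved = buggy_transfer(conn, src, dst, 100)
        print(cur, "moved:", moved, "intended:", intended_units(cur, 100))
```

Under these assumed rates, the sterling transfer moves more than intended and the yen transfer moves less, while the total held in each currency is unchanged and every constraint still holds. Nothing at the physical level signals that an error has occurred.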
To fix data problems at the logical level, a database administrator may typically take the database system offline, spend considerable time troubleshooting root causes, and devise corrective measures if feasible. However, such fixing by the database administrator would likely be error-prone because of the difficulty of determining exactly which transactions are involved in a logical error. Furthermore, significant downtime may be incurred under this approach.
Alternatively, a database administrator may simply resort to rolling back the database system to a database image existing at a particular time in the past. This approach has at least two disadvantages. One is that if the particular time is too far in the past, a large amount of good data may be lost. The other is that if the particular time is too recent, the data problems may only be fixed in a partial, inconsistent manner.
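The trade-off in choosing a restore time may be sketched as follows. The transaction history below is hypothetical, and the model is deliberately simplified: it assumes that restoring to a given time simply discards every transaction committed after that time. Under that assumption, when erroneous and good transactions are interleaved, every candidate restore time either loses good data, leaves erroneous data in place, or both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    commit_time: int   # logical commit time (hypothetical units)
    erroneous: bool    # True if the transaction wrote logically wrong data


def rollback_outcome(history, restore_time):
    """Return (good transactions lost, erroneous transactions kept)."""
    kept = [t for t in history if t.commit_time <= restore_time]
    discarded = [t for t in history if t.commit_time > restore_time]
    good_lost = sum(1 for t in discarded if not t.erroneous)
    bad_kept = sum(1 for t in kept if t.erroneous)
    return good_lost, bad_kept


# Hypothetical history: erroneous transactions at times 2 and 4.
HISTORY = [Txn(1, False), Txn(2, True), Txn(3, False), Txn(4, True), Txn(5, False)]

if __name__ == "__main__":
    for restore_time in range(0, 6):
        good_lost, bad_kept = rollback_outcome(HISTORY, restore_time)
        print(f"restore to t={restore_time}: "
              f"good lost={good_lost}, erroneous kept={bad_kept}")
```

In this sketch, restoring to time 1 removes both erroneous transactions but also discards the good transactions at times 3 and 5, whereas restoring to time 3 loses less good data but keeps the erroneous transaction at time 2.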
Therefore, a better mechanism for selectively removing user data changes made by transactions is needed.