In many replicated data stores, for example in the WINDOWS® Filing System (WinFS) data store, metadata identifying stored items is kept after the items themselves are deleted. This metadata is referred to herein as a “tombstone.” Keeping tombstones facilitates tracking deletion of items and propagating item deletion to other data stores through replication and synchronization processes.
WinFS uses a “tombstone table” to track deleted items. While the tombstone table generally serves its purpose, a tombstone is generated for each deleted item and there is no mechanism for removing tombstones. As a result, the tombstone table can grow very large over time and clog the system.
Removing tombstones would solve the problem of ever-expanding tombstone tables, but such removal is easier said than done. Removing tombstones is problematic in scenarios involving multi-master database synchronization.
For example, consider an item that is stored in a first database, and subsequently propagated to three other databases. The item is then deleted from the first database, and a tombstone is placed in the first database's tombstone table. However, the first database is not synchronized with the others for a long period of time, and a hypothetical automated process goes through and removes the tombstone, treating it as sufficiently old to discard.
After the tombstone is deleted, the first database again synchronizes with the other databases. Recall that the other databases still contain the item which was deleted from the first database. However, the first database has no record that the item was deleted, and neither do the other databases. Thus, the item would be propagated back to the first database, thereby “resurrecting” the deleted item and causing data corruption.
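The resurrection scenario described above can be illustrated with a minimal sketch. The sketch is not drawn from WinFS; the `Replica` class, the `sync` function, and the `purge_tombstones` method are hypothetical names introduced solely for illustration, and the pairwise synchronization rule (exchange tombstones first, then copy any missing items) is an assumed simplification of multi-master synchronization.

```python
class Replica:
    """Hypothetical replicated store: live items plus a tombstone table."""

    def __init__(self, name):
        self.name = name
        self.items = {}          # item_id -> payload
        self.tombstones = set()  # ids of items deleted on or known to this replica

    def delete(self, item_id):
        # Deleting an item leaves a tombstone behind.
        self.items.pop(item_id, None)
        self.tombstones.add(item_id)

    def purge_tombstones(self):
        # Naive cleanup (assumed): drop every tombstone unconditionally.
        self.tombstones.clear()


def sync(a, b):
    """Assumed pairwise sync: share tombstones, then copy missing items."""
    merged = a.tombstones | b.tombstones
    for replica in (a, b):
        replica.tombstones = set(merged)
        for item_id in merged:
            replica.items.pop(item_id, None)
    for src, dst in ((a, b), (b, a)):
        for item_id, payload in src.items.items():
            dst.items.setdefault(item_id, payload)


def run_scenario(purge_before_sync):
    first = Replica("first")
    others = [Replica(f"other{i}") for i in range(3)]

    # The item is stored in the first database and propagated to three others.
    first.items["doc"] = "payload"
    for other in others:
        sync(first, other)

    # The item is deleted from the first database, leaving a tombstone.
    first.delete("doc")

    # Optionally remove the tombstone before the next synchronization.
    if purge_before_sync:
        first.purge_tombstones()

    for other in others:
        sync(first, other)

    return "doc" in first.items, ["doc" in other.items for other in others]


if __name__ == "__main__":
    print("tombstone kept:  ", run_scenario(purge_before_sync=False))
    print("tombstone purged:", run_scenario(purge_before_sync=True))
```

In this sketch, when the tombstone is retained the deletion propagates and no replica holds the item after synchronization. When the tombstone is purged first, no replica has any record of the deletion, so the item is copied back to the first replica and remains on all of the others.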
The above-described problem has a number of variations of varying degrees of complexity, all of which ultimately result in unacceptable data corruption. There is a need in the industry for an effective way to clean up tombstones in a setting involving multi-master database synchronization, without loss of convergence.