1. Field
The disclosed embodiments generally relate to techniques for improving performance in database systems. More specifically, the disclosed embodiments relate to a technique for migrating data items from a source cluster to a destination cluster in a database system while the database system continues to process live database traffic.
2. Related Art
As the popularity of a web-based service increases, the service may need to expand its data storage infrastructure to process a larger volume of requests. This expansion typically involves migrating a large amount of data from one database cluster to another. For example, the migration can involve moving half of the user accounts from an original cluster to a new cluster. This enables the new cluster to service requests in parallel with the original cluster, thereby enabling the system to process a larger volume of requests. The performance improvement gained by using multiple database clusters is particularly significant because requests directed to a single database cluster often become bottlenecked waiting for cluster-level write-locks to be released. This waiting time often results in user-perceived latency, which can adversely affect the user's satisfaction with the service.
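By way of illustration only, the following minimal Python sketch shows one way that roughly half of the user accounts could be selected for migration to a new cluster. The text above does not specify how accounts are chosen; the hash-based split, the cluster names, and the function names `cluster_for` and `plan_migration` are all hypothetical assumptions made for this example.

```python
import hashlib

# Hypothetical cluster labels; the source names no specific clusters.
CLUSTERS = ["original", "new"]


def cluster_for(user_id: str) -> str:
    """Map a user ID to a cluster using a stable hash (an assumed policy)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # The first byte of the digest is effectively uniform, so taking it
    # modulo the number of clusters splits accounts about evenly.
    return CLUSTERS[digest[0] % len(CLUSTERS)]


def plan_migration(user_ids):
    """Return the accounts that would move from the original cluster to the new one."""
    return [u for u in user_ids if cluster_for(u) == "new"]


if __name__ == "__main__":
    users = [f"user{i}" for i in range(1000)]
    moving = plan_migration(users)
    print(f"{len(moving)} of {len(users)} accounts would be migrated")
```

Because the mapping depends only on the user ID, a request router could apply the same function after the migration to direct each request to the cluster that holds the corresponding account, so that the two clusters service requests in parallel.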
The task of migrating data within a database is commonly encountered, and many tools exist that facilitate this process. However, existing data-migration tools typically cause significant downtime for the database system, which is unacceptable for many web-based services that need to be highly available, such as services associated with financial transactions, email services, or search services.
Table 1 presents exemplary asynchronous code that copies data items from a source collection to a destination collection in accordance with the disclosed embodiments.
Table 2 presents exemplary asynchronous code that uses gevent to copy data items from a source collection to a destination collection in accordance with the disclosed embodiments.