1. Technical Field
The disclosed embodiments relate generally to distributing storage structures and, more specifically, to distributing policy-based storage structures across multiple servers.
2. Background
A database consists of a number of rows of data with each row divided into columns or fields. One field, typically, is designated as the key field. In a database structure, each key should be unique. Keys may be generated automatically by the database as new data is inserted by a user, but the user may also generate keys and include them with the inserted data. If a user tries to add data specific to a key already in use, an error results. For example, the new data may overwrite the data already associated with that key.
With traditional approaches, distribution of data in a database over a set of servers is done manually or by hand. For example, in a banking context, a database may include bank data, pending transactions, and other banking information. Customers of a bank have a bank account with a bank account number. At any time, a number of operations may be performed on each account. Each piece of data, each transaction to be performed, or each pending operation is associated with an account number (a key). If these operations are stored on a normal database structure, using the account number as the key, only one piece of information may be associated with that account number at any one time. Because only one entry per key is allowed in a database, when multiple transactions are pending for the same account number, a problem arises and errors may result.
If more than one operation is being performed on an account, then a queue may be kept for that account number. With a large number of accounts, a single server may not be able to handle all of the data and queues for each account. As a result, accounts and account queues should be distributed over a set of servers. Such a task is difficult and cumbersome to manage. Adding or taking away servers only adds to the complexity of managing, as the queues and data should be rebalanced over the set of servers. This can amount to hundreds of thousands of queues that need to be manually balanced or homed over a set of messaging servers.
In any context (banking or otherwise) including a large database with potential multiple entries, transactions, or operations per key, queues may be created for every key having multiple entries. Such a system is difficult to manage as it is difficult to manually schedule such a large number of queues. The scheduling is not automated and is inefficient. In addition, when servers enter or exit the server cluster, even more complexity is introduced, as a manager must then recalculate and redistribute the data and pending transactions over a new set of servers.