A database shard is a horizontal partition of a database. Each such individual partition is referred to as a shard or database shard. Horizontal partitioning is a database design principle whereby different subsets of rows of a database are held in separate horizontal partitions, each such horizontal partition thus forming a shard. When a database is horizontally partitioned into a plurality of shards, this offers potential advantages in terms of scalability, as the shards can be located on different shard stores, for example database servers, thus allowing the database to grow beyond the performance and storage capacity limits of a single database server.
Various methods of sharding a database may be used to meet such scaling needs, and partitioned database architectures have emerged that automate sharding and load balancing across different shard stores to make sharding easier. These architectures typically use key-based hash partitioning or range partitioning to assign data to shard stores of the distributed computing system storing the database. Key-based hash partitioning is for example described in US2014/0108421, in which a hash in the form of a modulus or a more sophisticated hash of the key is calculated and each of the shard stores is assigned a specific range of these calculated hashes, the distribution of which is expected to be balanced. A first problem with such an approach is that in large scale database systems computation of these hashes as a function of the keys requires considerable computing power and time and thus causes an increased latency when handling requests for data of these keys. Additionally, even when using complex hashing mechanisms, it is difficult to guarantee a balanced distribution among the different data stores, especially for a large scale distributed database of which the keys and their associated data cannot be reliably assessed beforehand.
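By way of illustration only, key-based hash partitioning of the kind discussed above could be sketched as follows. The number of shard stores, the key format, the choice of SHA-256 and the function name shard_store_for_key are assumptions made for this sketch and are not taken from the cited publication.

```python
import hashlib
from collections import Counter

NUM_SHARD_STORES = 4  # hypothetical number of shard stores


def shard_store_for_key(key: str, num_stores: int = NUM_SHARD_STORES) -> int:
    """Return the index of the shard store for a key (hash, then modulus)."""
    # The hash has to be computed for every key that is requested.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_stores


# Hypothetical keys; a real key population is usually not known beforehand.
keys = [f"customer-{i:04d}" for i in range(1000)]

# Count how many keys land on each shard store.
load = Counter(shard_store_for_key(k) for k in keys)

# Ten neighbouring keys in sorted order, e.g. one page of an ordered listing.
page = sorted(keys)[:10]
stores_for_page = {shard_store_for_key(k) for k in page}

print(dict(load))
print(f"{len(page)} neighbouring keys map to {len(stores_for_page)} shard store(s)")
```

In such a sketch the per-store counts depend entirely on the hash and on the actual keys, and keys that are adjacent in sorted order are typically scattered over several shard stores.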
Still a further problem, as mentioned for example in US2014/0108421, is the decreased performance of such a distributed database system when a high percentage of its operations are distributed operations. During such distributed operations a plurality of shard stores must be accessed in order to perform a request requiring data of a plurality of keys, such as for example a list of order records of a single customer. In order to decrease the share of distributed operations and increase the share of single shard read and write operations, US2014/0108421 proposes the use of a shard control record that correlates monotonic key ranges to a plurality of shard stores on which the records or rows are distributed by means of a subsidiary hashing method. Although this results in an increased share of single shard read and write operations when handling data correlated to a particular customer ID as shown in FIG. 3, and although the shard control record provides for a shard list associated with a key range instead of needing to store this meta-data on the individual key level, the subsidiary hash still needs to be calculated for every key of the list for which data needs to be retrieved, in order to determine which shard store of the shard list is to be accessed. Additionally, the use of a monotonic key, for example the customer ID, results in poor performance in standard application level situations in which for example ordered lists of the customers need to be produced for retrieval and/or selection by the user.
It is clear that in such a standard case, such as for example where a user is presented with a user interface for paging through an alphabetically sorted list of customers, this will result in a high number of access requests to the shard control record, as the monotonic customer IDs of neighbouring customers in the alphabetically sorted list are not necessarily in the same customer ID range, and even if they were in the same range, there is no guarantee that they will be stored on the same shard store in the shard list. Additionally, the approach of US2014/0108421 requires an always up-to-date shard control record at a central location which is accessible to all shard stores of the system, which creates a single point of failure and puts limits on the scalability and responsiveness of such a system, especially in a large scale distributed database system in which a large number of shard stores are involved.
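The lookup path discussed above for a shard control record can be illustrated with the following minimal sketch. The record layout, the store names, the customer names and IDs, and the use of CRC-32 as the subsidiary hash are all hypothetical and serve only to show the two-step lookup; they are not taken from US2014/0108421.

```python
import bisect
import zlib

# Hypothetical shard control record: (first ID, last ID, shard list),
# sorted by first ID. Each monotonic key range maps to a list of shard stores.
SHARD_CONTROL_RECORD = [
    (1, 1000, ["store-a", "store-b"]),
    (1001, 2000, ["store-c", "store-d"]),
]
_RANGE_STARTS = [first for first, _, _ in SHARD_CONTROL_RECORD]


def locate(customer_id: int) -> str:
    """Find the shard store for a customer ID via range lookup plus hash."""
    idx = bisect.bisect_right(_RANGE_STARTS, customer_id) - 1
    if idx < 0 or customer_id > SHARD_CONTROL_RECORD[idx][1]:
        raise KeyError(customer_id)
    shard_list = SHARD_CONTROL_RECORD[idx][2]
    # Subsidiary hash (CRC-32 as a stand-in), computed for every single key.
    sub_hash = zlib.crc32(str(customer_id).encode("utf-8"))
    return shard_list[sub_hash % len(shard_list)]


# One page of an alphabetically sorted customer list (hypothetical data).
# The monotonic IDs of alphabetical neighbours are unrelated to each other.
page = {"Adams": 1520, "Baker": 17, "Clark": 998, "Davis": 1003}
stores_for_page = {locate(customer_id) for customer_id in page.values()}

print(sorted(stores_for_page))
```

Because the IDs on this page fall into both key ranges, producing the page requires consulting the record for every entry and accessing shard stores from both shard lists.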
A further method for sharding a database is known from WO2013/147785, in which the index for a replicated object storage system is sharded by means of the same hash-based sharding methodology as is used for distributing the objects amongst the storage nodes, and subsequently these index shards are distributed amongst all storage nodes in the system. Here too it is required to keep all index shards stored on the different storage nodes in sync, which results in an increased latency and puts limits on the scalability of such a system. Additionally, the index creates hash-based shards, which result in a high rate of distributed operations in which a plurality of different shard stores need to be accessed when performing a standard operation such as for example listing an alphabetically sorted list of data objects stored in a selected container, such as for example a folder, group, label, etc. This effect is further aggravated as it manifests itself at the level of both requests made to the sharded index and requests related to the data objects themselves.
Still a further method of sharding a database is known from US2012/0271795 in which a coordination service manages the distribution of requests relating to keys of a total key range to a plurality of nodes each being responsible for a local key subrange which is a part of the total key range. The local key subrange of each of the nodes is selected according to the number of nodes and the number of rows or keys in the database table. Such a system requires all local key subranges on the nodes to be in sync with each other and with the coordination service, which puts limits on the scalability. Additionally if no knowledge is available about the key distribution in the database for the total key range there is a high risk that the chosen local key subranges will result in an unbalanced distribution of data amongst the nodes.
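The risk of an unbalanced distribution mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the total key range is divided into equally wide local key subranges based on the number of nodes; the function names and the key values are hypothetical and do not reproduce the method of US2012/0271795.

```python
from collections import Counter


def equal_subranges(total_min: int, total_max: int, num_nodes: int):
    """Split [total_min, total_max) into num_nodes equally wide subranges."""
    width = (total_max - total_min) // num_nodes
    bounds = []
    for node in range(num_nodes):
        lo = total_min + node * width
        # The last node takes whatever remains of the total key range.
        hi = total_max if node == num_nodes - 1 else lo + width
        bounds.append((lo, hi))
    return bounds


def node_for_key(key: int, bounds) -> int:
    """Return the index of the node whose local key subrange holds the key."""
    for node, (lo, hi) in enumerate(bounds):
        if lo <= key < hi:
            return node
    raise KeyError(key)


bounds = equal_subranges(0, 1000, 4)

# Hypothetical skewed key population: most keys cluster at the low end.
skewed_keys = list(range(0, 200)) + [300, 600, 900]
load = Counter(node_for_key(k, bounds) for k in skewed_keys)

print(bounds)      # [(0, 250), (250, 500), (500, 750), (750, 1000)]
print(dict(load))  # {0: 200, 1: 1, 2: 1, 3: 1}
```

With this key population the first node holds 200 of the 203 keys, even though each node is responsible for a subrange of the same width.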
Still a further method of sharding a database is known from US2012/0254175, in which the database comprises data identifiable by keys comprised within a global ordered range. A plurality of shards, also referred to as chunks, is provided, each shard configured to handle requests for data of at least one key within a local subrange, this local subrange comprising an ordered subrange of said global ordered range, which is for example defined by means of a range between a minimum value of the key and a maximum value of the key. A router process which routes requests to the correct shards accesses information from a configuration server that stores information about each shard, such as for example the minimum and maximum key value, and the shard store on which this shard is stored. It is clear that at all times this information of the configuration server must be in sync with the actual situation on each of the shard stores, which leads to an increased latency and puts limits on the scalability. In order to improve flexibility in rebalancing the system, the maximum size of each of the shards is limited to for example 200 MB, and when a shard reaches this maximum size it is split into two new shards, each comprising a share of the local subrange of the split shard. In this way a large number of small shards are available on each of the shard stores of the system and rebalancing can be performed by simply moving these small shards from their shard store to another less loaded shard store. However, this requires the configuration server to be constantly in sync with these frequent updates resulting from the high number of shards, frequently created new shards and frequent relocation of shards amongst the shard stores, which puts limits on the scalability of the system and increases the latency, as the router must be updated by the configuration server with the latest configuration information before a request can be executed.
Additionally, the smaller the size of the shards, the higher the chance that standard requests resulting in data relating to ordered subsets of keys, such as for example an alphabetically ordered list of data objects in a container, files in a folder, customers in a table, etc., will result in the need to access a plurality of shards distributed on a plurality of shard stores, thereby reducing the share of single shard operations and resulting in a corresponding performance reduction.
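The effect of a small maximum shard size on ordered requests can be illustrated with the following minimal sketch. The class ConfigServer, the limit of four keys per shard (standing in for a size limit such as 200 MB) and the single-letter keys are assumptions made for this sketch only and do not reproduce the system of US2012/0254175.

```python
import bisect

MAX_KEYS_PER_SHARD = 4  # hypothetical stand-in for a size limit such as 200 MB


class ConfigServer:
    """Toy model of the shard metadata: one lower bound per shard."""

    def __init__(self):
        self.min_keys = [""]  # sorted lower bounds; "" sorts before any key
        self.shards = [[]]    # sorted keys held by each shard

    def shard_index(self, key: str) -> int:
        return bisect.bisect_right(self.min_keys, key) - 1

    def insert(self, key: str) -> None:
        i = self.shard_index(key)
        bisect.insort(self.shards[i], key)
        if len(self.shards[i]) > MAX_KEYS_PER_SHARD:
            self._split(i)

    def _split(self, i: int) -> None:
        # Split the shard in two; every split is a metadata update that
        # all routers have to learn about before routing correctly.
        keys = self.shards[i]
        mid = len(keys) // 2
        self.shards[i] = keys[:mid]
        self.shards.insert(i + 1, keys[mid:])
        self.min_keys.insert(i + 1, keys[mid])

    def shards_for_range(self, first: str, last: str):
        """Indices of all shards needed to list the keys first..last."""
        return list(range(self.shard_index(first), self.shard_index(last) + 1))


config = ConfigServer()
for key in "abcdefghij":
    config.insert(key)

print(config.shards)
print(config.shards_for_range("b", "f"))
```

After these ten insertions the toy model holds four shards, and an ordered listing of only the five keys from "b" to "f" already touches three of them.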
A dynamically scalable redundant distributed storage system is further disclosed in WO2012/068184. It discloses a storage system using replication, for example RAID, or using an error correcting code or ECC, such as for example erasure codes, to achieve a certain level of redundancy. The storage system comprises a file manager controller and a storage manager controller, correlating file identifiers and data blocks to the storage devices storing the file and data blocks. This file manager controller can be implemented using distributed hash tables, which are for example implemented as a hash table list comprising entries correlating a range of unique file identifier values for which the file manager is responsible, as for example shown in FIG. 2C of this publication. As shown, each file manager must be aware of its own local subrange of key values, which is a share of a circular total key range. Additionally, it must also be aware of at least information about the file manager managing a local subrange preceding its own local subrange and the file manager managing a subrange succeeding its own local subrange. It is acknowledged that, due to the distributed nature of the distributed hash table, this hash table list available to a node may not be completely accurate when used, since constructing the list takes time, during which a node failure or distributed hash table rebalancing might occur. The system relies on the assumption that even if the information is outdated, this outdated information will in any case lead to a node with a range that is closer to the desired node, thereby eventually leading to access to the desired node via one or more intermediate hops. However, in large scale distributed database systems this assumption is not always true and could lead to irretrievable data or unacceptable latency when for example the information of neighbouring nodes and/or their corresponding local subranges would be outdated.
It is not hard to imagine a situation in which the request will hop back and forth between two outdated nodes which still have each other identified as neighbouring nodes. Additionally, the hash-based sharding requires a suitable hash to be generated for each key, for example a file identifier, which, as explained above, will result in a reduced share of single shard operations when performing standard requests such as for example creating an alphabetically ordered list of data objects in a container, files in a folder, customers in a table, etc. This performance degradation is even worse in the system of WO2012/068184, as in order to allow for a certain level of rebalancing flexibility the system makes use of two distributed hash table systems, one for the file managers responsible for management of the file meta-data and one for the storage managers responsible for management of the storage devices.
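The back-and-forth hopping between two outdated nodes described above can be illustrated with a minimal sketch. The Node class, the route function, the hop limit and the key values are hypothetical, and for simplicity the sketch uses a linear rather than a circular key range; it is not an implementation of the system of WO2012/068184.

```python
class Node:
    """A node that knows its own subrange and its two neighbours."""

    def __init__(self, name: str, lo: int, hi: int):
        self.name = name
        self.lo, self.hi = lo, hi  # the node's own view of its subrange [lo, hi)
        self.predecessor = None
        self.successor = None

    def owns(self, key: int) -> bool:
        return self.lo <= key < self.hi


def route(start: Node, key: int, max_hops: int = 8):
    """Forward a request from neighbour to neighbour until a node owns the key."""
    node, path = start, [start.name]
    for _ in range(max_hops):
        if node.owns(key):
            return node, path
        nxt = node.successor if key >= node.hi else node.predecessor
        if nxt is None:
            return None, path
        node = nxt
        path.append(node.name)
    return None, path  # hop limit reached without finding an owner


# Hypothetical state after a rebalancing: B has handed keys 100..159 to a
# new node, but A and B still identify each other as direct neighbours.
a = Node("A", 0, 100)
b = Node("B", 160, 200)
a.successor = b    # outdated: should point to the new node
b.predecessor = a  # outdated: should point to the new node

owner, path = route(a, 150)
print(owner, path)
```

In this sketch the request for key 150 alternates between A and B until the hop limit is reached, whereas a request for a key that A still owns, started at B, resolves after a single hop.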
Thus, there remains a need for an improved computer implemented method for dynamic sharding of a database that overcomes the disadvantages mentioned above and ensures scalability in a robust and simple way, guaranteeing increased performance when handling standard requests resulting in data relating to ordered subsets of keys.