Scalability is an important requirement in all data storage systems. Different types of storage systems provide diverse methods of seamless scalability through capacity expansion. In some storage systems, such as systems utilizing redundant array of inexpensive disks (“RAID”) controllers, it is often possible to add disk drives (or other types of mass storage devices) to a storage system while the system is in operation. In such a system, the RAID controller re-stripes existing data onto the new disk and makes the capacity of the other disks available for new input/output (“I/O”) operations. This methodology, known as “vertical capacity expansion,” is common. However, this methodology has at least one drawback: it scales only data storage capacity, without improving other performance factors such as the processing power, main memory, or bandwidth of the system.
In other data storage systems, it is possible to add capacity by “virtualization.” In this type of system, multiple storage servers are utilized to field I/O operations independently, but are exposed to the initiator of the I/O operation as a single device, called a “storage cluster.” Each storage server in a cluster is called a “storage node” or just a “node.” When data storage capacity becomes low, a new server may be added as a new node in the data storage system. In addition to contributing increased storage capacity, the new storage node contributes other computing resources to the system, leading to true scalability. This methodology is known as “horizontal capacity expansion.” Some storage systems support vertical expansion of individual nodes, as well as horizontal expansion by the addition of storage nodes.
Systems implementing horizontal capacity expansion may choose to concatenate the capacity that is contributed by each node. However, in order to achieve the maximum benefit of horizontal capacity expansion, it is necessary to stripe data across the nodes in much the same way as data is striped across disks in RAID arrays. While striping data across nodes, the data should be stored in a manner that ensures that different I/O operations are fielded by different nodes, thereby utilizing all of the nodes simultaneously. It is also desirable to avoid splitting a single I/O operation across multiple nodes, so that the I/O latency is low. Striping the data in this manner provides a boost to random I/O performance without decreasing sequential I/O performance. The stripe size is calculated with these considerations in mind, and is called the “zone size.”
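The striping considerations described above may be illustrated by the following minimal sketch. It assumes plain round-robin placement of zones across nodes, which is only one possible layout; the names `ZONE_SIZE`, `ZoneLocation`, `locate`, and `nodes_touched`, as well as the 1 MiB zone size, are hypothetical and chosen solely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical zone size in bytes (1 MiB), chosen only for illustration.
ZONE_SIZE = 1 << 20


@dataclass(frozen=True)
class ZoneLocation:
    node: int        # index of the storage node that holds the zone
    local_zone: int  # position of the zone among the zones on that node
    offset: int      # byte offset of the address within the zone


def locate(byte_address: int, node_count: int,
           zone_size: int = ZONE_SIZE) -> ZoneLocation:
    """Map a logical byte address to a node, assuming round-robin striping."""
    zone = byte_address // zone_size
    return ZoneLocation(node=zone % node_count,
                        local_zone=zone // node_count,
                        offset=byte_address % zone_size)


def nodes_touched(byte_address: int, length: int, node_count: int,
                  zone_size: int = ZONE_SIZE) -> set[int]:
    """Return the set of nodes that one I/O of `length` bytes would involve."""
    first_zone = byte_address // zone_size
    last_zone = (byte_address + length - 1) // zone_size
    return {zone % node_count for zone in range(first_zone, last_zone + 1)}
```

In this sketch, an I/O operation that fits within one zone involves a single node, whereas an operation that straddles a zone boundary involves two nodes. A zone size that is large relative to the typical I/O size therefore keeps most operations on one node, while consecutive zones still fall on different nodes.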
When data is striped across multiple nodes, the process of re-striping data when a new node is added is lengthy and inefficient in most contemporary storage systems. In particular, current storage systems require the movement of a massive amount of data in order to add a new node. As an example, in order to expand a four-node cluster to a five-node cluster using current data migration methodologies, only one in twenty storage zones (referred to herein as “zones”) remains on the same node, and even those zones are in a different physical position on the node. Hence, the current process of migration is effectively a process of reading the entire body of data in the system according to its unexpanded configuration, and then writing it in its entirety according to the expanded configuration of the cluster.
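The scale of the data movement can be examined with a short sketch. The exact proportion of zones that must move depends on the striping layout in use; the sketch below assumes plain round-robin placement, which is not necessarily the layout underlying the figure quoted above, and the function names `placement` and `migration_summary` are hypothetical.

```python
from __future__ import annotations


def placement(zone: int, node_count: int) -> tuple[int, int]:
    """Return (node, position on node) for a zone, assuming round-robin striping."""
    return zone % node_count, zone // node_count


def migration_summary(total_zones: int, old_nodes: int,
                      new_nodes: int) -> dict[str, int]:
    """Count zones that keep their node, and zones that keep node and position."""
    same_node = 0
    unmoved = 0
    for zone in range(total_zones):
        old_node, old_pos = placement(zone, old_nodes)
        new_node, new_pos = placement(zone, new_nodes)
        if old_node == new_node:
            same_node += 1
            # A zone is truly unmoved only if its position on the node is kept too.
            if old_pos == new_pos:
                unmoved += 1
    return {"total": total_zones, "same_node": same_node, "unmoved": unmoved}
```

Calling `migration_summary(1000, 4, 5)` under this assumed layout shows that most zones change node and that almost none keep both their node and their physical position, which is consistent with the observation that such a migration amounts to rewriting nearly the entire body of data. The specific counts are a property of the assumed layout and need not match those of any particular system.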
Such a migration process typically takes several days. During this time, the performance of the cluster is drastically decreased due to the presence of the extra migration I/O operations. A complicated method of locking is also required to prevent data corruption during the data migration process. Moreover, the storage capacity and processing resources of the newly added node do not contribute to the cluster until the entire migration process has completed. If an administrator is expanding the cluster in order to mitigate an impending capacity crunch, there is a substantial likelihood that the existing capacity will be depleted before the migration completes. In all cases, the migration process is cumbersome, disruptive, and tedious.
It is with respect to these considerations and others that the following disclosure is presented.