Database systems may store a set of tabular data having rows and columns in a variety of ways. A database system may store data in volatile and non-volatile memory, in a file located in conventional file storage local to the database system, in a file located in conventional file storage attached to one or more storage systems located on a network, or the like. A database system may typically add data to or remove data from a set of data, and therefore the set of data may grow or shrink over time.
However, as a set of data grows, the set of data may grow too large and may exceed the storage capabilities of the location where it is stored. For example, a data set may be stored as a file on a hard disk drive. If the data set grows larger than the capacity of the hard disk drive, the data set may either be moved as a whole to a hard disk drive with larger capacity or may be divided into one or more pieces and each piece may be moved to one of multiple physical storage locations.
Once a data set has been divided, the database system may implement fixed functionality to locate and retrieve the data stored in each of the multiple physical storage locations. For example, a database system may include two physical storage locations. The database system may choose to store data associated with the first row of the data set in the first physical storage location. The database system may further choose to store data associated with the second row of the data set in the second physical storage location. The database system may then choose to store data associated with the third row of the data set in the first physical storage location. Such a pattern may be repeated for each row of the data set. The database system may then apply the same fixed functionality in reverse to locate and retrieve the data. For example, the database system may retrieve the data associated with the second row of the data set by accessing the data stored in the second physical storage location.
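The alternating pattern described above may be sketched as follows. This is a minimal illustration only, assuming 1-based row numbers and using in-memory dictionaries as stand-ins for the physical storage locations; the names `NUM_LOCATIONS`, `location_for_row`, `store_row`, and `retrieve_row` are hypothetical and are not drawn from any particular database system.

```python
# Number of physical storage locations, fixed in advance for this sketch.
NUM_LOCATIONS = 2

# Hypothetical in-memory stand-ins for the two physical storage locations.
locations = {1: {}, 2: {}}


def location_for_row(row_number: int) -> int:
    """Map a 1-based row number to a 1-based storage location, alternating."""
    return (row_number - 1) % NUM_LOCATIONS + 1


def store_row(row_number: int, data: str) -> None:
    """Store the row's data in the location chosen by the fixed rule."""
    locations[location_for_row(row_number)][row_number] = data


def retrieve_row(row_number: int) -> str:
    """Apply the same fixed rule to find the location, then read the row."""
    return locations[location_for_row(row_number)][row_number]


# Rows 1 and 3 land in the first location, rows 2 and 4 in the second.
for number, data in enumerate(["a", "b", "c", "d"], start=1):
    store_row(number, data)
```

Because storing and retrieving share one rule, a lookup never has to search both locations; the cost is that the rule is tied to the number of locations chosen up front.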
If either the first or second physical storage location should also become filled to capacity, an additional physical storage location may be added to the database system. However, because the database system may utilize the fixed functionality to divide as well as retrieve data, all of the data stored on the first and second physical storage location may be required to be reshuffled to accommodate the addition of the third physical storage location.
For example, the database system may now choose to store data associated with the first row in the first physical storage location. The database system may further choose to store data associated with the second row in the second physical storage location. The database system may then choose to store data associated with the third row in the third physical storage location. Note that the data associated with the third row was previously stored in the first physical storage location. However, to adhere to fixed functionality that may be employed by the database system, the data associated with the third row may be required to be reshuffled to the third physical storage location.
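The reshuffling in this example may be made concrete by listing which rows would have to move when a third location is added. The sketch below assumes the same alternating rule is simply extended from two locations to three, with 1-based row numbers; `location_for_row` and `migration_plan` are hypothetical helper names introduced for illustration.

```python
def location_for_row(row_number: int, num_locations: int) -> int:
    """Map a 1-based row number to a 1-based location by cycling through them."""
    return (row_number - 1) % num_locations + 1


def migration_plan(num_rows: int, old_count: int, new_count: int):
    """List (row, old_location, new_location) for every row that must move."""
    plan = []
    for row in range(1, num_rows + 1):
        old = location_for_row(row, old_count)
        new = location_for_row(row, new_count)
        if old != new:
            plan.append((row, old, new))
    return plan


# Six rows, growing from two locations to three.
plan = migration_plan(num_rows=6, old_count=2, new_count=3)
```

Under these assumptions, only rows 1 and 2 stay in place; row 3 moves from the first location to the third, and rows 4 through 6 also change locations, including moves between the two original locations.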
Alternatively, if the database system does not employ fixed functionality to divide and store the data set, the division and lookup functions may be required to change each time a physical storage location is added to or removed from the database system. For example, the database system may include a non-fixed function that selects a physical storage location for a row in the data set by performing a mathematical function based on the row number and the number of physical storage locations. Because both the location of each row of data and the mathematical function determining that location are based on the number of physical storage locations, the mathematical function must be recalculated and the rows reshuffled each time a physical storage location is added to or removed from the database system.
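One possible form of such a non-fixed function is a modulo over the row number, sketched below. The choice of modulo is an assumption made for illustration, since the text does not name a specific mathematical function; `place` and `rows_relocated` are hypothetical names, and row numbers are taken to be 1-based.

```python
def place(row_number: int, num_locations: int) -> int:
    """Select a 1-based location from the row number and the location count."""
    return (row_number - 1) % num_locations + 1


def rows_relocated(num_rows: int, before: int, after: int) -> int:
    """Count rows whose location changes when the location count changes."""
    return sum(
        1
        for row in range(1, num_rows + 1)
        if place(row, before) != place(row, after)
    )


# Twelve rows under three different changes to the number of locations.
added_third = rows_relocated(12, before=2, after=3)    # 8 of 12 rows move
added_fourth = rows_relocated(12, before=3, after=4)   # 9 of 12 rows move
removed_third = rows_relocated(12, before=3, after=2)  # 8 of 12 rows move
```

In this sketch, most rows change location whether a storage location is added or removed, which is the cost the paragraph above describes.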
A system that implements a method that allows data to be easily moved from one physical storage location to another without requiring a reshuffling of all data on all physical storage locations each time a physical storage location is either added or removed may be useful.