Database systems typically maintain their data in a sorted index in order to efficiently find and retrieve information. When inserting new data into an existing database, the database places the new data in the appropriate location to maintain the sorted index. The insertion operation needs to read existing data to determine where to insert the new data. The time it takes to perform this sorted insert operation depends on the size of the index. As the index grows, the database must read and move more data in order to maintain the sorted index. As a result, an insert operation takes longer as the database grows larger.
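The growth in insert cost described above can be illustrated with a minimal sketch, assuming the sorted index is held as a simple in-memory array. The function name `sorted_insert` and the two counters it returns are hypothetical and are used only for illustration; a real database index would typically use a paged tree structure rather than a flat list.

```python
def sorted_insert(index, key):
    """Insert key into the sorted list `index`.

    Returns (entries_read, entries_shifted): how many existing entries were
    examined to find the position, and how many had to move to make room.
    """
    lo, hi = 0, len(index)
    entries_read = 0
    # Binary search for the first position whose entry is >= key.
    while lo < hi:
        mid = (lo + hi) // 2
        entries_read += 1
        if index[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    # Every entry at or after the insertion point moves one slot to the right.
    entries_shifted = len(index) - lo
    index.insert(lo, key)
    return entries_read, entries_shifted


small = [10, 20, 30, 40]
large = list(range(0, 2000, 2))  # 1000 sorted entries
print(sorted_insert(small, 25))
print(sorted_insert(large, 1))
```

In this sketch, the number of entries moved scales with the size of the index when the new key lands near the front, which is the effect the paragraph above attributes to a growing index.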
Another challenge in maintaining large indices is balancing dense packing of index data against leaving empty space for faster incremental inserts. Denser storage requires more splitting during incremental inserts, while sparser storage spreads the same data over more storage and therefore requires more disk reads during querying.
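The packing trade-off above can be sketched as follows, assuming the index is stored in fixed-capacity pages. The helper names `build_pages` and `insert`, and the `fill_factor` parameter, are hypothetical and chosen only for illustration.

```python
import bisect


def build_pages(sorted_keys, capacity, fill_factor):
    """Pack sorted keys into pages, filling each to capacity * fill_factor."""
    per_page = max(1, int(capacity * fill_factor))
    return [sorted_keys[i:i + per_page]
            for i in range(0, len(sorted_keys), per_page)]


def insert(pages, key, capacity):
    """Insert key into the page covering it; return True if the page split.

    Assumes `pages` is non-empty and every page holds at least one key.
    """
    last_keys = [page[-1] for page in pages]
    # First page whose largest key is >= key, or the final page otherwise.
    i = min(bisect.bisect_left(last_keys, key), len(pages) - 1)
    bisect.insort(pages[i], key)
    if len(pages[i]) <= capacity:
        return False
    # Page overflowed: split it into two half-full pages.
    half = len(pages[i]) // 2
    pages[i:i + 1] = [pages[i][:half], pages[i][half:]]
    return True


keys = list(range(0, 200, 2))            # 100 sorted keys
dense = build_pages(keys, 10, 1.0)       # fully packed pages
sparse = build_pages(keys, 10, 0.5)      # half-full pages
print(len(dense), len(sparse))           # pages a full scan must read
print(insert(dense, 1, 10), insert(sparse, 1, 10))
```

Under these assumptions, the densely packed layout uses fewer pages, so a scan reads less, but a single insert into a full page forces a split. The sparse layout absorbs the insert without splitting but occupies twice as many pages.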
A segmented index approach, which maintains separate indices over different data ranges, can be used to overcome these challenges. However, this approach requires accurate forecasting of the data model: if the data ranges are not partitioned accurately, segment sizes can differ substantially and individual segments may become bottlenecks.
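The sensitivity of a segmented index to its range partitioning can be sketched as follows. The class name `SegmentedIndex` and the boundary values are hypothetical; the sketch assumes fixed, pre-chosen range boundaries and in-memory sorted lists as segments.

```python
import bisect


class SegmentedIndex:
    """Sorted index split into segments by fixed key-range boundaries."""

    def __init__(self, boundaries):
        # `boundaries` are sorted split points: keys below boundaries[0] go
        # to segment 0, keys in [boundaries[0], boundaries[1]) to segment 1,
        # and so on; keys at or above the last boundary go to the last segment.
        self.boundaries = list(boundaries)
        self.segments = [[] for _ in range(len(self.boundaries) + 1)]

    def insert(self, key):
        segment = self.segments[bisect.bisect_right(self.boundaries, key)]
        bisect.insort(segment, key)  # each segment stays sorted on its own

    def sizes(self):
        return [len(segment) for segment in self.segments]


index = SegmentedIndex([100, 200])
for key in range(50):   # the actual keys cluster below the first boundary
    index.insert(key)
index.insert(150)
print(index.sizes())
```

Here the boundaries were chosen for keys spread across three ranges, but the inserted keys cluster in the first range, so nearly all of the insert work falls on one segment.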
In the drawings, like reference numbers generally indicate identical or similar elements. In addition, the left-most digit(s) of a reference number generally identify the drawing in which the reference number first appears.