1. Field of the Invention
This invention relates to computing systems and, more particularly, to index data structures configured to store records of data relationships within computing systems.
2. Description of the Related Art
Many different types of computing applications require the relationships among different data items to be maintained in a persistent way for reference during data processing. For example, a computer file system may be configured to store and manage information indicating the relationships between individual files (e.g., as identified according to some type of file name) and the specific storage location(s) within virtual or physical storage devices (e.g., logical volumes, hard disk drives, etc.) at which corresponding file data is stored. To access the data of a particular file according to its name, the file system may consult data structures that map the name to various storage locations, and then access the indicated storage locations to retrieve the file data.
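By way of illustration only, the name-to-location lookup described above may be sketched as follows. The sketch assumes a hypothetical in-memory index and hypothetical storage devices modeled as byte buffers; the names `devices`, `index`, `Extent`, and `read_file` are illustrative and do not correspond to any particular file system.

```python
from typing import Dict, List, Tuple

# Hypothetical extent: (device name, byte offset, length in bytes).
Extent = Tuple[str, int, int]

# Hypothetical storage devices, modeled here as in-memory byte buffers.
devices: Dict[str, bytes] = {
    "vol0": b"hello, world....",
    "vol1": b"....more data",
}

# The index records the relationship between a file name and the
# storage locations at which the corresponding file data is stored.
index: Dict[str, List[Extent]] = {
    "greeting.txt": [("vol0", 0, 7), ("vol0", 7, 5)],
    "notes.txt": [("vol1", 4, 9)],
}


def read_file(name: str) -> bytes:
    """Consult the index for the file's extents, then read each extent."""
    extents = index[name]  # raises KeyError if the name is not indexed
    return b"".join(
        devices[device][offset:offset + length]
        for device, offset, length in extents
    )
```

In this sketch, damage to `index` would leave the data in `devices` intact but unreachable by name, which is one reason the integrity of such a data structure may matter to the application.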
Depending on the type of application in which it is employed, a data structure that reflects relationships among data items may cause disruption of application operation or data loss if it becomes damaged or corrupted, for example due to hardware or software faults that arise during application operation. Accordingly, in some instances, such data structures may be replicated in a way that reduces the likelihood of unrecoverable data loss. For example, a copy of the data structure may be stored at a different geographic site from a primary instance of the data structure. The site may be selected and configured such that a catastrophe that affects the primary data structure instance is unlikely to affect the copy.
As the number of relationships to be tracked among data items increases, it may be difficult to correspondingly scale the resources of the data structure that stores the relationships in a way that minimally impacts data structure performance. For example, if the size of the data structure exceeds the available physical memory of a computer system seeking to manipulate it, read or write throughput to the data structure may become limited by the rate at which the computer system's virtual memory system can swap portions of the data structure to and from physical memory. Additionally, in cases where the data structure is replicated, it may be necessary to periodically synchronize the replicas, for example to ensure that a standby copy is relatively consistent with a working copy, or to preserve operational consistency when different replicas are actively being used. However, synchronization of data structures may be constrained by the available communication bandwidth between the systems implementing the data structures to be synchronized, as well as the computational overhead of performing the synchronization. For example, a brute-force synchronization approach in which two large data structure instances are compared in their entirety for differences may require a substantial amount of time and/or bandwidth to transmit the contents of one instance to the system implementing the second instance, as well as substantial processing resources to perform item-by-item reconciliation within the instances. Such constraints may effectively reduce the frequency with which synchronization can feasibly be performed or limit the size of the data structures for which synchronization may be supported.
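The cost of the brute-force approach described above may be illustrated with a minimal sketch. It assumes, purely for illustration, that each data structure instance is a Python dictionary and that "transmitting" an entry is modeled by counting it; the function name `brute_force_sync` is hypothetical.

```python
from typing import Dict, Tuple


def brute_force_sync(source: Dict[str, str],
                     target: Dict[str, str]) -> Tuple[int, int]:
    """Make target match source by examining every entry of both.

    Returns (entries_transmitted, entries_changed).
    """
    transmitted = 0
    changed = 0
    # Every source entry is sent and compared, even if already identical.
    for key, value in source.items():
        transmitted += 1
        if target.get(key) != value:
            target[key] = value
            changed += 1
    # Entries that no longer exist in the source are removed from the target.
    for key in list(target.keys()):
        if key not in source:
            del target[key]
            changed += 1
    return transmitted, changed


# Two instances of 1000 entries that differ in a single entry.
working = {f"k{i}": "v" for i in range(1000)}
standby = dict(working)
standby["k42"] = "stale"
stats = brute_force_sync(working, standby)
```

Under these assumptions, `stats` is `(1000, 1)`: all 1000 entries are transmitted and compared in order to repair a single difference, so the work grows with the size of the data structure rather than with the number of differences.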