In general, computing systems must deal with a large amount of data. This is particularly true for computing systems that provide computing services such as backup services, content management, contact management, and the like, for many different clients. The amount of data can be terabytes or larger in size.
The data managed by these computing systems may be accessed frequently depending on the service. Further, some of the data changes over time and may be de-duplicated. As a consequence of these changes, the data tends to become fragmented over time. When data in a file system becomes overly fragmented, the performance of the computing system begins to degrade.
Locality is a measure of how fragmented a file is in a file system. When a file is stored as a segment tree having segment levels (e.g., L(0)-L(6)), the performance of locality measurement is sensitive to segment locality. Poor locality in the L(0) level, which includes the data segments, results in multiple index lookups, which can degrade performance. As the locality of the system continues to degrade, repairing the locality takes longer. Further, locality measurement is not incremental in conventional systems. Systems and methods are needed to improve locality measurement and locality repair in a file system.
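By way of illustration only, the notion of locality as a measure of fragmentation can be sketched in code. The text above does not define a particular metric, so the following is an assumed, simplified one: the L(0) data segments of a file are taken to reside in fixed-capacity storage containers, and locality is taken to be the ratio of the minimum number of containers the file could occupy to the number of distinct containers it actually touches. The function name `locality_score`, the container model, and the `segments_per_container` parameter are hypothetical and are not drawn from any particular system.

```python
from math import ceil


def locality_score(segment_containers, segments_per_container):
    """Return an assumed locality score in (0, 1] for one file.

    segment_containers: container ID holding each L(0) segment, in file order
                        (hypothetical representation).
    segments_per_container: assumed fixed capacity of a container.
    A score of 1.0 means the segments are packed as tightly as possible;
    lower scores mean the file is spread over more containers than needed.
    """
    if segments_per_container <= 0:
        raise ValueError("segments_per_container must be positive")
    if not segment_containers:
        return 1.0  # an empty file is trivially unfragmented

    # Fewest containers that could hold this many segments.
    ideal = ceil(len(segment_containers) / segments_per_container)
    # Containers actually touched when reading the file.
    actual = len(set(segment_containers))
    # Clamp to 1.0 in case repeated references make actual < ideal.
    return min(1.0, ideal / actual)


contiguous = [0, 0, 0, 0, 1, 1, 1, 1]
fragmented = [0, 5, 9, 2, 7, 3, 8, 1]

print(locality_score(contiguous, 4))  # 1.0
print(locality_score(fragmented, 4))  # 0.25
```

Under this assumed metric, the fragmented file touches eight containers where two would suffice, so reading it would require roughly four times as many container lookups as the contiguous file.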