Storage systems persist files and folders in a hierarchy of folders (directories), each of which may contain files and additional child folders. In addition to persisting this constantly changing set of files and folders, storage systems can take consistent point-in-time copies, or backups, of a file system exactly as it exists at the time the backup is taken.
A collection of files held on a storage system may also be associated with a virtual machine. Such files themselves contain an additional, virtualized file system. In addition, many virtual machine hypervisors can take point-in-time backups of the file systems of the virtual machines they coordinate.
A problem in the prior art is that, although the underlying infrastructure makes the creation of point-in-time backups of file systems simple, no mechanism currently exists to catalog the contents of those file system backups and to search them easily and effectively for the files or directories they contain.
The inventors of the present application have recognized that this is a severe problem requiring a solution, due in part to the scalability problems of traditional solutions. The amount of data stored by organizations is expanding at an extraordinary rate.
Traditional solutions for cataloging a file system, such as storing the data in a relational database management system (RDBMS), have become impractical. This is especially true when lightweight, point-in-time backups are considered. While these backups may be lightweight both to create and in the storage they occupy, each backup introduces potentially millions of new records into the catalog. The deletion of a backup likewise requires the catalog to purge potentially millions of records.
This creation and deletion of point-in-time backups may occur simultaneously on many thousands of devices within an organization. There is no known invention in the prior art that specifically addresses the cataloging of constantly changing file system backups at the scale experienced in today's enterprise data centers.