File system administrators try to protect data against unrecoverable damage to the file systems and storage systems that host the data. To this end, system administrators may employ backup utilities, copy-on-write file system snapshot utilities, and other data protection tools. A snapshot utility may, in real time, record persistent copies of changes to customer file data. These snapshot copies provide a space-efficient way to capture consistent states of active files undergoing changes. The snapshot copies can serve both as an on-line, disk-based backup solution and as the source for an off-line backup operation that protects against file system failure. Current data protection solutions, however, continue to require significant investments in hands-on administration, time, and storage space. Conventional protection schemes may support managing customer data at different granularities. In one case, all files may be backed up periodically and/or in response to an administrator-initiated action. In another case, only changed files may be backed up. As a result, recovery scenarios may involve restoring all files or all changed files even though only some data in some file(s) may be required to effect recovery.
Some conventional systems may support individual file backup and recovery. However, these conventional systems may require exacting individual configuration, manipulation, and maintenance by a systems administrator having up-to-date backup and recovery plans at hand. The configuration and maintenance may include identifying the exact, fully qualified pathname of a file to be backed up and/or recovered. Such pathnames are typically not presented in a graphical user interface, but rather are acquired from text-based command-line interfaces. Additionally, information concerning files to be backed up is typically not stored with the file itself, but rather resides in a backup utility data structure. If only selected files or subsets of files from a customer data set are required to be restored during a recovery operation, conventional solutions may require significant management effort, time, and overhead, as well as unnecessary storage allocation for files that are swept up by over-inclusive backup policies. For example, if only 10% of the files in a given file system are required to be included in a backup set, then significant savings may result from the use of tag-based snapshots, because only 10% of the files would participate in the CPU, memory, and I/O processing associated with the copy-on-write semantics of snapshot management. Additional savings may result during subsequent restoration of files from the tag-based backup set, which contains only 10% of the total file system population.
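The 10% example above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each file record carries its own set of tags and that a snapshot set is formed by selecting the files that bear a given tag. The names FileEntry, select_snapshot_set, and participation_ratio, the tag "backup", and the paths are hypothetical and do not describe any particular snapshot utility.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical file record; the tags travel with the file itself."""
    path: str
    tags: frozenset = field(default_factory=frozenset)


def select_snapshot_set(files, tag):
    """Return only the files that carry the given tag."""
    return [f for f in files if tag in f.tags]


def participation_ratio(files, tag):
    """Fraction of files that would take part in copy-on-write tracking."""
    if not files:
        return 0.0
    return len(select_snapshot_set(files, tag)) / len(files)


# Illustrative data (an assumption): 100 files, every tenth one tagged "backup".
files = [
    FileEntry(
        f"/data/file{i:03d}",
        frozenset({"backup"}) if i % 10 == 0 else frozenset(),
    )
    for i in range(100)
]

snapshot_set = select_snapshot_set(files, "backup")
ratio = participation_ratio(files, "backup")
print(len(snapshot_set), ratio)
```

In this sketch only the ten tagged files enter the snapshot set, so only they would incur copy-on-write processing, and only they would need to be examined during a later restore.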