A backup system (available from, e.g., Dell EMC or NetApp Inc.) typically comprises one or more storage devices into which information may be entered, and from which information may be obtained, as desired. The backup system includes a storage operating system that functionally organizes the system by, inter alia, invoking storage operations in support of a storage service implemented by the system. The backup system may be implemented in accordance with a variety of storage architectures including, but not limited to, a network-attached storage (NAS) environment, a storage area network (SAN) and a disk assembly directly attached to a client or host computer. The storage devices are typically disk drives organized as a disk array, wherein the term “disk” commonly describes a self-contained rotating magnetic media storage device. The term disk in this context is synonymous with hard disk drive (HDD) or direct access storage device (DASD).
The backup system may be configured to operate according to a client/server model of information delivery to thereby allow many clients to access the directories, files and blocks stored on the system. In this model, the client may comprise an application, such as a database application, executing on a computer that “connects” to the backup system over a computer network, such as a point-to-point link, shared local area network, wide area network or virtual private network implemented over a public network, such as the Internet. Each client may request the services of the file system by issuing file system protocol messages (in the form of packets) to the backup system over the network. By supporting a plurality of file system protocols, such as the conventional Common Internet File System (CIFS) and the Network File System (NFS) protocols, the utility of the backup system is enhanced.
Each data container, such as a file, directory, etc., within a file system is typically associated with an inode that serves as the root of a buffer tree of the data container. The buffer tree is an internal representation of blocks for the data container stored in the memory of the backup system and maintained by the file system. The inode is a data structure used to store information, such as metadata, about the data container, whereas the data blocks are structures used to store the actual data for the container. The inode typically contains a set of pointers to other blocks within the file system. For data containers, such as files, that are sufficiently small, the inode may directly point to blocks storing the data of the file. However, for larger files, the inode points to one or more levels of indirect blocks, which, in turn, may point to additional levels of indirect blocks and/or the blocks containing the data.
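By way of illustration only, the relationship between an inode, indirect blocks and data blocks described above may be sketched as follows. This is a minimal sketch and not the layout of any particular file system; the class names `Inode`, `IndirectBlock` and `DataBlock`, their fields, and the helper `read_container` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class DataBlock:
    """Holds the actual data of the container (hypothetical layout)."""
    data: bytes


@dataclass
class IndirectBlock:
    """Holds pointers to further indirect blocks and/or data blocks."""
    children: List[Union["IndirectBlock", DataBlock]] = field(default_factory=list)


@dataclass
class Inode:
    """Root of the buffer tree; stores metadata plus a set of pointers."""
    number: int
    size: int  # container length in bytes (metadata)
    pointers: List[Union[IndirectBlock, DataBlock]] = field(default_factory=list)


def read_container(inode: Inode) -> bytes:
    """Walk the buffer tree depth-first, left to right, and return the data."""
    out = bytearray()
    # Reverse so that popping from the end visits pointers in their stored order.
    stack = list(reversed(inode.pointers))
    while stack:
        block = stack.pop()
        if isinstance(block, DataBlock):
            out += block.data
        else:
            stack.extend(reversed(block.children))
    return bytes(out[: inode.size])
```

In this sketch a sufficiently small file has an inode whose pointers reference data blocks directly, whereas a larger file places one or more levels of `IndirectBlock` between the inode and its data blocks; `read_container` handles both cases with the same traversal.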
When a backup is performed, the data stream from the client to the backup system may be in an inode-based format that is very efficient. However, the inode-based format does not lend itself to generating searchable metadata, because the metadata pertaining to a single object (file or directory) that is useful in a search is scattered across a number of records, which may be far apart in the data stream.
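The scattering described above may be illustrated with a short sketch. The record layout used here is an assumption made purely for illustration and does not reflect any actual backup stream format: each record is a dictionary with a hypothetical `kind` field (`"dirent"`, `"inode"` or `"data"`), and the function `collect_metadata` is likewise hypothetical. The point shown is that the name of an object and its attributes arrive in different records, so the entire stream must be examined before the metadata of any single object can be assembled.

```python
from collections import defaultdict


def collect_metadata(stream):
    """Gather per-object metadata from records keyed by inode number.

    Assumed (hypothetical) record kinds:
      dirent: carries the object's name and its parent directory's inode number
      inode:  carries attributes such as size and modification time
      data:   carries file contents only and is ignored here
    """
    objects = defaultdict(dict)
    for record in stream:
        kind = record["kind"]
        if kind == "dirent":
            entry = objects[record["child_ino"]]
            entry["name"] = record["name"]
            entry["parent_ino"] = record["parent_ino"]
        elif kind == "inode":
            entry = objects[record["ino"]]
            entry["size"] = record["size"]
            entry["mtime"] = record["mtime"]
    return dict(objects)
```

For example, a directory entry naming a file may appear near the start of the stream while the inode record holding that file's size appears only after many intervening data records; under the assumptions of this sketch, both must be joined on the inode number before the file can be indexed for search.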