1. Field of the Invention
The present invention relates generally to maintenance of file systems with persistent images for backup structures such as snapshots, and relates more particularly to a system and method for securing information in a persistent image by obscuring the information.
2. Description of Related Art
In enterprise computing environments and other contexts, computer workstations, database servers, web servers and other application servers (collectively hereinafter referred to as “clients”) frequently access data stored remotely from the clients, typically in one or more central locations. Computer networks typically connect the clients to mass storage devices (such as disks) that store the data. Such centralized storage (sometimes referred to as “network storage”) facilitates sharing the data among many geographically distributed clients. Centralized storage also enables information systems (IS) departments to use highly reliable (sometimes redundant) computer equipment to store the data.
Specialized computers located at the central locations make the data stored on the mass storage devices available to the clients. The specialized computers are commonly referred to as file servers, storage servers, storage appliances, etc., such as storage systems available from Network Appliance, Inc., of Sunnyvale, Calif., and collectively hereinafter referred to as “filers.” Software in the filers and other software in the clients communicate according to well-known protocols to make the data stored on the central storage devices appear to users and to application programs as though the data were stored locally on the clients.
The filers present logical “volumes” to the clients. From the perspective of a client, a volume appears to be a single disk drive. However, the volume can represent the storage space in a single storage device, a redundant array of independent disks (commonly referred to as a “RAID set”), an aggregation of some or all of the storage space in a set of storage devices or some other set of storage space. Each volume is logically divided into a number of individually addressable blocks, the same way a disk is divided into blocks (sectors), although the volume blocks can be larger or smaller than disk blocks. The clients issue input/output (I/O) commands to blocks of the volumes, and the filers receive and process the I/O commands. In response to the I/O commands from the clients, the filers issue I/O commands to the appropriate mass storage device(s) to read or write data on behalf of the clients.
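By way of illustration only, the translation from a volume block to a location on an underlying storage device can be sketched as follows. The sketch assumes a volume formed by simply concatenating whole devices and a volume block of eight disk sectors; the names `Device`, `locate` and `sectors_per_block` are hypothetical and are not taken from any particular filer.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    sectors: int  # capacity of the device in disk sectors


def locate(volume_block, devices, sectors_per_block=8):
    """Map a volume block number to (device name, first sector on that device).

    Assumption: the volume is a plain concatenation of the devices, and each
    volume block occupies `sectors_per_block` consecutive disk sectors.
    """
    if volume_block < 0:
        raise ValueError("volume block numbers start at 0")
    sector = volume_block * sectors_per_block
    for dev in devices:
        # Ignore a trailing partial block at the end of a device.
        usable = dev.sectors - dev.sectors % sectors_per_block
        if sector < usable:
            return dev.name, sector
        sector -= usable
    raise IndexError("volume block is beyond the end of the volume")
```

A client addresses only the volume block number; the choice of device and sector remains internal to the filer. A RAID set or other aggregation would use a different mapping in place of the concatenation assumed here.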
In addition, the filers can perform services that are not visible to the clients. For example, a filer can “mirror” the contents of a volume on one or more other volumes. If one “side” of the mirror fails, the filer can continue I/O operations on the remaining mirror side(s), without affecting the clients.
Some filers allow users to take “snapshots” of volumes. These snapshots enable users and system administrators to access data on the volumes, as that data existed at various times in the past, i.e., when the snapshots were taken. For example, snapshots enable users and system administrators to restore files or directories (hereinafter inclusively referred to simply as file system “components”) that have been inadvertently deleted or altered. One practical method of taking snapshots of a volume involves storing information about only blocks of the volume that have changed since the previous snapshot, as described in U.S. Pat. No. 5,819,292 to Hitz, et al.
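The idea of recording only the blocks that have changed since the previous snapshot can be illustrated with a minimal sketch. It is a simplified model under stated assumptions, not the method of the cited patent: blocks live in in-memory dictionaries, deletions are not modeled, and the names `Volume`, `write`, `take_snapshot` and `read` are hypothetical.

```python
class Volume:
    """Toy volume whose snapshots store only blocks changed since the last one."""

    def __init__(self):
        self._pending = {}    # blocks written since the most recent snapshot
        self._snapshots = []  # one dict per snapshot: blocks changed since the previous one

    def write(self, block_no, data):
        self._pending[block_no] = data

    def take_snapshot(self):
        """Seal the pending changes as a new snapshot and return its index."""
        self._snapshots.append(self._pending)
        self._pending = {}
        return len(self._snapshots) - 1

    def read(self, block_no, snapshot=None):
        """Read a block from the active volume, or as it existed at a snapshot."""
        if snapshot is None:
            layers = [self._pending] + self._snapshots[::-1]
        else:
            layers = self._snapshots[snapshot::-1]
        # Search from the newest relevant layer back to the oldest.
        for layer in layers:
            if block_no in layer:
                return layer[block_no]
        return None  # block was never written as of that point in time
```

Reading through a snapshot index returns the data as it existed when that snapshot was taken, even after the active volume has been overwritten.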
Each time the filer creates a snapshot, the filer stores information about the state of the volume in a different location on the volume. Thus, each snapshot is separately accessible and represents the state of the volume at the time of the snapshot. Each snapshot is time stamped, or some other mechanism, such as a monotonically increasing “generation number,” is used to keep track of the order in which the snapshots were taken. To conserve storage space, the filer keeps only a limited number of snapshots on the volume. After writing a predetermined number of snapshots, the filer typically reclaims and reuses storage space occupied by older snapshots.
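The retention behavior described above can be sketched as follows, assuming a fixed limit on retained snapshots and a generation counter that starts at 1. The class name `SnapshotRing` and its methods are hypothetical, and the sketch tracks only which snapshots are kept, not how their storage space is reused.

```python
from collections import deque
from itertools import count


class SnapshotRing:
    """Keep at most `limit` snapshots, ordered by a monotonically increasing generation number."""

    def __init__(self, limit):
        self._limit = limit
        self._next_generation = count(1)
        self._kept = deque()  # (generation, state) pairs, oldest first

    def add(self, state):
        """Record a snapshot; return its generation and the generations reclaimed."""
        generation = next(self._next_generation)
        self._kept.append((generation, state))
        reclaimed = []
        while len(self._kept) > self._limit:
            reclaimed.append(self._kept.popleft()[0])  # the oldest snapshot goes first
        return generation, reclaimed

    def generations(self):
        return [generation for generation, _ in self._kept]
```

Because the generation number only increases, the order of the snapshots remains unambiguous even where time stamps are unavailable or unreliable.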
The filer can access the most recent snapshot to bring the volume online, so that clients can access the volume after a restart operation. Bringing a volume online is commonly referred to as “mounting” the volume. In some systems, a log of file system transactions is maintained in non-volatile or battery-backed storage, so that transactions can be “replayed” from the log to bring the restored snapshot up to date with the most current information.
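Mounting from a snapshot and replaying a log can be sketched as follows. The sketch assumes that each log entry is a simple (block number, data) write recorded after the snapshot was taken; the function name `mount` and the entry format are hypothetical.

```python
def mount(snapshot_blocks, log):
    """Return the volume state after replaying logged writes over a snapshot.

    Assumption: `log` holds (block_no, data) pairs in the order in which the
    transactions were originally recorded.
    """
    state = dict(snapshot_blocks)  # start from the most recent snapshot
    for block_no, data in log:     # replay in the original order
        state[block_no] = data
    return state
```

The snapshot itself is left unmodified; only the mounted copy is brought up to date.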
Issues involved in operating a centralized storage system include security of information and the conditions under which access to information is permitted. For example, important system files or directories used in the operation of file systems or other network storage devices are typically inaccessible to a regular user to prevent accidental or unauthorized manipulation of system information. Other information of a sensitive nature in a centralized network storage system may be restricted to a certain group of users that has been granted permission to access or manipulate the information. Simple examples of sensitive information include payroll, sales figures, contact lists, confidential information and so forth. Accordingly, file systems typically provide a mechanism for setting permissions relating to information access, which can typically be allocated on an individual or group basis. File systems also tend to include capabilities for modifying attributes to permit special treatment of components. For example, a component identified as a system file may have the attribute of being hidden from all users except for a system administrator. File system components that include sensitive information can have an attribute set to hide the component from all users except those with selected permissions. Similarly, directories are sometimes set up with specific access permissions, so that information in the directories receives special treatment, such as being universally available, or available only for specific purposes. In any case, access to information can be configured with permissions on an automatic or manual basis, so that permissions can be applied to large numbers of components at once.
A difficulty that arises in the case of file systems that include snapshots is that the permissions set at a given instant may not be reflected in snapshots already taken. For example, if a file is given a hidden attribute at a specific point in time, snapshots taken prior to setting the hidden attribute include versions of the file in which it is not identified as hidden. Similarly, permissions to access, view, modify or list components may be set erroneously, or components may be inadvertently placed in directories that are publicly accessible. Even if such errors are fixed in the active file system, snapshots taken prior to correction of the error include the components with the erroneous settings. If the snapshot components are accessed, a user may gain access, accidentally or without authorization, to information of a sensitive or confidential nature. A particularly difficult aspect of this problem stems from the read-only nature of the snapshots, which prevents information attributes from being easily changed. Accordingly, the potential for compromised sensitive information is extensive and difficult to correct in the context of a file backup system based on snapshots. It may be possible to add functionality to snapshots to account for sensitive information with specific attributes. However, the added complexity detracts from the simplicity and speed of the pointer-driven, consistency point file system layout. Such a modification would therefore be impractical in a file system dependent on a snapshot-type recovery configuration.
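The difficulty can be illustrated with a short sketch in which a snapshot is modeled as a read-only copy of the name-to-attributes mapping. All names here (`Attrs`, `take_snapshot`, `listing`, the file `payroll.xls` and the groups) are hypothetical, and the model is an assumption made for illustration rather than a description of any actual snapshot layout.

```python
from dataclasses import dataclass, replace
from types import MappingProxyType


@dataclass(frozen=True)
class Attrs:
    hidden: bool
    readers: frozenset  # groups allowed to read the component


def take_snapshot(active):
    """Freeze the current name -> Attrs mapping as a read-only view."""
    return MappingProxyType(dict(active))


def listing(view, group):
    """Names that a member of `group` can see in the given view."""
    return sorted(
        name for name, attrs in view.items()
        if not attrs.hidden and (group in attrs.readers or "everyone" in attrs.readers)
    )


# A sensitive file is created with erroneous, overly broad settings.
active = {"payroll.xls": Attrs(hidden=False, readers=frozenset({"everyone"}))}
snap = take_snapshot(active)  # snapshot taken before the error is noticed

# The error is corrected in the active file system only.
active["payroll.xls"] = replace(
    active["payroll.xls"], hidden=True, readers=frozenset({"hr"})
)

print(listing(active, "sales"))  # view of the corrected active file system
print(listing(snap, "sales"))    # view through the earlier snapshot
```

In this model the correction never reaches the snapshot, and an attempt to assign into `snap` raises a `TypeError`, mirroring the read-only nature of snapshots noted above.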