1. Field of the Invention
Embodiments of the present invention relate to security for a data storage and retrieval system, and in particular, to security systems and methods involving filtering information in the data storage and retrieval system.
2. Description of Related Art
A data storage and retrieval system typically includes one or more specialized computers (variously referred to as file servers, storage servers, storage appliances or the like, and collectively hereinafter referred to as “filers”). Each filer has a storage operating system for coordinating activities between filer components and external connections. Each filer also includes one or more storage devices, which can be connected with other filers, such as via a storage network or fabric. Exemplary storage devices include individual disk drives, groups of such disks and redundant arrays of independent (or inexpensive) disks (RAID groups). The filer is also connected via a computer network to one or more clients, such as computer workstations, application servers or other computers. Software in the filers and other software in the clients cooperate to make the storage devices, or groups thereof, appear to users of the workstations and to application programs being executed by the application servers, etc., as though the storage devices were locally connected to the clients.
Filers can also perform other services that are not visible to the clients. For example, a filer can treat all the storage space in a group of storage devices as an “aggregate.” The filer can then treat a subset of the storage space in the aggregate as a “volume.”
Filers include software that enables clients to treat each volume as though it were a single storage device. The clients issue input/output (I/O) commands to read data from or write data to the volume. The filer accepts these I/O commands; ascertains which storage device(s) are involved; issues I/O commands to the appropriate storage device(s) of the volume to fetch or store the data; and returns status information or data to the clients.
Typically, the filer manages storage space on the storage devices on a per-volume basis. Disk blocks, which represent contiguous physical units of storage, are composed of a number of bytes and are used as the fundamental storage constructs to store files. A file may be represented as a number of blocks of storage, depending upon the size of the file. The filer keeps track of information related to the volume, files and blocks. For example, the filer tracks the size of the volume, the volume owner, access protection for the volume, a volume ID and disk numbers on which the volume resides. The filer also keeps track of directories, directory and file owners and access protections.
Additional information about files maintained by the filer may include file names, file directories, file owners and protections, such as access rights by various categories of users. The filer may also track file data for reference purposes, which file data may include file handles, which are internal file identifiers, and read offsets and read amounts, which determine location and size of a file. The filer also tracks information about blocks that store data and make up files. For example, the filer may track blocks that are allocated to files, blocks that are unallocated, or free to be used, as well as information about the blocks. In addition, the filer may track internal block data such as block checksums, a block physical location, a block physical logical offset, as well as various special purpose blocks.
An example of a special purpose block is a superblock, which is used in mounting a volume and includes all the information needed to mount the volume in a consistent state. The superblock contains a root data structure commonly known as an index node (“inode”), from which a tree of inodes is located and used to locate other blocks in the volume. The superblock contains information about the volume, such as volume configuration or a volume ID. The volume information may be read into a cache when a given associated volume is mounted.
The above-described information tracked by the filer collectively constitutes a “file system,” as is well-known in the art. For example, the filer can implement the Write Anywhere File Layout (WAFL®) file system, which is available from Network Appliance, Inc. of Sunnyvale, Calif. Alternatively, other file systems can be used.
According to the exemplary WAFL file system, storage space on a volume is divided into a plurality of 4 kilobyte (KB) blocks. Each block has a volume block number (VBN), which is used as an address of the block. Collectively, the VBNs of a volume can be thought of as defining an address space of blocks on the volume.
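The block addressing described above can be illustrated with a short sketch. It assumes, purely for illustration, that VBNs are numbered consecutively from zero and that a byte offset within the volume maps linearly onto them; the function names are hypothetical and are not part of any WAFL interface.

```python
BLOCK_SIZE = 4 * 1024  # 4 KB blocks, as in the example above


def vbn_for_offset(volume_byte_offset: int) -> int:
    """Return the volume block number (VBN) that contains a byte offset.

    Assumes VBNs start at 0 and map linearly onto byte offsets.
    """
    if volume_byte_offset < 0:
        raise ValueError("offset must be non-negative")
    return volume_byte_offset // BLOCK_SIZE


def blocks_needed(file_size_bytes: int) -> int:
    """Return how many blocks are required to hold a file of the given size."""
    if file_size_bytes < 0:
        raise ValueError("size must be non-negative")
    # Ceiling division: a partially filled final block still occupies a block.
    return -(-file_size_bytes // BLOCK_SIZE)
```

Under these assumptions, a 4,097-byte file occupies two blocks, because the final byte spills into a second block.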
Each file on the volume is represented by a corresponding inode. Files are cataloged in a hierarchical set of directories, beginning at a root directory. Each directory is implemented as a special file stored on the same volume as the file(s) listed in the directory. Directories list files by name (typically alphabetically), to facilitate locating a desired file. A directory entry for a file contains the name of the file, the file ID of the inode for the file, access permissions, etc. The collection of directory names and file names is typically referred to as a name space. Various name spaces may be created that have specific purposes, where each name space has a root directory called a “metaroot.”
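As a minimal sketch of how directory entries can lead from a path name to an inode, the following example models each directory as a mapping from names to inode numbers. The `Inode` class, the `resolve` function and the inode numbers are hypothetical and greatly simplified; a real inode also carries ownership, permissions and block pointers.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Inode:
    number: int
    is_directory: bool = False
    # For a directory: entry name -> inode number of the named file.
    entries: Dict[str, int] = field(default_factory=dict)


def resolve(path: str, inode_table: Dict[int, Inode], root: int) -> Optional[int]:
    """Walk the directory hierarchy from the root; return the inode number or None."""
    current = root
    for name in (part for part in path.split("/") if part):
        node = inode_table[current]
        if not node.is_directory or name not in node.entries:
            return None  # not a directory, or no such entry
        current = node.entries[name]
    return current
```

For example, with a root directory (inode 1) containing `home` (inode 2), which in turn contains `notes.txt` (inode 3), resolving `/home/notes.txt` visits inodes 1 and 2 and yields 3.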
The inodes, directories and information about which blocks of the volume are allocated, free, etc., collectively form system information metadata. An allocated block is one that is assigned to store specific data, while a free block is one that is unallocated and available to be assigned to store specific data. The metadata may be public or private, that is, it may or may not typically be made available to users. Some metadata is stored in specially named files stored on the volume and, in some cases, in specific locations on the volume, such as in a “volume information block,” as is well known in the art. When one or more blocks are to be retrieved from disk to satisfy an I/O request made by a client, the operating system may retrieve the blocks to a system cache. The operating system executes a routine to communicate with appropriate device drivers to cause the desired blocks to be read from the storage device(s) into the cache.
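The distinction between allocated and free blocks can be sketched with a simple in-memory allocation map. The `BlockMap` class and its first-fit allocation policy are assumptions made for illustration only; the text above does not specify how a filer records or searches this information.

```python
class BlockMap:
    """Tracks which blocks of a volume are allocated (True) or free (False)."""

    def __init__(self, total_blocks: int) -> None:
        self._allocated = [False] * total_blocks

    def allocate(self) -> int:
        """Mark the lowest-numbered free block as allocated and return its number."""
        for vbn, used in enumerate(self._allocated):
            if not used:
                self._allocated[vbn] = True
                return vbn
        raise RuntimeError("no free blocks on volume")

    def free(self, vbn: int) -> None:
        """Return a block to the free pool."""
        self._allocated[vbn] = False

    def is_allocated(self, vbn: int) -> bool:
        return self._allocated[vbn]

    def free_count(self) -> int:
        return self._allocated.count(False)
```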
Some storage operating systems implemented in filers include a capability to take “snapshots” of an active file system. A snapshot is a persistent point-in-time image of the active file system that enables quick recovery of data after data has been corrupted, lost, or altered. Snapshots can be created by copying the data at each predetermined point in time to form a consistent image. Snapshots can also be created virtually by using a pointer to form the image of the data. A snapshot can also be used as a storage space-conservative mechanism, generally composed of read-only data structures that enable a client or system administrator to obtain a copy of all or a portion of the file system, as of a particular time in the past, i.e., when the snapshot was taken.
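The pointer-based form of snapshot can be sketched as follows. The example assumes a simplified volume in which data blocks are never overwritten in place, so that a snapshot only needs to preserve the set of pointers in effect when it was taken. The `Volume` class and its methods are hypothetical and are not intended to describe any particular storage operating system.

```python
from typing import Dict, Optional


class Volume:
    def __init__(self) -> None:
        self._blocks: Dict[int, bytes] = {}   # block id -> data, never overwritten
        self._next_id = 0
        self.active: Dict[int, int] = {}      # logical block index -> block id
        self.snapshots: Dict[str, Dict[int, int]] = {}  # name -> saved pointer map

    def write(self, index: int, data: bytes) -> None:
        """Store data in a new block and repoint the active file system to it."""
        block_id = self._next_id
        self._next_id += 1
        self._blocks[block_id] = data
        self.active[index] = block_id

    def take_snapshot(self, name: str) -> None:
        """Copy only the pointers; the data blocks themselves are shared."""
        self.snapshots[name] = dict(self.active)

    def read(self, index: int, snapshot: Optional[str] = None) -> bytes:
        pointers = self.active if snapshot is None else self.snapshots[snapshot]
        return self._blocks[pointers[index]]
```

In this sketch, writing to a block after a snapshot is taken leaves the snapshot pointing at the earlier data, so the earlier contents remain readable through the snapshot without having been copied.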
As part of ordinary operation of the network storage system, the operating system employs file system filters to realize certain features. Filters may perform tasks related to filer operations, including the storage and retrieval of information. Filter operations may include permitting or preventing access to files or file metadata, for example. Filter tasks may include capturing backup information for preserving data or data transfer or conversion, as may be the case during migrations or upgrades. Examples of file system filters include antivirus products that examine I/O operations for virus signatures or activity, filters that enforce user access permissions, encryption products and backup agents.
File system filters are typically organized as kernel-mode applications that are dedicated to a specific purpose. The file system filter is typically arranged in the I/O data path, to permit the filter to intercept I/O requests and responses. This configuration for a file system filter exhibits several drawbacks, including impacting control flow for I/O operations, little or no control of filter sequence or load order and systematic issues with filters being attached to the operating system kernel. There is also typically no convention with respect to filter construction, so that the filters arranged in a sequence may conduct operations that cause conflicts or errors in the operating system, potentially leading to crashes. Filters may also be constructed to generate their own I/O requests, and may be reentrant, leading to stack overflow issues and significant performance degradation. There may be redundancy among groups of filters for common operations, such as obtaining file/path names, generating I/O requests or attaching to mounted volumes and redirectors, as well as buffer management. Some of these common tasks are performed inconsistently among the various filters, leading to inefficiency and performance degradation.
File system filter architectures have been proposed to overcome some of the above-described drawbacks. According to one configuration, a filter manager is provided between an I/O manager and data storage devices. The filter manager provides services for execution of filter functions while overcoming some of the above-described drawbacks. For example, the filter manager can provide a filter callback architecture that permits calls and responses rather than chained dispatch routines. A call to a filter invokes the filter, often with passed parameters, while a callback represents instructions passed to the filter for execution, often to call other procedures or functions. For example, a callback to a filter may be in the form of a pointer to a function or procedure passed to the filter. The filter manager can also provide uniform operations for common tasks such as generating I/O requests, obtaining file/path names, managing buffers and attaching to mounted volumes and redirectors. In addition, the filter manager can maintain a registry of filters and their specific functionality requirements, such as specific I/O operations. By maintaining a registry of filters, the filter manager can load and unload filters with consistent results and states. In addition, the filters can be enumerated for various filter management functions.
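The registry and callback arrangement described above can be sketched in simplified form. The `FilterManager` class, its method names and the dictionary-based request are assumptions made for illustration; they do not correspond to the interface of any actual filter manager.

```python
from typing import Callable, Dict, List

Request = Dict[str, str]
Callback = Callable[[Request], bool]  # returns False to block the request


class FilterManager:
    def __init__(self) -> None:
        # Filter name -> {I/O operation -> callback}; dicts preserve load order.
        self._registry: Dict[str, Dict[str, Callback]] = {}

    def load(self, name: str, callbacks: Dict[str, Callback]) -> None:
        """Register a filter together with the operations it wants to see."""
        if name in self._registry:
            raise ValueError(f"filter {name!r} is already loaded")
        self._registry[name] = dict(callbacks)

    def unload(self, name: str) -> None:
        self._registry.pop(name, None)

    def enumerate_filters(self) -> List[str]:
        return list(self._registry)

    def dispatch(self, request: Request) -> bool:
        """Call each interested filter in load order; stop at the first denial."""
        operation = request["op"]
        for callbacks in self._registry.values():
            callback = callbacks.get(operation)
            if callback is not None and not callback(request):
                return False
        return True
```

Because each filter registers only the operations it handles, the manager can skip filters that have no interest in a given request, and it controls the order in which the remaining filters are invoked.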
There are several challenges that are not addressed by the above-described filter manager in a file system connected through a network. For example, the above-described filter manager is specific to a local filer, where all requests and responses represent local file access activity. Accordingly, the above-described filter manager is unable to take advantage of available information contained in the request that might indicate request source or aid in request processing. In addition, the above-described filter manager is file based, and thus unable to provide filtering services for other types of access methods. For example, one type of advantageous data access protocol operates on a block basis, rather than a file basis. Data access protocols such as iSCSI and Fibre Channel permit direct access to data blocks in a file system, and are unrecognized by the above-described local filer and filter manager. The filter manager is also unable to identify external sources or addresses for requests and responses that might aid in request processing, such as are available in the Internet Protocol (IP), for example. The filter manager is unable to pass high-level protocol information to filters through the filter architecture to obtain the advantages that might be possible with the additional information available for processing requests and responses by the filters. Accordingly, the filter manager does not have context information for the request or response to be processed. Context information refers to information known to a particular object, such as a filter, that the object can use or produce as a result of object processing. For example, CIFS and NFS file protocol based requests include a client source address, which can provide a context for a filter processing the request or response. The above-described filter manager does not provide such a context.
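To illustrate the kind of context information discussed above, the following sketch shows a filter that uses a client source address accompanying a request. The `RequestContext` type, the `source_filter` function and the example network range are hypothetical; they merely suggest what a filter could do if such context were passed to it.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestContext:
    protocol: str        # e.g. "CIFS" or "NFS"
    client_address: str  # source address carried with the request


def source_filter(context: RequestContext, allowed_network: str = "10.0.0.0/8") -> bool:
    """Permit the request only if the client address lies in the allowed network."""
    network = ipaddress.ip_network(allowed_network)
    return ipaddress.ip_address(context.client_address) in network
```

A filter manager that handles only local file access activity has no such address to supply, so a filter of this kind could not be written against it.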
In addition, the filter manager is a kernel-mode application, and is synchronous to the extent that the filter manager has limitations related to processing threads that must not act to preserve data or I/O operations in the event of a load or unload command.
One important aspect of managing filer operations involves security of the filer and the data stored within the file system. One approach to implementing security in a filer involves assigning protections to various files, directories or categories of data in the file system. For example, files, directories or data may be assigned permissions individually or as a group, such as by being located in a particular directory that has specific permissions. Alternatively, or in addition, users may be assigned to specific groups that have defined access to certain files, directories or data. Users who are not part of the group do not have the same access permissions to the files, directories, or data. Individual files or directories may also be assigned permissions to permit or prevent access by individual users or groups of users. Access to a file system often occurs over a network that connects multiple users with multiple data storage devices organized by filers. Networks typically require access permissions through logins that identify a user and provide security in the form of the requirement of an authorized user ID and/or password.
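The user and group permissions described above can be sketched as a simple access check. The access control list layout, the `group:` prefix convention and the `can_access` function are assumptions chosen for illustration and do not reflect any particular file system's permission model.

```python
from typing import Dict, Set


def can_access(user: str, user_groups: Set[str],
               acl: Dict[str, Set[str]], action: str) -> bool:
    """Return True if the user, or a group the user belongs to, is granted the action.

    The acl maps an action (e.g. "read") to the principals granted it, where a
    principal is either a user name or a group written as "group:<name>".
    """
    granted = acl.get(action, set())
    if user in granted:
        return True
    return any(f"group:{group}" in granted for group in user_groups)
```

For example, if a file's list grants read access to the group `engineering` and write access only to the user `alice`, a member of `engineering` other than `alice` can read the file but cannot write to it.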
At times, intruders may attempt to access the network or connected file system without authorization. Intruders may adopt a number of different techniques to attempt to gain access to the network or file system, for example by impersonating an authorized user, attempting to overcome security provisions, such as logins or access permissions, or by modifying security data to obtain access to the network or file system. One difficulty identified in security for a file system is that each denial of access through one or more security provisions allows the intruder to improve their knowledge of the security features of the network or file system. As the intruder attains greater knowledge about the security of the network or file system, they are more likely to successfully gain unauthorized access.