Today's computers require memory to store both the instructions of programs and the data that those programs take as input or produce as output. This memory is conventionally divided into two types, primary storage and secondary storage. Primary storage is that which is immediately accessible by the computer or microprocessor, and is typically though not exclusively used as temporary storage. It is, in effect, the short-term memory of the computer.
Similarly, secondary storage can be seen as the long-term computer memory. This form of memory maintains information that must be kept for a long time, and may be orders of magnitude larger and slower than primary storage. Secondary memory is typically provided by devices such as magnetic disk drives, optical drives, and so forth. These devices present to the computer's operating system a low-level interface in which individual storage subunits may be individually addressed. These subunits are often generalized by the computer's operating system into “blocks,” and such devices are often referred to as “block storage devices.”
Block storage devices are not typically accessed directly by users or (most) programs. Rather, programs or other components of the operating system organize block storage in an abstract fashion and make this higher-level interface available to other software components. The most common higher-level abstraction thus provided is a “file system.” File systems include, for example, document management systems (in systems such as these certain files are sometimes referred to as documents) including Microsoft SharePoint, EMC Documentum, IBM FileNet, etc.; archive systems (in systems such as these certain files are sometimes referred to as objects) including Symantec's Enterprise Vault, EMC EmailXtender, Mimosa, AXS-ONE, etc.; email servers (in systems such as these certain files are sometimes referred to as emails) including, for example, Microsoft Exchange, IBM Lotus Notes, etc.; desktop and notebook computers; and Content Addressable Storage Platforms (in systems such as these certain files are sometimes referred to as objects), including, for example, EMC's Centera, IBM's DR550, NetApp's SnapLock, Hitachi's HDS, etc.
In a file system, the storage resource is organized into directories, files, and other objects. Associated with each file, directory, or other object is typically a name; some explicit or “static” metadata such as its owner, size, and so on; its contents or data; and an arbitrary and open set of implicit or “dynamic” metadata such as the file's content type, checksum, and so on. As is known in the art, metadata is “data about data.” Directories are containers that provide a mapping from directory-unique names to other directories and files. Files are containers for arbitrary data. Because directories may contain other directories, the file system client (human user, software application, etc.) perceives the storage to be organized into a quasi-hierarchical structure or “tree” of directories and files. This structure may be navigated by providing the unique names necessary to identify a directory inside another directory at each traversed level of the structure; hence, the organizational structure of names is sometimes said to constitute a “file system namespace.”
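By way of illustration only, the directory-and-file structure described above may be sketched as follows. The sketch assumes a purely in-memory model; the class names `File` and `Directory`, the function `resolve`, and the example names such as `notes.txt` are hypothetical and are not drawn from any particular file system.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class File:
    """A container for arbitrary data, with one item of static metadata."""
    data: bytes = b""
    owner: str = "unknown"

    @property
    def size(self) -> int:
        # Size is derived from the contents rather than stored separately.
        return len(self.data)


@dataclass
class Directory:
    """A mapping from directory-unique names to files or other directories."""
    entries: Dict[str, Union["Directory", File]] = field(default_factory=dict)


def resolve(root: Directory, path: str) -> Union[Directory, File]:
    """Walk the namespace one name at a time, starting at the root."""
    node: Union[Directory, File] = root
    for name in path.strip("/").split("/"):
        if name == "":
            continue  # the path "/" names the root itself
        if not isinstance(node, Directory):
            raise NotADirectoryError(name)
        if name not in node.entries:
            raise FileNotFoundError(path)
        node = node.entries[name]
    return node


# A small hypothetical namespace: /home/alice/notes.txt
root = Directory({
    "home": Directory({
        "alice": Directory({"notes.txt": File(b"hello", owner="alice")}),
    }),
})
found = resolve(root, "/home/alice/notes.txt")
```

Each step of the loop in `resolve` supplies one directory-unique name, which corresponds to the level-by-level navigation of the namespace described above.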
File systems support a finite set of operations (such as create, open, read, write, close, delete, etc.) on each of the abstract objects which the file system contains. For each of these operations, the file system takes a particular action in accordance with the operation in question and the data provided in the operation. The sequence of these operations over time effects changes to the file system structure, data, and metadata in a predictable way. The set of file system abstractions, operations, and predictable results for particular actions can be considered as “semantics” for the file system. While particular file systems differ slightly in their precise semantics, in general file systems implement a common semantics as a subset of their full semantics. This approximately equivalent common semantics can be regarded as the “conventional” or “traditional” file system semantics.
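As a non-limiting example, the conventional operations named above may be exercised through the standard `os` module of Python against a local file system. The file name `example.txt` and the temporary working directory are assumptions made solely for illustration.

```python
import os
import tempfile

# Work inside a fresh temporary directory so nothing else is disturbed.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "example.txt")

# create + open + write + close
fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_EXCL, 0o600)
written = os.write(fd, b"hello")
os.close(fd)

# The write changed the file's metadata (its size) in a predictable way.
size_after_write = os.stat(path).st_size

# open + read + close
fd = os.open(path, os.O_RDONLY)
content = os.read(fd, 100)
os.close(fd)

# delete, then confirm the name no longer exists in the namespace
os.remove(path)
exists_after_delete = os.path.exists(path)
os.rmdir(workdir)
```

The same sequence of operations yields the same resulting structure, data, and metadata each time it is run, which is the predictability that the term “semantics” refers to above.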
Storage resources accessed by a computer, its software, or its users need not be “directly” attached to that computer. Various mechanisms exist for allowing software or users on one computing device to access over a network, and use, storage assets that are actually located on another remote computer or device. There are many types of remote storage access facilities, but they may without loss of generality be regarded as falling into one of two classes: block-level and file-level. File-level remote storage access mechanisms extend the file system interface and namespace across the network, enabling clients to access and utilize the files and directories as if they were local. Such systems are therefore typically called “network file systems.” Note that the term “network file system” is used herein generally to refer to all such systems, although there is also a specific network file system called Network File System or NFS, originally developed at Sun Microsystems and now in the public domain. When discussing the general class of such systems herein, the lower-case term, for example, “networked file systems,” will be used. When discussing the specific Sun-developed networked file system, the fully capitalized version of the term or its acronym, for example, “Network File System or NFS,” will be used.
Networked file systems enable machines to access the file systems that reside on other machines. Architecturally, this leads to the following distinctions: in the context of a given file system, one machine plays the role of a file system “origin server” (alternatively, “file server” or “server”) and another plays the role of a file system client. The two are connected via a data transmission network. The client and server communicate over this network using standardized network protocols; the high-level protocols which extend the file system namespace and abstractions across the network are referred to as “network file system protocols.” Exemplary network file system protocols include the Common Internet File System (CIFS), the aforementioned NFS, the Novell® NetWare file sharing system, Apple® AppleShare®, the Andrew File System (AFS), and the Coda File System (Coda®). These network file system protocols share an approximately equivalent semantics and set of abstractions, but differ in their details and are not interoperable. Thus, to use a file system from a file server, a client must “speak the same language,” i.e., have software that implements the same protocol that the file server uses.
A file server indicates which portions of its file systems are available to remote clients by defining “exports” or “shares.” To access a particular remote file server's file systems, a client must then make the exports or shares of interest available by including them by reference as part of its own file system namespace. This process is referred to as “mounting” or “mapping (to)” a remote export or share. By mounting or mapping, a client establishes a tightly coupled relationship with the particular file server. The overall architecture can be characterized as a “two-tier” client-server system, since the client communicates directly with the server which “has” the resources of interest to the client.
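The effect of mounting may be sketched, purely for illustration, as a table that ties a location in the client's namespace to an export on a particular server. The server names, export paths, mount points, and the function `locate` below are all hypothetical; actual network file system clients maintain this association inside the operating system and differ in their details.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mount table: local mount point -> (server, exported path).
mounts: Dict[str, Tuple[str, str]] = {
    "/mnt/projects": ("fileserver1", "/exports/projects"),
    "/mnt/projects/archive": ("fileserver2", "/exports/archive"),
}


def locate(path: str) -> Optional[Tuple[str, str]]:
    """Map a client-side path to (server, server-side path), if mounted."""
    best: Optional[str] = None
    for mount_point in mounts:
        # Match whole path components only, and prefer the deepest mount.
        if path == mount_point or path.startswith(mount_point + "/"):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        return None  # the path lies outside every mounted export
    server, export = mounts[best]
    return server, export + path[len(best):]


on_first = locate("/mnt/projects/a/b.txt")
on_second = locate("/mnt/projects/archive/2001.tar")
not_mounted = locate("/home/alice/notes.txt")
```

Because each entry names exactly one server, every path under a mount point is bound to that server, which reflects the tightly coupled, two-tier relationship described above.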
In addition to organizing and maintaining the relationships between file system clients and file systems, additional challenges exist in managing access to and utilization of file systems. One of the main concerns relates to the implementation of security or access controls with respect to the data stored in conjunction with these various file systems. More particularly, after a review of the above discussion it may be ascertained that a wide variety of users at a wide variety of locations may be able to request objects (for example, files or other things stored) within one or more file systems. In many cases, however, it may be desirable to control the access of the various users to the data within the file systems.
This concern may be particularly germane when allowing access to certain data by those outside of an organization associated with the file systems, for example various auditors, compliance officers, legal counsel, etc. In fact, it may be desired to limit the access of such users to only those files that comprise data related to their specific jobs or tasks.
Imposing these sorts of access controls on such users (or other users for that matter) may present a significant challenge because of both the organization of the typical file system and the fact that the objects stored in conjunction with a file system may comprise unstructured or semi-structured data. As such, it may only be possible to control access of a user to file system objects based upon the location of those objects.
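The limitation of location-based control may be illustrated by the following minimal sketch, in which access is decided solely from the directory in which an object resides. The user name, the directory paths, and the function `may_access` are hypothetical and are provided only as an example of such a scheme.

```python
from typing import Dict, List

# Hypothetical policy: each user is granted whole directories.
allowed_dirs: Dict[str, List[str]] = {
    "auditor": ["/finance/reports"],
}


def may_access(user: str, path: str) -> bool:
    """Grant access only if the object lies under an allowed directory."""
    return any(
        path == directory or path.startswith(directory + "/")
        for directory in allowed_dirs.get(user, [])
    )


# A file in the granted location is accessible regardless of its contents.
in_granted_location = may_access("auditor", "/finance/reports/q1.xls")

# A file stored elsewhere is refused even if its contents are relevant
# to the auditor's task, because only its location is consulted.
relevant_but_elsewhere = may_access("auditor", "/home/bob/q1-draft.xls")
```

In this sketch the decision never consults the contents or metadata of the object, so it can neither exclude unrelated files kept in a granted directory nor include related files kept elsewhere.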
Accordingly, improved systems and methods for controlling access to objects of a file system are desired.