A storage system typically comprises one or more storage devices into which information may be entered, and from which information may be obtained, as desired. The storage system includes a storage operating system that functionally organizes the system by, inter alia, invoking storage operations in support of a storage service implemented by the system. The storage system may be implemented in accordance with a variety of storage architectures including, but not limited to, a network-attached storage environment, a storage area network, and a disk assembly directly attached to a client or host computer. The storage devices are typically disk drives organized as a disk array, wherein the term “disk” commonly describes a self-contained rotating magnetic media storage device. The term disk in this context is synonymous with hard disk drive (HDD) or direct access storage device (DASD).
The storage operating system of the storage system may implement a high-level module, such as a file system, to logically organize the information stored on volumes as a hierarchical structure of data containers, such as files and logical units. For example, each “on-disk” file may be implemented as a set of data structures, i.e., disk blocks, configured to store information, such as the actual data for the file. These data blocks are organized within a volume block number (vbn) space that is maintained by the file system. The file system may also assign each data block in the file a corresponding “file offset” or file block number (fbn). The file system typically assigns sequences of fbns on a per-file basis, whereas vbns are assigned over a larger volume address space. The file system organizes the data blocks within the vbn space as a “logical volume”; each logical volume may be, although is not necessarily, associated with its own file system.
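The relationship between per-file fbns and volume-wide vbns can be illustrated with a minimal sketch. The `Volume` and `File` classes and the sequential allocator below are hypothetical names chosen for illustration only; they are not drawn from any particular file system.

```python
class Volume:
    """Toy volume: one flat vbn address space shared by all files."""

    def __init__(self):
        self.blocks = {}    # vbn -> block contents
        self.next_vbn = 0   # next unallocated volume block number

    def allocate(self, data):
        """Store data in the next free volume block and return its vbn."""
        vbn = self.next_vbn
        self.next_vbn += 1
        self.blocks[vbn] = data
        return vbn


class File:
    """Toy file: fbns are assigned sequentially on a per-file basis."""

    def __init__(self, volume):
        self.volume = volume
        self.block_map = []  # list index is the fbn, value is the vbn

    def append_block(self, data):
        """Append a data block and return its fbn within this file."""
        self.block_map.append(self.volume.allocate(data))
        return len(self.block_map) - 1

    def read_block(self, fbn):
        """Translate fbn to vbn, then fetch the block from the volume."""
        return self.volume.blocks[self.block_map[fbn]]


vol = Volume()
file_a, file_b = File(vol), File(vol)
file_a.append_block(b"a0")  # file_a fbn 0
file_b.append_block(b"b0")  # file_b fbn 0
file_a.append_block(b"a1")  # file_a fbn 1
```

In this sketch both files have an fbn 0, yet each fbn resolves to a distinct vbn, because fbns are scoped to a file while vbns are scoped to the volume.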
A known type of file system is a write-anywhere file system that does not overwrite data on disks. If a data block is retrieved (read) from disk into a memory of the storage system and “dirtied” (i.e., updated or modified) with new data, the data block is thereafter stored (written) to a new location on disk to optimize write performance. A write-anywhere file system may initially assume an optimal layout such that the data is substantially contiguously arranged on disks. The optimal disk layout results in efficient access operations, particularly for sequential read operations, directed to the disks. An example of a write-anywhere file system that is configured to operate on a storage system is the Write Anywhere File Layout (WAFL™) file system available from Network Appliance, Inc., Sunnyvale, Calif.
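The write-anywhere behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the WAFL implementation: `WriteAnywhereStore` is a hypothetical name, free blocks are handed out sequentially, and old blocks are never reclaimed.

```python
class WriteAnywhereStore:
    """Toy store in which a modified block is never overwritten in place."""

    def __init__(self):
        self.disk = {}       # vbn -> block contents
        self.block_map = {}  # fbn -> vbn currently holding that block
        self.next_free = 0   # assumption: free space is consumed sequentially

    def write(self, fbn, data):
        """Write data for fbn to a fresh location; return (old_vbn, new_vbn)."""
        new_vbn = self.next_free
        self.next_free += 1
        self.disk[new_vbn] = data
        old_vbn = self.block_map.get(fbn)  # None on the first write
        self.block_map[fbn] = new_vbn      # repoint the file at the new block
        return old_vbn, new_vbn

    def read(self, fbn):
        return self.disk[self.block_map[fbn]]


store = WriteAnywhereStore()
store.write(0, b"version 1")
old_vbn, new_vbn = store.write(0, b"version 2")  # the "dirtied" block
```

After the second write, the block map points at the new location while the previous contents remain untouched at the old one.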
The storage system may be further configured to operate according to a client/server model of information delivery to thereby allow many clients to access data containers, such as files and logical units, stored on the system. In this model, the client may comprise an application, such as a database application, executing on a computer that “connects” to the storage system over a computer network, such as a point-to-point link, shared local area network (LAN), wide area network (WAN), or virtual private network (VPN) implemented over a public network such as the Internet. Each client may request the services of the storage system by issuing file-based and block-based protocol messages (in the form of packets) to the system over the network.
A plurality of storage systems may be interconnected to provide a storage system cluster configured to service many clients. Each storage system or node may be configured to service one or more volumes, wherein each volume stores one or more data containers. Communication among the nodes involves the exchange of information between two or more entities interconnected by communication links. These entities are typically software programs executing on the nodes. The nodes communicate by exchanging discrete packets or messages of information according to predefined protocols. In this context, a protocol consists of a set of rules defining how the nodes interact with each other.
Each node generally provides its services through the execution of software modules, such as processes. A process is a software program that is defined by a memory address space. For example, an operating system of the node may be implemented as a single process with a large memory address space, wherein pieces of code within the process provide operating system services, such as process management. Yet, the node's services may also be implemented as separately-scheduled processes in distinct, protected address spaces. These separate processes, each with its own process address space, execute on the node to manage resources internal to the node and, in the case of a database or network protocol, to interact with a variety of network elements.
Services that are part of the same process address space communicate by accessing the same memory space. That is, information exchanged between services implemented in the same process address space is not transferred, but rather may be accessed in a common memory. However, communication among services that are implemented as separate processes is typically effected by the exchange of messages. For example, information exchanged between different address spaces of processes is transferred as one or more messages between different memory spaces of the processes. A known message-passing mechanism provided by an operating system to transfer information between process address spaces is the Inter Process Communication (IPC) mechanism.
Resources internal to the node may include communication resources that enable a process on one node to communicate over the communication links or network with another process on a different node. The communication resources include the allocation of memory and data structures, such as messages, as well as a network protocol stack. The network protocol stack, in turn, comprises layers of software, such as a session layer, a transport layer and a network layer. The Internet protocol (IP) is a network layer protocol that provides network addressing between nodes, whereas the transport layer provides a port service that identifies each process executing on the nodes and creates a connection between those processes that indicate a willingness to communicate. Examples of transport layer protocols include the Transmission Control Protocol (TCP) and other reliable connection protocols.
Broadly stated, the connection provided by the transport layer, such as that provided by TCP, is a reliable, securable logical circuit between pairs of processes. A TCP process executing on each node establishes the TCP connection in accordance with a conventional “3-way handshake” arrangement involving the exchange of TCP message or segment data structures. The resulting TCP connection is identified by port numbers and IP addresses of the nodes. The TCP transport service provides reliable delivery of a message using a TCP transport header. The TCP protocol and establishment of a TCP connection are described in Computer Networks, 3rd Edition, particularly at pgs. 521-542, which is hereby incorporated by reference as though fully set forth herein.
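By way of illustration, the following sketch opens a TCP connection between two endpoints on the loopback interface and shows that both ends identify the connection by the same pair of IP addresses and port numbers. The loopback address and the operating-system-chosen port are assumptions made so that the example is self-contained; the 3-way handshake itself is performed by the operating system during the connect call.

```python
import socket

# Passive side: bind to the loopback address and let the OS choose a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

# Active side: connecting triggers the handshake with the listener.
client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

# Each end sees the connection as (client address/port, server address/port).
client_view = (client.getsockname(), client.getpeername())
server_view = (server_side.getpeername(), server_side.getsockname())

# Data sent on one end is delivered to the other end of the connection.
client.sendall(b"ping")
received = server_side.recv(4)

for sock in (client, server_side, listener):
    sock.close()
```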
The session layer manages the establishment or binding of an association between two communicating processes in the nodes. In this context, the association is a session comprising a series of interactions between the two communicating processes for a period of time, e.g., during the span of a connection. Upon establishment of the one or more connections, the processes take turns exchanging information, such as commands and data, over the session, typically through the use of request and response messages in accordance with a network protocol.
The storage system may be configured to operate with a plurality of file-level protocols, such as the Common Internet File System (CIFS) and the Network File System (NFS) protocols to thereby enhance the utility of the system for networking clients. As such, the storage system is typically configured with a CIFS server and/or an NFS server. The NFS protocol is typically utilized by Unix-based clients to access data sets served by the NFS server, whereas the CIFS protocol is typically associated with Microsoft Windows-based clients serviced by the CIFS server. NFS and CIFS utilize one or more authentication techniques for identifying access limitations to a particular data set, such as a file.
Specifically, the NFS protocol utilizes a conventional network information services (NIS) set of attributes. As such, the terms NFS attributes and NIS attributes shall be used interchangeably herein; however, it is understood that NIS encompasses more than just NFS. NFS utilizes a user identifier (UID) and a primary group identifier (GID) for authentication. To that end, the UID and GIDs are sent from the client to the NFS server in a conventional NFS credential with every NFS operation containing a data access request. The NFS server compares the received UID and/or GID with permissions associated with a particular file. The NFS server does not perform any additional authentication, but simply accepts the UID/GID that is asserted by the client when sending the data access request. In an exemplary NFS environment, the permissions associated with a file are stored as mode bits, which are divided into three fields, namely the permissions associated with the owner, with the group, and with others. Each of the three fields contains three bits, one for read access, one for write access, and one for execute permission. NFS mode bits for permissions are further described in Request for Comments 1094: Network File System Protocol Specification, by Bill Nowicki, March 1989, the contents of which are hereby incorporated by reference.
Additionally, one technique for improving the authentication of NFS requests is the use of NFS-Kerberos. In a conventional NFS-Kerberos implementation, the client transmits a conventional Kerberos ticket to the NFS server of the storage system to assert its name, and the storage system constructs an appropriate file system credential from the asserted Kerberos ticket. (Notably, all clients communicating with the NFS server in this manner must support NFS-Kerberos as a Kerberos ticket is inserted into each NFS request sent to the server.)
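The mode-bit comparison described above can be sketched as follows. The function name `mode_bits_allow` is hypothetical, and the selection order (owner field first, then group, then others) follows conventional Unix semantics, which is an assumption here; special handling of a superuser is omitted.

```python
READ, WRITE, EXECUTE = 4, 2, 1  # the three bits within each field


def mode_bits_allow(mode, file_uid, file_gid, cred_uid, cred_gids, want):
    """Return True if the asserted UID/GIDs are granted every bit in `want`.

    The 9 mode bits are laid out as owner (bits 8-6), group (bits 5-3),
    and others (bits 2-0). Exactly one field is consulted per request.
    """
    if cred_uid == file_uid:
        field = (mode >> 6) & 0o7   # owner field
    elif file_gid in cred_gids:
        field = (mode >> 3) & 0o7   # group field
    else:
        field = mode & 0o7          # others field
    return (field & want) == want


# Hypothetical file: mode 0o640, owned by UID 1000 and GID 50.
owner_can_write = mode_bits_allow(0o640, 1000, 50, 1000, [50], WRITE)
group_can_read = mode_bits_allow(0o640, 1000, 50, 2000, [50], READ)
group_can_write = mode_bits_allow(0o640, 1000, 50, 2000, [50], WRITE)
other_can_read = mode_bits_allow(0o640, 1000, 50, 3000, [70], READ)
```

Note that the check operates only on the identifiers asserted in the request; consistent with the description above, nothing in it verifies that the client is entitled to assert them.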
The CIFS protocol does not trust the client to transmit the correct credentials with a data access request. In a CIFS environment, user identifiers are not UIDs as utilized by NFS but comprise security identifiers (SIDs), which are unique on a worldwide basis. One or more identification authorities authenticate a given SID, as described further below. When a CIFS command arrives at the CIFS server, the credential is compared with an access control list (ACL). An ACL consists of zero or more access control entries (ACE). Each ACE consists of a SID, which identifies the person or group to which the ACE applies, and a set of permissions, which can identify access allowance or denial. Thus, an ACE may identify a particular SID and denote that access is not to be granted to the person(s) identified by that SID.
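A minimal sketch of comparing a credential against an ACL is given below. The `ACE` class, the `check_access` function, the in-order evaluation rule, and the SID strings are all assumptions made for illustration; they are not the CIFS server's actual data structures or algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ACE:
    sid: str                # person or group to which the entry applies
    allow: bool             # True grants the permissions, False denies them
    permissions: frozenset  # e.g. frozenset({"read", "write"})


def check_access(acl, credential_sids, requested):
    """Walk the ACEs in order; deny on a matching deny entry, grant once
    every requested permission has been allowed. An ACL with zero ACEs
    grants nothing in this sketch."""
    needed = set(requested)
    for ace in acl:
        if ace.sid not in credential_sids:
            continue                      # entry does not apply to requester
        if ace.allow:
            needed -= ace.permissions     # these permissions are now granted
        elif ace.permissions & needed:
            return False                  # an explicit denial applies
        if not needed:
            return True
    return False


# Hypothetical SIDs: two users who both belong to the same group.
USER_A, USER_B, GROUP = "S-1-5-21-1-1001", "S-1-5-21-1-1002", "S-1-5-21-1-513"
acl = [
    ACE(USER_A, False, frozenset({"write"})),         # deny write to user A
    ACE(GROUP, True, frozenset({"read", "write"})),   # allow group read/write
]
a_read = check_access(acl, {USER_A, GROUP}, {"read"})
a_write = check_access(acl, {USER_A, GROUP}, {"write"})
b_write = check_access(acl, {USER_B, GROUP}, {"write"})
empty_acl = check_access([], {USER_B, GROUP}, {"read"})
```

Here the deny entry prevents user A from writing even though A's group is allowed to write, which mirrors the case in which an ACE denotes that access is not to be granted to a particular SID.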
Credentials, generally, are well understood by those skilled in the art. Broadly stated, a credential is information that identifies an authenticated user or machine (“requesters”). For instance, an authenticating device may receive a request from a user (e.g., a client device) to access particular data in storage. The authenticating device authenticates the requester's identity and associates a corresponding credential with the requester. Notably, the credential may be stored locally to the authenticating device or in one or more external credential databases (e.g., Lightweight Directory Access Protocol or “LDAP” servers, etc.). Once the requester is authenticated, the request may be passed to a data access device (e.g., responsible for communicating with the data in storage) along with the corresponding credential, which is the authenticated requester identity used to process the request. (A passed credential typically contains the identity of a single authenticated user for a single domain and domain type, as will be understood by those skilled in the art.)
As noted, there are different styles of credentials based on the particular operating environment(s) used. For instance, CIFS and NFS protocols (e.g., Windows and Unix, respectively) may each utilize a particular style of credential unique to their respective environments. Generally, credentials are variable in length, depending upon the relevant information stored therein. Sometimes, credentials may be large and complicated (e.g., especially very large CIFS credentials) and as such, a large amount of bandwidth may be required to (inefficiently) transmit the credentials between the authenticating device and the data access device. In addition, it may be particularly burdensome on processing resources (e.g., CPU, memory, etc.) at both devices to marshal, transfer, and unmarshal the sometimes large and complicated credentials. That is, to exchange the credentials between the devices, the credentials may need to be generated/collected and manipulated to comply with a transmission exchange protocol (e.g., packets) at a first device, and then received and re-manipulated (e.g., back to an original credential form) by a second device, as will be appreciated by those skilled in the art.
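The marshal, transfer, and unmarshal steps described above, and the way the transmitted size grows with the credential, can be sketched as follows. The wire layout used here (a 32-bit UID, a 32-bit count, then one 32-bit value per group identifier, all in network byte order) is a hypothetical format chosen for illustration and is not the encoding used by NFS or CIFS.

```python
import struct


def marshal_credential(uid, gids):
    """Flatten a credential into bytes suitable for placing in a message."""
    return struct.pack(f"!II{len(gids)}I", uid, len(gids), *gids)


def unmarshal_credential(payload):
    """Rebuild the credential from the received bytes."""
    uid, count = struct.unpack_from("!II", payload, 0)
    gids = list(struct.unpack_from(f"!{count}I", payload, 8))
    return uid, gids


# A small credential versus one carrying 500 group memberships.
small = marshal_credential(1000, [100])
large = marshal_credential(1000, list(range(500)))
small_size, large_size = len(small), len(large)

# The receiving device must reverse the process to recover the original form.
round_trip = unmarshal_credential(large)
```

Under this layout the small credential occupies 12 bytes while the large one occupies 2008 bytes, and both devices spend work proportional to the credential size on every request that carries it.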