A file server is a computer that provides file service relating to the organization of information on writeable persistent storage devices, such as memories, tapes or disks. The file server or filer may be embodied as a storage system including a storage operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on, e.g., the disks. Each “on-disk” file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the file. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories is stored.
A storage system may be further configured to operate according to a client/server model of information delivery to thereby allow many clients to access an application service executed by a server, such as a file server. In this model, the client may comprise an application executing on a computer that “connects” to the file server over a computer network, such as a point-to-point link, shared local area network, wide area network or virtual private network implemented over a public network, such as the Internet. Each client may request the services of the file system on the file server by issuing file system protocol messages (in the form of packets) to the server over the network. It should be noted, however, that the file server may alternatively be configured to operate as an assembly of storage devices that is directly-attached to a (e.g., client or “host”) computer. Here, a user may request the services of the file system to access (i.e., read and/or write) data from/to the storage devices.
One type of file system is a write-anywhere file system that does not overwrite data on disks. If a data block on disk is retrieved (read) from disk into memory and “dirtied” with new data, the data block is stored (written) to a new location on disk to thereby optimize write performance. A write-anywhere file system may initially assume an optimal layout such that the data is substantially contiguously arranged on disks. The optimal disk layout results in efficient access operations, particularly for sequential read operations, directed to the disks. An example of a write-anywhere file system that is configured to operate on a storage system, such as a filer, is the Write Anywhere File Layout (WAFL™) file system available from Network Appliance, Inc., Sunnyvale, Calif. The WAFL file system is implemented as a microkernel within an overall protocol stack of the filer and associated disk storage.
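By way of illustration only, the write-anywhere behavior described above may be sketched as follows. The class WriteAnywhereStore, its block map and its method names are hypothetical and are not drawn from the WAFL implementation; the sketch assumes an append-only list standing in for disk locations.

```python
class WriteAnywhereStore:
    """Toy block store that never overwrites a block in place (hypothetical)."""

    def __init__(self):
        self.blocks = []      # simulated disk: append-only list of block contents
        self.block_map = {}   # (file name, block index) -> disk location

    def write(self, name, index, data):
        # A dirtied block always goes to a new location; the old copy is untouched.
        location = len(self.blocks)
        self.blocks.append(data)
        self.block_map[(name, index)] = location
        return location

    def read(self, name, index):
        # Follow the block map to the most recently written location.
        return self.blocks[self.block_map[(name, index)]]


store = WriteAnywhereStore()
first = store.write("report", 0, b"old")
second = store.write("report", 0, b"new")   # same logical block, new disk location
```

In this sketch the second write lands at a different location than the first, and the earlier contents remain on the simulated disk until reclaimed by some mechanism not shown here.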
The disk storage is typically implemented as one or more storage “volumes” that comprise a cluster of physical storage devices (disks), defining an overall logical arrangement of disk space. Each volume is generally associated with its own file system. In the WAFL file system, a special directory, called a “qtree”, may be created that has the properties of a logical sub-volume within the namespace of a physical volume. Each file system object (file or directory) is associated with one and only one qtree, and quotas, security properties and other items can be assigned on a per-qtree basis. Each volume has its own file system identifier (ID) and each qtree within a volume has its own qtree ID.
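The relationship among volumes, qtrees and file system objects may be sketched as shown below. This is a minimal model under stated assumptions: the names Volume, Qtree, fsid, quota_bytes and place are hypothetical, and a byte quota stands in for the various items that can be assigned on a per-qtree basis.

```python
from dataclasses import dataclass, field


@dataclass
class Qtree:
    qtree_id: int
    quota_bytes: int                              # hypothetical per-qtree quota
    objects: set = field(default_factory=set)     # paths of files and directories


@dataclass
class Volume:
    fsid: int                                     # file system ID of the volume
    qtrees: dict = field(default_factory=dict)    # qtree ID -> Qtree
    owner: dict = field(default_factory=dict)     # object path -> qtree ID

    def add_qtree(self, qtree_id, quota_bytes):
        self.qtrees[qtree_id] = Qtree(qtree_id, quota_bytes)

    def place(self, path, qtree_id):
        # Each file system object is associated with one and only one qtree.
        if path in self.owner:
            raise ValueError(f"{path} already belongs to qtree {self.owner[path]}")
        self.qtrees[qtree_id].objects.add(path)
        self.owner[path] = qtree_id


vol = Volume(fsid=7)
vol.add_qtree(1, quota_bytes=10 * 2**30)
vol.add_qtree(2, quota_bytes=5 * 2**30)
vol.place("/proj1/a.txt", 1)
```

An attempt to place the same path into a second qtree raises an error in this sketch, reflecting the one-qtree-per-object property.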
A filer typically includes a large amount of storage (e.g., 6 terabytes) with the ability to support many (thousands of) users. This type of storage system is generally too large and expensive for many applications or “purposes”. Even a typical minimum storage size of a volume (or file system) is approximately 150 gigabytes (GB), which is still generally too large for most purposes.
Rather than utilizing a single filer, a user may purchase a plurality of smaller servers, wherein each server is directed to accommodating a particular purpose of the user. However, the storage granularity issue remains since, as noted, the typical minimum storage size of a volume is approximately 150 GB. In addition, the acquisition of many smaller servers may be more costly than the purchase of a single filer. Furthermore, the cost of maintenance and administration of many smaller servers is typically substantially more than the cost of maintaining and administering a single filer. Therefore, it would be desirable to consolidate many servers within a single filer platform in a manner that logically embodies those servers. Server consolidation is thus defined as the ability to provide multiple logical or virtual servers within a single physical server platform. Examples of virtual servers that may be embodied within a single platform are web servers, database servers, mail servers and name servers.
Prior server consolidation solutions provide many independent servers that are essentially “racked together” within a single platform. An example of this solution is the Celerra™ architecture available from EMC® Corporation. The Celerra architecture utilizes the notion of a data mover or an instance of a file server that “front ends” a Symmetrix® storage device having storage resources that can be logically apportioned and assigned to various other data movers. Each data mover has its own set of networking hardware and its own processor running its own copy of the operating system.
Other server consolidation solutions are configured to run multiple instances of a server on a single physical platform. For example, the virtual machine (VM) operating system from IBM® Corporation executes on a mainframe computer system and enables execution of multiple instances of servers on that computer system. Each VM is a complete instantiation of an entire operating system having its own file system and storage resources. That is, there is no sharing of file system and storage resources among the VMs operating on the mainframe. Rather, the storage resources of the computer system are apportioned and dedicated to instances of VMs embodied on the computer.
Another example of a server consolidation solution adapted to run multiple instances of a server process is Samba. Samba is an open-source Windows-compatible server running on a UNIX operating system platform. The Samba server is implemented at the application layer of the system to provide multiple instances of Windows-compatible servers running on the UNIX platform. Samba is an example of a more general “server process replication” based approach for implementing multiple servers on a single physical platform.
Finally, the simplest method of implementing server consolidation is to rename all service (i.e., storage for file servers) units being consolidated such that they are unique among themselves and to configure one instance of the consolidated server that serves all of the service (storage) units. This technique is hereinafter referred to as “server bundling”.
Process-based and server bundling-based server consolidation solutions typically do not provide different security characteristics for each of the server instances running on the platform. For example, Samba has only one set of security characteristics (i.e., a security domain) pertaining to non-replicated parts of its platform infrastructure for all of its instances of Windows-compatible (e.g., NT) servers. An example of a security domain is the Windows NT™ domain security. The NT domain security is an abstraction whereby all users and servers of a domain share the same security information, which may include authorized user and group identifiers (“security objects”) and resources of an operating system.
Broadly stated, the security information is maintained in a security account manager (SAM) database by a Windows NT protected subsystem. The SAM database is equivalent to a combination of /etc/passwd and /etc/group databases on a UNIX® server in that it contains only users and groups local to that server. The notation /etc/passwd, /etc/group denotes a configuration directory indicating a path to a password file and a group file used on the UNIX platform. A user with an account in a particular security domain can log onto and access his or her account from any server in the domain. Typically, a server can be in only one security domain (for each access protocol) at a time. A multi-protocol server can therefore be in at most one security domain per access protocol. For example, a multi-protocol filer may be in an NT4 domain for Common Internet File System (CIFS) access and in a Network Information System (NIS) domain for Network File System (NFS) access. As mentioned above, UNIX environments also have security objects that may be different for different servers.
An aspect of Windows™ networking is the notion of a uniform naming convention (UNC) path that defines a way for a user to refer to a unit of storage on a server. A UNC path is prefixed with the string \\ to indicate resource names on a network. For example, a UNC path may comprise a server name, a share name and a path descriptor that collectively reference a unit of storage, such as a share. A share is a shared storage resource, such as a directory on a file system.
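The structure of a UNC path may be illustrated with a short parsing sketch. The function name parse_unc and the sample path are hypothetical; the sketch assumes only the layout described above, namely a leading \\ followed by a server name, a share name and an optional path descriptor.

```python
def parse_unc(path):
    """Split a UNC path into (server, share, remainder)."""
    if not path.startswith("\\\\"):
        raise ValueError("a UNC path must begin with two backslashes")
    parts = path[2:].split("\\")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("a UNC path needs a server name and a share name")
    server, share = parts[0], parts[1]
    remainder = "\\".join(parts[2:])   # path descriptor below the share, may be empty
    return server, share, remainder


# Hypothetical path used only for illustration.
server, share, remainder = parse_unc(r"\\server1\share1\dir\file.txt")
```

Because the server name is the first component of every such path, any change to the server name changes every UNC path that refers to storage on that server.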
In an environment having multiple independent servers used to accommodate multiple security domains, all shares (and share names) must be distinct. However, if those servers are consolidated onto a single platform (by way of server process replication, such as Samba, or server bundling) using alias names for the server, the server names of the UNC paths to the shares may require change, since all share resources would be visible when accessed via any of the alias server names. In other words, although each server may have its own set of shares, users may have to change the UNC path definitions of those shares in order to access their data.
For example, assume that three (3) NT servers are organized such that a first NT server (NT1) and its clients are associated with security domain 1, while second and third NT servers (NT2, NT3) and their clients are associated with security domain 2. Thus, the clients (users) of NT2 and NT3 share the same security database within security domain 2. If a user of domain 2 refers to a particular share (proj1) on NT3, the user first attaches to security domain 2 in order to authenticate and then attaches to the server and share on that server. An illustrative example of the notation of the UNC path specified by the user is: \\NT3\data\proj1
Assume now that the servers NT1-3 and their attached resources are consolidated onto one server platform. This raises the issue as to how the clients of a particular domain refer to their data. That is, if each client has a share called proj1 and all of the shares are consolidated within a single platform, there must be a way to differentiate those shares. Migration and reconfiguration of data on storage resources of a server is typically not transparent to a client system; in fact, this issue is often resolved by requiring the client system to change its path descriptor to its data. Such an approach is generally undesirable because it requires action on behalf of the client system. Moreover, if the storage resources are consolidated on one or more servers, it is possible that those servers may not be in the same security domain as the users accessing those resources.
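The naming conflict that arises on consolidation may be sketched as follows. The function find_share_collisions and the share names other than proj1 are hypothetical; the sketch simply assumes that each server exports a list of share names and reports the names that would clash if all shares were visible through a single consolidated server.

```python
from collections import defaultdict


def find_share_collisions(servers):
    """Return share names that are exported by more than one server."""
    owners = defaultdict(list)
    for server, shares in servers.items():
        for share in shares:
            owners[share].append(server)
    return {share: names for share, names in owners.items() if len(names) > 1}


# Hypothetical share tables for the three servers in the example above.
servers = {
    "NT1": ["proj1", "home"],
    "NT2": ["proj1"],
    "NT3": ["proj1", "builds"],
}
collisions = find_share_collisions(servers)
```

Here proj1 is reported as exported by all three servers, so a consolidation that exposes every share under every alias server name would require at least two of those shares to be renamed, with corresponding changes on the clients.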
The VM and data mover server consolidation techniques, which are based on partitioning of physical system resources and replication of operating system objects, do not suffer from the two above-mentioned limitations of not being able to consolidate servers from multiple security domains or requiring service (storage) unit renaming. However, these techniques do have other limitations. The operating systems of such server consolidation solutions typically contain software objects that are representative of physical hardware devices, such as network interface cards or well-defined collections of hardware devices. An example of a software object representative of a collection of hardware devices is a file system that allocates space on a set of storage disks available for use by applications running on the operating system.
In a VM or data mover type of system, a static assignment of these hardware representative software objects is typically performed to associate a subset of the hardware resources of the system with a single instance of a VM/data mover (virtual server). The assignment of resources to each virtual server results in hardware resources being directly associated with the specific virtual server, because the software objects that represent these hardware resources are directly assigned to the virtual servers. This direct assignment approach restricts resource management flexibility, particularly with respect to “logical” reassignment of resources among the virtual servers. This also hinders evolutionary growth of the server consolidation implementation because all partitioning and sizing decisions must be made at the time of consolidation and cannot be easily changed later as the consolidated servers evolve.
Server consolidation is particularly useful in the case of a storage server provider (SSP). An SSP serves (“hosts”) data storage applications for multiple users or clients within a single, physical platform or “data center”. The data center is centrally maintained by the SSP to provide safe, reliable storage service to the clients. In a typical configuration, the data center may be coupled to a plurality of different client environments, each having an independent private internal network (“intranet”). Each intranet may be associated with a different client or division of a client and, thus, the data traffic must be separately maintained within the physical platform.
Therefore, the present invention is directed to an architecture that enables instantiation of multiple, secure virtual servers within a single physical filer platform that allows flexible partitioning of physical resources.
The present invention is also directed to an architecture that enables instantiation of multiple, secure virtual servers within a single physical filer platform that is suitable for use in an SSP environment and that provides a high level of security for users.
The present invention is further directed to an architecture that enables encapsulation of a server, including its data and all its configuration information, stored in a form that allows efficient migration and re-instantiation of the server on a different physical platform for use in disaster recovery, data migration or load-balancing.