A storage system is a computer that provides storage service relating to the organization of information on writable persistent storage devices, such as memories, tapes or disks. The storage system is commonly deployed within a storage area network (SAN) or a network attached storage (NAS) environment. When used within a NAS environment, the storage system may be embodied as a file server including an operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on, e.g. the disks. Each “on-disk” file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the file. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories are stored.
The storage system may be further configured to allow many server systems to access shared resources, such as files, stored on storage devices of the storage system. Sharing of files is a hallmark of a NAS system, which is enabled because of its semantic
level of access to files and file systems. Storage of information on a NAS system is typically deployed over a computer network comprising a geographically distributed collection of interconnected communication links, such as Ethernet, that allow servers to remotely access the information (files) on the storage system. The servers typically communicate with the storage system by exchanging discrete frames or packets of data according to pre-defined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
NAS systems generally utilize file-based access protocols; therefore, each server may request the services of the storage system by issuing file system protocol messages (in the form of packets) to the file system over the network identifying one or more files to be accessed without regard to specific locations, e.g., blocks, in which the data are stored on disk. By supporting a plurality of file system protocols, such as the conventional Common Internet File System (CIFS), the Network File System (NFS) and the Direct Access File System (DAFS) protocols, the utility of the storage system may be enhanced for networking servers.
A SAN is a high-speed network that enables establishment of direct connections between a storage system and its storage devices. The SAN may thus be viewed as an extension to a storage bus and, as such, an operating system of the storage system enables access to stored information using block-based access protocols over the “extended bus”. In this context, the extended bus is typically embodied as Fibre Channel (FC) or Ethernet media adapted to operate with block access protocols, such as Small Computer Systems Interface (SCSI) protocol encapsulation over FC or TCP/IP/Ethernet.
A SAN arrangement or deployment allows decoupling of storage from the storage system, such as an application server, and some level of information storage sharing at the application server level. There are, however, environments wherein a SAN is dedicated to a single server. In some SAN deployments, the information is organized in the form of databases, while in others a file-based organization is employed. Where the information is organized as files, the server requesting the information maintains file mappings and manages file semantics, while its requests (and server responses) address the information in terms of block addressing on disk using, e.g., a logical unit number (LUN). Some SAN arrangements utilize storage systems that implement virtual disks (vdisks), which are encapsulated data containers stored within a file system.
In multi-protocol storage systems that utilize both block-based and file-protocols, typically the block-based protocol utilizes a high-speed transport mechanism, such as Fibre Channel (FC) or InfiniBand (IB). Conversely, file-based protocol connections often utilize, for example, the NFS protocol operating over TCP/IP. The file-based systems typically include additional network overhead due to the nature of the file-based protocols, e.g., NFS or User Datagram Protocol (UDP), involved. This additional network overhead, from, for example, file mapping and management of file semantics, significantly reduces the data throughput available over the file-based protocol network connection.
Users typically desire the ease of use of a file-based protocol, especially the use of the file-based protocol namespace wherein the files are referenced through a conventional drive/volume/path/file name mechanism. In contrast, in a SAN or other block-based environment, data is accessed by reference to a set number of blocks spread among the disks storing the data for the data set, which imposes a greater administrative burden on a user for using SAN-based systems. However, a noted disadvantage of the use of the file-based protocols is the above-mentioned additional network overhead required for the use of such protocols. This additional network overhead makes the use of these file-based protocols impractical for certain high-performance and data-intensive transfer operations, such as database management systems (DBMS). Many users thus desire the ease of use of a file-based protocol namespace, while needing the high-speed data throughput available from a block-based protocol.
A virtual server environment may typically include multiple physical servers accessing the storage system having multiple storage devices for storing client data. Each server may include multiple virtual machines (VMs) that reside and execute on the server. Each VM (sometimes referred to as a virtual server or virtual desktop) may comprise a separate encapsulation or instance of a separate operating system and one or more applications that execute on the server. As such, each VM on a server may have its own operating system and set of applications and function as a self-contained package on the server and multiple operating systems may execute simultaneously on the server.
Each VM on a server may be configured to share the hardware resources of the server. Each server may include a VM monitor module/engine (sometimes referred to as a hypervisor module/engine) that executes on the server to produce and manage the VMs. The VM monitor module/engine (hypervisor) may also virtualize the hardware and/or software resources of the servers for use by the VMs. The operating system of each VM may utilize and communicate with the resources of the server via the VM monitor/hypervisor engine. The virtual server environment may also include a plurality of clients connected with each server for accessing client data stored on the storage system. Each client may connect and interface/interact with a particular VM of a server to access client data of the storage system. From the viewpoint of a client, the VM may comprise a virtual server that appears and behaves as an actual physical server or behaves as an actual desktop machine. For example, a single server may by “virtualized” into 1, 2, 4, 8, or more virtual servers or virtual desktops, each running their own operating systems, and each able to support one or more applications.
A storage system may be configured to allow servers to access its data, for example, to read or write data to the storage system. A server may execute an application that “connects” to the storage system over a computer network such as a shared local area network (LAN), a wide area network (WAN), or a virtual private network (VPN) implemented over a public network such as the Internet. The application may send an access request (read or write request) to the storage system for accessing particular data stored on the storage system. Each server may also include multiple VMs, each VM being used by and connected with a client through a computer network. Each VM may also execute an application for sending read/write requests (received from the connected client) for accessing data on the storage system. The VM applications executing on the server may service the connected clients by receiving the client access requests and submitting the access requests to the storage system for execution.
There are several advantages in implementing VMs on a server. Having multiple VMs on a single server enables multiple clients to use multiple different operating systems executing simultaneously on the single server. Also, multiple VMs executing their own applications may be logically separated and isolated within a server to avoid conflicts or interference between the applications of the different VMs. As each VM is separated and isolated from other VMs, a security issue or application crash in one VM does not affect the other VMs on the same server. Also, VMs can rapidly and seamlessly be shifted from one physical server to any other server, and optimally utilize the resources without affecting the applications. Such a virtualization of the servers, and/or virtualization of the storage network environment, allows for efficiency and performance gains to be realized.
Each VM may be represented by data that describes the VM (referred to herein as “VM data”). VM data for a VM may be used for later producing and deploying the VM on a server. VM data for multiple VMs need to be stored efficiently with minimal use of valuable storage resources. Also, the VM data should be stored in way that allows for fast deployment of the VMs on a server when needed.