The term “virtual” is used frequently in computer arts, usually followed by the word “machine,” “storage,” or “server.” In this context, “virtual” simply means that an element does not actually exist as a physical element, but rather as software that represents a physical element. Thus, virtual storage gives a client an identity for a storage location; the physical location where the client's data is stored is controlled through that virtual identity and is known to the virtual machine with which the client interfaces, but not to the client itself. A virtual storage system may include multiple physical storage devices available for use by a server system to store information specific to one or more client systems. Each server in the server system may comprise multiple virtual machines and each virtual machine may comprise a separate encapsulation or instance of an operating system and one or more applications that execute on the server. As such, each virtual machine may have its own operating system and set of applications, and may function as a self-contained package on a server. Each server (or each virtual machine on a server) may execute an application for sending read/write requests (received from a user via a client system) for accessing data stored on a physical device. For example, a virtual machine application executing on a server in the server system may provide data to a user by receiving the user's access requests, executing the requests, and accessing the storage system to retrieve the requested data. Physical servers or controllers in the storage system access data from physical storage and provide that data in accordance with the requests. Servers in the storage system, like those in the server system, may comprise virtual machines installed thereon.
In virtual storage systems, techniques and mechanisms that facilitate efficient and cost-effective storage of large amounts of digital data are common. For example, a network system of storage nodes may be implemented as a data storage system to facilitate the creation, storage, retrieval, and/or processing of digital data. Such a data storage system may be implemented using a variety of storage architectures, such as a redundant array of independent disks (RAID), a network-attached storage (NAS) system, a storage area network (SAN), a direct-attached storage system, and combinations thereof. These data storage systems may comprise one or more data storage devices configured to store digital data within data volumes. The storage may be used to create virtual drives that span across physical devices or storage nodes at one location or across a large geographic area. In such systems, virtual drives that span across one or more physical storage resources such as devices, drives, groups of drives, or volumes allow a client to access data from distinct physical storage spaces as if it were doing so from a single physical drive. From a client's point of view, the virtual storage presents as one or more physical storage drives, yet the client does not have a view of the actual physical device storing its data.
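By way of illustration only, the following Python sketch shows one simple way a virtual drive might present several physical storage regions as a single linear address space. The names `PhysicalExtent`, `VirtualDrive`, and `resolve`, the device identifiers, and the simple concatenation layout are all assumptions made for this example; they are not taken from any particular storage system, and real systems may use striping, mirroring, or other layouts.

```python
from bisect import bisect_right
from dataclasses import dataclass
from itertools import accumulate


@dataclass(frozen=True)
class PhysicalExtent:
    """A contiguous region on one physical resource (hypothetical model)."""
    device: str       # identifier of a drive or volume, assumed for illustration
    start_block: int  # first block of the region on that device
    num_blocks: int   # length of the region in blocks


class VirtualDrive:
    """Presents several physical extents as one linear block address space."""

    def __init__(self, extents):
        if not extents:
            raise ValueError("at least one extent is required")
        self.extents = list(extents)
        # Cumulative end offsets, e.g. extent sizes [100, 50] -> [100, 150].
        self._ends = list(accumulate(e.num_blocks for e in self.extents))

    @property
    def size(self):
        """Total number of blocks the client sees."""
        return self._ends[-1]

    def resolve(self, virtual_block):
        """Translate a virtual block number to (device, physical block)."""
        if not 0 <= virtual_block < self.size:
            raise IndexError("virtual block out of range")
        # Find the first extent whose cumulative end lies beyond the block.
        idx = bisect_right(self._ends, virtual_block)
        extent = self.extents[idx]
        extent_start = self._ends[idx] - extent.num_blocks
        return extent.device, extent.start_block + (virtual_block - extent_start)


# Hypothetical drive spanning two storage nodes.
drive = VirtualDrive([
    PhysicalExtent("node-a/disk0", start_block=1000, num_blocks=100),
    PhysicalExtent("node-b/disk3", start_block=0, num_blocks=50),
])
```

In this sketch the client addresses blocks 0 through 149 as if they belonged to a single drive, while `resolve` alone knows that block 100 actually resides on a different device than block 99.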
Typically, when storage is added to the system or when existing storage is relocated within the system, a client manually selects the location or storage space for the new storage. Consider the case where a client pays for a certain level of storage service from a provider. In that case, the client may initially select an amount of storage space and/or speed at which the purchased storage can be accessed. Afterward, the client could determine that it is willing to pay more money for access to additional and/or faster storage. In such a case, a new virtual drive may need to be created. The client examines storage allocated to the client (e.g., perhaps across one or more storage nodes available to the client), determines available physical storage resources, and creates the new virtual drive from available physical storage. However, the client does so without knowledge of how creating the new virtual drive will affect the system's efficiency, and perhaps the speed at which the client may access its stored data. For example, the client may determine that it has several physical storage resources in which to create a new virtual drive, each of which is accessed by a different machine, but the client does not know what else those physical resources are doing and thus cannot optimize overall performance.
According to known techniques, the client associates its new virtual drive with a storage controller irrespective of how its choice may impact performance. For example, associating the new virtual drive with a first available controller may create disparate user access patterns such that throughput is never maximized. On the other hand, associating the new virtual drive with a second available controller may enable high de-duplication, thereby saving disk space. In being forced to make an uninformed decision, the client is more likely to place the new virtual drive in a less than optimal location.
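By way of illustration only, the uninformed selection described above can be sketched in Python as follows. The `Controller` record, its fields, the controller names, and the numeric values are all hypothetical; in particular, the de-duplication ratio is assumed to be a figure the storage system could observe but the client cannot.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    """Hypothetical view of a storage controller."""
    name: str
    free_gb: int          # the only figure the client is assumed to see
    dedup_ratio: float    # logical-to-physical ratio, hidden from the client


def client_pick(controllers, needed_gb):
    """Uninformed choice: take the first controller with enough free space."""
    for controller in controllers:
        if controller.free_gb >= needed_gb:
            return controller
    return None


def physical_footprint_gb(controller, logical_gb):
    """Disk space actually consumed if the drive lands on this controller."""
    return logical_gb / controller.dedup_ratio


# Assumed example values: both controllers have room for a 200 GB drive.
controllers = [
    Controller("ctrl-1", free_gb=500, dedup_ratio=1.0),
    Controller("ctrl-2", free_gb=500, dedup_ratio=4.0),
]

chosen = client_pick(controllers, needed_gb=200)
```

Under these assumed values the client selects `ctrl-1` simply because it is listed first, consuming 200 GB of physical space, whereas the same drive placed on `ctrl-2` would consume 50 GB. The client has no basis for preferring one over the other because the distinguishing information is not visible to it.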