It is common nowadays for server and storage systems to be provided remotely on cloud-based systems, i.e. “in the cloud”. Cloud-based systems provide a number of advantages. For example, they enable an organisation to have effectively unlimited storage capacity. They also enable an organisation to avoid or reduce concern regarding updates to systems and the like, since the cloud administrator handles these matters. They also allow users to access the systems independently of their physical locations, such that an employee, for example, could access applications and data in the cloud from wherever they were in the physical world.
It is also becoming more common that servers and storage systems are provided “virtually”. Providing a machine virtually means providing a software implementation of a machine (i.e. a computer) that can execute programs like a physical machine. A virtual storage system is a virtually provided storage system. In such a virtual storage system, the definition of virtual storage devices (sometimes referred to as “virtual drives”) and controllers is provided in software and has the effect that to the virtual storage system and the world outside, the virtual storage system appears as a conventional physical storage system. Of course, the virtual storage system is provided on one or more physical machines, but is defined independently of the physical machine.
A virtual file system is also provided, which, as is known, is a means to organize data expected to be retained after a program terminates. The file system can provide procedures to store, retrieve and update data as well as manage the available space on the device(s) which contain it. Thus, a file system organizes data in an efficient manner and can be tuned to the specific characteristics of the device.
Although the advantages of cloud-based virtual storage systems are clear, in practice the process by which changes are made to storage capacity in cloud-based virtual storage systems can be cumbersome and has a number of disadvantages.
Indeed, some problems can occur when an end-user of a cloud-based service wishes to increase (or decrease) the available virtual-disk space on a group of their machines (for example serving a cloud database application, e.g. Cassandra BigTable data). Ideally the user is able to select the group of machines on some web interface, select the virtual-drives to be changed, and specify how the virtual machines are to be changed. For example, the change could be a request to a virtual storage system service provider to increase the storage capacity to at least 250 GB in each virtual machine. Some time later the user will receive confirmation that this has been done. Ideally there is no interruption to the running services while this is done, but there can be significant delay between the time when a user inputs a request to the service provider to indicate a desire to change the storage capacity of the virtual system, and the time when the service provider acts on the request to implement the change.
Typically within a virtual server there will be provided the components associated with a conventional physical server, but provided virtually. For example, there will be virtual processors, virtual memory, virtual data connectivity and busses etc. An operating system is configured to run on the virtual server and a wrap-around program, often referred to as a “hypervisor”, is also provided at a layer above the virtual operating system. The hypervisor serves to control access to the hardware and memory of the physical storage system or server on which the virtual machine is configured.
Currently the cloud service provider's responsibilities end at the hypervisor. Changing the capacity of a virtual-drive is often not possible because it depends on the file system in use in the virtual machine. To extend the capacity of an existing virtual-drive each virtual-machine would need to be stopped, the virtual-disk-image on the physical-drives grown, i.e. expanded to the required or desired size, and then the file system in the virtual-disk-image grown into the new space. Finally the virtual-machine can be started again.
It is not possible to perform these stages in parallel, i.e. to pipeline this process, as the data on the virtual-disks may not change while it is being copied or grown, so any application using the data would need to be stopped. The cloud service provider may be able to do this on behalf of the customer, but they would require full ‘super user’ access to the virtual machine.
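The strictly sequential offline procedure described above can be sketched as follows. This is only an illustrative model: the `VirtualMachine` class, its fields and the `grow_offline` function are hypothetical names introduced here, and no real hypervisor interface is called.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VirtualMachine:
    """Hypothetical model of a virtual machine with a single virtual drive."""
    name: str
    image_gb: int        # size of the virtual-disk-image on the physical drives
    filesystem_gb: int   # size of the file system inside the image
    running: bool = True
    log: List[str] = field(default_factory=list)


def grow_offline(vm: VirtualMachine, target_gb: int) -> None:
    """Grow a virtual drive one stage at a time; the stages cannot overlap."""
    if target_gb <= vm.image_gb:
        raise ValueError("target must exceed the current image size")
    # 1. Stop the machine so the data cannot change while the image is grown.
    vm.running = False
    vm.log.append("stop")
    # 2. Grow the virtual-disk-image to the required size.
    vm.image_gb = target_gb
    vm.log.append("grow image")
    # 3. Grow the file system into the new space.
    vm.filesystem_gb = vm.image_gb
    vm.log.append("grow file system")
    # 4. Start the machine again.
    vm.running = True
    vm.log.append("start")
```

In this model the application is unavailable from the first stage until the last, which is the interruption the procedure imposes on the user.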
Accordingly, with these problems in mind, to grow virtual-machine file systems today an alternative approach is to add a new drive of the required new capacity, and at an opportune moment, stop the application and copy all the data to the new, larger, virtual drive before starting the application again. When the virtual-machine can next be restarted, the old drive is removed. Storage is a shared resource, in that the storage may be shared between plural virtual storage systems on behalf of plural end users. The storage belongs to the service provider, and in effect is rented out to one or more users. A user will therefore be paying for double the capacity for the time period when the new drive has been added and the old one not yet removed. In other words, until the old virtual drive can be decommissioned, the user will have to pay for its capacity even though it is no longer needed as it has been superseded by the new larger virtual drive.
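The capacity a user pays for during such a migration can be illustrated with a small calculation. The function name and the 100 GB starting size are assumptions made for illustration; the 250 GB target is taken from the earlier example.

```python
def billed_capacity_gb(old_gb: int, new_gb: int, old_drive_removed: bool) -> int:
    """Capacity billed to the user during a copy-to-larger-drive migration."""
    if old_drive_removed:
        return new_gb
    # Both drives remain attached, so both are billed.
    return old_gb + new_gb


print(billed_capacity_gb(100, 250, old_drive_removed=False))  # 350
print(billed_capacity_gb(100, 250, old_drive_removed=True))   # 250
```

Until the old drive is removed, the user in this example is billed for 350 GB although only 250 GB is in use.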
Some systems have attempted to address this problem. For example, Ext3, a commonly used file system for Linux, has experimental support for ‘online growth’, but not for online shrinking. Restrictions in the hypervisor and virtual operating system still mean that a virtual-drive would have to be removed and re-added for its new size to be discovered by the virtualised operating system. Furthermore, there is no scope for shrinking the file system while it is running.
In some architectures, the cloud services provider has no access to the virtual-machines running on their infrastructure, so any maintenance of the machine contents has to be done by the end-user.
Modern file systems such as Sun's ZFS and Linux's BTRFS have features that allow the file system to dynamically grow and shrink as drives are added and removed. These file systems often allow parts of the file system to be mirrored on different drives (virtual or physical), similar to RAID.
Linux's BTRFS is not yet mature, and ZFS is only properly supported on Solaris, so it is not yet common to use these file systems in a production environment. Owing to these factors, such use is not recommended for any system where the data is important. In the next few years Linux's BTRFS will mature. Accordingly, it is expected that in time Linux's support for functions such as snapshotting, quotas and sub-volume management will be brought to systems that use BTRFS.
Accordingly, although it may at some point in the future be possible to grow and shrink file systems while applications are running, once these features are available in cloud virtual-machines, this is not currently possible. Furthermore, even when it does become possible, some or all of the problems mentioned above regarding user interaction with a virtual storage system will still be encountered.
Extending the file-system of a running machine, as per the above use case, can be done by calculating how much extra space is required, attaching a new virtual-drive of the correct size, and extending the BTRFS file system on to it.
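The calculation and extension steps above might be planned as in the following sketch. The function name, the device path `/dev/vdc` and the mount point `/data` are hypothetical. The sketch only builds the `btrfs device add` command and does not run it, and the hypervisor-specific step of creating and attaching the new virtual-drive is deliberately left out.

```python
from typing import List, Tuple


def plan_extension(current_gb: int, required_gb: int,
                   new_device: str, mount_point: str) -> Tuple[int, List[List[str]]]:
    """Return the size of the new virtual-drive and the commands to extend onto it."""
    extra_gb = required_gb - current_gb
    if extra_gb <= 0:
        # The file system is already large enough; nothing to attach.
        return 0, []
    # After a drive of extra_gb has been attached as new_device (not shown),
    # the running BTRFS file system is extended onto it.
    commands = [["btrfs", "device", "add", new_device, mount_point]]
    return extra_gb, commands


extra, commands = plan_extension(100, 250, "/dev/vdc", "/data")
print(extra)     # 150
print(commands)  # [['btrfs', 'device', 'add', '/dev/vdc', '/data']]
```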
The process can also be done in reverse, with the file-system shrunk, and a redundant virtual-drive removed. Data can be compacted, with a new virtual-drive added, and the file system ‘shrunk’ off all the existing virtual-drives. Once redundant virtual-drives are no longer needed they can be removed from the virtual-machine.
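The reverse process, in which one smaller virtual-drive is added and the file system is shrunk off the existing drives, might be planned as follows. This is again a sketch under assumptions: the function name, device paths and mount point are hypothetical, and the commands are only constructed, not executed.

```python
from typing import List


def plan_shrink(existing_devices: List[str], used_gb: int, target_gb: int,
                new_device: str, mount_point: str) -> List[List[str]]:
    """Plan moving a BTRFS file system onto one new drive of target_gb."""
    if target_gb < used_gb:
        raise ValueError("target capacity is smaller than the data in use")
    # Add the new, smaller virtual-drive first so the data has somewhere to go.
    commands = [["btrfs", "device", "add", new_device, mount_point]]
    # Removing each old device migrates its data onto the remaining devices.
    for device in existing_devices:
        commands.append(["btrfs", "device", "remove", device, mount_point])
    return commands


for command in plan_shrink(["/dev/vdb", "/dev/vdc"], 80, 120, "/dev/vdd", "/data"):
    print(" ".join(command))
```

Once the remove commands have completed, the old virtual-drives hold no data and can be detached from the virtual-machine.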
FIG. 1 shows a schematic representation of a virtual storage system 4 provided in the cloud. This is an example of Infrastructure as a Service (IaaS), which is a well-known cloud computing model. The system 4 is provided on a virtual server and comprises plural virtual disks 6 in a number and size sufficient to provide the specified capacity of the system 4. A virtual file system (not shown) is provided on the plural virtual disks 6. The virtual storage system 4 is provided in the cloud 8, shown schematically. Typically this will be provided in a network such as the internet 10, with the precise physical location of the server on which the virtual disks are configured being largely irrelevant. It will be appreciated that, although shown schematically as an actual box 4 forming the virtual server, the server itself could be formed as one server amongst many on a larger physical server.
There is hardware 16 that corresponds to the virtual server in that the virtual disks must physically be somewhere, but the actual hardware that forms the disks is irrelevant. In some cases for example, the virtual disk drives 6 forming a single virtual storage system 4 may be distributed on different physical servers 16 or storage systems. Logically though, they form the same virtual storage system. In the example of FIG. 1, the virtual storage system 4 is provided in a distributed manner in that physical storage from each of two physical storage systems 16 is used.
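The relationship between the logical virtual storage system and the physical systems that back it can be modelled as in the sketch below. The class, the function and the host names are hypothetical and the sizes are illustrative only; the model simply shows one logical system drawing capacity from two physical storage systems.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class VirtualDisk:
    """Hypothetical record of a virtual disk and the physical system holding it."""
    name: str
    size_gb: int
    physical_host: str


def capacity_by_host(disks: List[VirtualDisk]) -> Dict[str, int]:
    """Sum the capacity each physical system contributes to the virtual system."""
    totals: Dict[str, int] = {}
    for disk in disks:
        totals[disk.physical_host] = totals.get(disk.physical_host, 0) + disk.size_gb
    return totals


disks = [
    VirtualDisk("vd0", 100, "storage-a"),
    VirtualDisk("vd1", 100, "storage-a"),
    VirtualDisk("vd2", 50, "storage-b"),
]
print(capacity_by_host(disks))               # {'storage-a': 200, 'storage-b': 50}
print(sum(disk.size_gb for disk in disks))   # 250
```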
A hypervisor 21 is provided. As explained above, the hypervisor 21 serves to control access to the hardware and memory of the physical storage system or server(s) 16 on which the virtual storage system 4 is configured.
A user server 12 is connected to the internet 10 via some means of communication 14, which could be any known type of connection. For example, an IP connection could be used, which could be wired or wireless. Through interaction with the user server 12, a user is able to connect to the virtual storage system 4 and exercise control over it. As explained above, if the user decides that additional storage capacity is required on the virtual storage system 4, problems can arise. Conversely, if the user decides that the capacity of the virtual storage system 4 is too great and therefore wants to reduce it, this can also cause problems.