1. Field of the Invention
This invention relates to a method, system and computer program product for dynamically managing storage requirements of virtualized systems.
2. Description of the Related Art
Current trends in system and application development involve ever-increasing use of virtualized hardware. For example, servers are often virtualized, as are processors and operating systems. An example of a virtualized server is a Virtual Private Server (VPS, also called a Virtual Environment or VE), such as those supplied by SWsoft, Inc. Other examples of virtualized hardware involve Virtual Machines that virtualize a processor and/or an operating system running on the processor. Such Virtual Machines are marketed by, for example, Parallels Software International, Inc., Microsoft Corporation, and VMware, Inc.
One of the issues that needs to be addressed in such systems is the scalability of storage. Although the price of storage continues to fall, user requirements increase at least at the same rate. For example, a single virtual server or Virtual Machine may be allocated on the order of 10 Gigabytes of storage. 10 Gigabytes, by current (2006) standards, is a relatively manageable amount, since commonly available hard drives that store 100-200 Gigabytes can be purchased for a few hundred dollars, or even less.
However, the scalability of this brute force approach presents a problem. While one virtual server or Virtual Machine may require only 10 Gigabytes of storage, 100 such Virtual Machines or virtual servers would require one terabyte of storage. This capacity, while commercially available, is relatively expensive, costing on the order of $1,000.
There is therefore a need for storage device virtualization for Virtual Execution Environments (VEEs) of different kinds. Since Virtual Execution Environments use generic operating systems, standard procedures for accessing storage devices and for presenting the stored data are also required. At the same time, using storage areas that are dedicated to each environment requires redundant resources and complex technical solutions. Concurrent access to a storage device from different Virtual Execution Environments that are invisible to each other may corrupt the storage device's service structures, as well as its contents.
The simplest way to organize storage areas for Virtual Execution Environments is to assign a drive partition to each one. This consumes excess storage capacity and does not provide for sharing of data between different VEEs.
Using a set of files for storing data related to virtual storages is preferable, since shared data may be stored as a single file shared among several VEEs.
There are some conventional techniques for handling changed file content:
(1) In traditional file systems, a formatted disk or disk partition of a fixed size is used for storing file data in an organized form. Such structures allow fast reads and writes of the data, but require a lot of disk space to be reserved for the file system, since a predefined amount of disk storage space must be set aside, e.g., in the form of a disk partition. For better stability of the operating system even after unexpected crashes, journalled file systems are used. A journalled file system maintains a log, or journal, of activity that has taken place in the primary data areas of the disk. If a crash occurs, any lost data can be restored, because updates to the metadata in directories and bitmaps have been written to a serial log.
Examples of journalled file systems include NTFS, Linux ext3, ReiserFS, IBM Journaled File System 2 (JFS2), the open source JFS project for Linux, and XFS.
(2) VMware full image files, which can be managed as they grow.
Image files of this kind are initially smaller than the imaged disk (virtual storage) of a virtual computer (or of the Virtual Machine, VM), and grow when data is added to the virtual storage. Such image files contain a lot of unnecessary data, since only blocks of a predefined size can be added to the image file, and a block may contain useless data. Also, deleting data from the disk image does not reduce the size of the image file. Handling such images, with their linear structure, is not a trivial task, since structures like B-trees are not supported, and each write to the image file requires converting a block address of the virtual disk to a block address of the real disk where the image file is stored.
(3) Database transactions, which are used to control the commitment of data to databases. For example, in standard accounting procedures, it is necessary to modify several databases simultaneously. Since computers fail occasionally (due to power outages, network outages, and so on), there is a potential for one record to be updated or added, but not the others. To avoid these situations, transactions are used, one goal being to preserve the consistency of the file.
In some databases, for example the open source MySQL database, the main database engine is implemented using B-tree/B+tree structures. These structures are used for fast determination of where record data is placed in the files that contain database data.
(4) Encryption software that stores a partition image in a file. For example, StrongDisk and PGPdisk provide this option.
(5) Other software that provides the ability to store disk data in an archived or modified form. For example, Acronis True Image™ has the ability to store disk data on a block level.
(6) Windows sparse files, which contain one or more regions of unallocated data. In most Unix systems, all files can be treated as sparse files.
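The behavior of the growing image files described in item (2) above can be illustrated with a minimal sketch. It is not the format of any actual product's image file; the class name GrowableImage, the 4096-byte block size, and the in-memory bytearray standing in for the image file are all assumptions made for illustration. The sketch shows three of the properties noted above: the image grows only in whole blocks, every access translates a virtual block address into an address within the image, and discarding data does not shrink the image.

```python
# Hypothetical sketch: names and layout are illustrative assumptions,
# not the on-disk format of any real product's image files.
BLOCK_SIZE = 4096  # assumed fixed allocation unit, in bytes


class GrowableImage:
    """Append-only disk image that maps virtual blocks to image blocks."""

    def __init__(self):
        self.data = bytearray()  # stands in for the image file on the real disk
        self.block_map = {}      # virtual block number -> block index in image

    def write(self, virtual_block, payload):
        if len(payload) > BLOCK_SIZE:
            raise ValueError("payload exceeds one block")
        if virtual_block not in self.block_map:
            # First write to this virtual block: grow the image by a whole
            # block, even if the payload is much smaller than the block.
            self.block_map[virtual_block] = len(self.data) // BLOCK_SIZE
            self.data.extend(bytes(BLOCK_SIZE))
        # Translate the virtual block address to an offset inside the image.
        offset = self.block_map[virtual_block] * BLOCK_SIZE
        self.data[offset:offset + len(payload)] = payload

    def read(self, virtual_block):
        if virtual_block not in self.block_map:
            return bytes(BLOCK_SIZE)  # unmapped blocks read back as zeros
        offset = self.block_map[virtual_block] * BLOCK_SIZE
        return bytes(self.data[offset:offset + BLOCK_SIZE])

    def discard(self, virtual_block):
        # Deleting data only drops the mapping; the image does not shrink.
        self.block_map.pop(virtual_block, None)

    def size(self):
        return len(self.data)
```

In this sketch, writing five bytes to virtual block 1000 of an empty image makes the image one block long rather than 1001 blocks long, and a later discard of that block leaves the image size unchanged, which corresponds to the wasted space described above.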
Accordingly, there is a need in the art for an effective way to allocate storage in systems running multiple virtualized servers or processors.