The present invention relates to a method for managing a logical volume for minimizing a size of metadata and dynamic resizing, and a computer-readable recording medium storing a program or data structure for embodying the method; and, more particularly, to a method for managing a logical volume in order to support dynamic online resizing of a logical volume and to minimize the size of metadata managed by the logical volume manager that overcomes a physical limitation of a storage device in computer systems, and a computer-readable recording medium storing a program or data structure for embodying the method.
The logical volume manager provides a logical volume, which is one virtual disk drive made up of multiple physical disk drives, and implements RAID (Redundant Array of Independent Disks) techniques in software to construct the logical volume.
First, RAID and related terms will be explained.
RAID is a way of storing the same data in different locations on multiple hard disks, and it is usually used in servers holding important data. Because the same data is duplicated and stored in different locations on multiple hard disks, input/output (I/O) operations can be balanced and overlapped, which improves performance. Since multiple hard disks increase the Mean Time Between Failures (MTBF) and multiple copies of the same data are kept in different locations on those disks, the fault tolerance of the computer system is also increased even when a hard disk malfunctions.
By placing data on multiple disks, I/O operations can overlap in a balanced way, improving performance. Since the use of multiple disks increases the mean time between failure, storing data redundantly also increases fault-tolerance.
A RAID configuration appears to the operating system to be a single logical hard disk. RAID uses a striping technique that partitions the storage space of each hard disk into units ranging from one sector (512 bytes) to several megabytes. The stripes of all the disks are interleaved and addressed in order.
In a computer system storing very large records, such as images in the medical or scientific field, stripes are typically set to a small size, for example 512 bytes, so that a single record spans all disks and can be accessed quickly by reading all disks at the same time.
In a multi-user system, better performance requires establishing a stripe wide enough to hold the typical or maximum size record. This allows overlapped disk Input/Output across drives.
There are at least nine types of RAID plus a non-redundant array (RAID-0).
RAID-0: This technique has striping but no redundancy of data. It offers the best performance but no fault-tolerance.
RAID-1: This type is also known as disk mirroring and includes at least two drives that duplicate the storage of data. There is no striping. Read performance is improved since either disk can be read at the same time. Write performance is the same as for single disk storage. RAID-1 provides the best performance and the best fault-tolerance in a multi-user system.
RAID-2: This type uses striping across disks with some disks storing error checking and correcting information. It has no advantage over RAID-3.
RAID-3: This type uses striping and dedicates one drive to storing parity information. The embedded error checking information is used to detect errors. Data recovery is accomplished by calculating the exclusive OR of the information recorded on the other drives. Since an Input/Output operation addresses all drives at the same time, RAID-3 cannot overlap I/O. For this reason, RAID-3 is best for a single-user system with long record applications.
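The exclusive OR recovery described for RAID-3 can be illustrated with a minimal sketch. The block contents and the three-data-drive layout are assumptions chosen for illustration only; they are not taken from any particular RAID implementation.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data drives, each holding one stripe unit.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]

# The parity unit is what the dedicated parity drive would store.
parity = xor_blocks(data)

# Suppose drive 1 fails: XOR the surviving data units with the parity unit.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Because XOR is its own inverse, combining the surviving units with the parity unit yields exactly the missing unit, which is why a single dedicated parity drive is enough to survive one drive failure.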
RAID-4: This type uses large stripes, which means records from any single drive may be read. This allows one to take advantage of overlapped Input/Output for read operations. Since all write operations have to update the parity drive, no Input/Output overlapping is possible. RAID-4 offers no advantage over RAID-5.
RAID-5: This type includes a rotating parity array, thus addressing the write limitation in RAID-4. As a result, all read and write operations can be overlapped. RAID-5 stores parity information but not redundant data (although parity information can be used to reconstruct data). RAID-5 requires at least three and usually five disks for the array. It is best for multi-user systems in which performance is not critical or which do few write operations.
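The idea of a rotating parity array can be sketched as follows. The rotation rule in `parity_disk` is one hypothetical choice among several possible layouts and is an assumption for illustration; the text above does not specify a particular rotation.

```python
def parity_disk(stripe, num_disks):
    """Index of the disk holding parity for a stripe (assumed rotation rule)."""
    return (num_disks - 1 - stripe) % num_disks

def layout(num_stripes, num_disks):
    """Return one row per stripe, marking parity ('P') and data ('D') units."""
    rows = []
    for stripe in range(num_stripes):
        p = parity_disk(stripe, num_disks)
        rows.append(["P" if disk == p else "D" for disk in range(num_disks)])
    return rows

for row in layout(4, 4):
    print(" ".join(row))
```

Since the parity unit moves to a different disk on each stripe, writes to different stripes update parity on different disks, so no single drive becomes the bottleneck that the dedicated parity drive is in RAID-4.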
RAID-6: This type is similar to RAID-5 but includes a second parity scheme that is distributed across different drives and thus offers extremely high fault- and drive-failure tolerance. There are few or no commercial examples currently.
RAID-7: This type includes a real-time embedded operating system as a controller, caching via a high-speed bus, and has other characteristics of a stand-alone computer. One vendor offers this system.
RAID-10: This type offers an array of stripes in which each stripe is a RAID-1 array of drives. This offers higher performance than RAID-1 but at much higher cost.
RAID-53: This type offers an array of stripes in which each stripe is a RAID-3 array of disks. This offers higher performance than RAID-3 but at much higher cost.
Disk striping will now be explained.
Striping is the process of dividing a logically continuous data segment, such as a single file, and storing the divided segments on physically separate devices, such as disk drives, using a round robin technique. Striping is a useful technique when a processor can read and write data faster than a single disk can supply or accept it.
Data is divided into fixed-size units of bytes or sectors and stored across several drives. For example, if four drives are designed to operate with overlapping read/write operations, four sectors can generally be read in the time it takes to read one sector.
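The round robin placement described above can be expressed as a small address calculation. This is a minimal sketch; the function name `locate` and the parameter `stripe_sectors` are assumptions introduced for illustration.

```python
def locate(logical_sector, num_disks, stripe_sectors=1):
    """Map a logical sector to (disk index, sector offset on that disk).

    stripe_sectors is the assumed number of consecutive sectors placed on
    one disk before moving on to the next disk.
    """
    stripe_unit, within = divmod(logical_sector, stripe_sectors)
    row, disk = divmod(stripe_unit, num_disks)
    return disk, row * stripe_sectors + within

# With four drives and one-sector stripes, logical sectors 0 to 3 land on
# four different drives, so they can be read in parallel.
placements = [locate(sector, 4) for sector in range(8)]
```

Consecutive logical sectors fall on different drives, which is what allows four overlapped reads to complete in roughly the time of one.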
Disk Striping is not provided for fault tolerance or error checking but it can be used for such functions with other techniques.
Striping can be used with mirroring.
Mirroring is a process of duplicating data and storing it on more than one device to prevent data loss in case a device malfunctions.
Mirroring can be embodied in hardware or software. A RAID system generally has a mirroring function. Operating systems including Novell NetWare provide disk mirroring in software. If mirroring is applied to a magnetic tape storage system, it is usually called twinning. Another method for minimizing data loss, which is cheaper than mirroring, is backing up data to magnetic tape at fixed intervals.
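Software mirroring can be sketched in a few lines. The `Mirror` class below is a hypothetical in-memory model (dictionaries stand in for devices) and is an assumption for illustration, not the interface of any real operating system.

```python
class Mirror:
    """Toy model of a mirrored device: every write goes to all healthy copies."""

    def __init__(self, copies=2):
        self.devices = [dict() for _ in range(copies)]  # block number -> data
        self.failed = set()                             # indexes of failed devices

    def write(self, block, data):
        for index, device in enumerate(self.devices):
            if index not in self.failed:
                device[block] = data

    def read(self, block):
        # Serve the read from the first copy that is still healthy.
        for index, device in enumerate(self.devices):
            if index not in self.failed:
                return device[block]
        raise IOError("all mirror copies have failed")

mirror = Mirror()
mirror.write(7, b"payload")
mirror.failed.add(0)          # simulate a malfunction of the first device
survivor_data = mirror.read(7)
```

The read still succeeds after the first device fails because the second device holds an identical copy.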
Based on the above-mentioned terms, the pre-existing technique will be explained in the following paragraphs.
Currently, storage devices with hardware RAID are used for providing better performance, fault tolerance, data recovery from disk error, and to overcome limitations of disk drive size. Such a hardware RAID device has several advantages but it also has an important disadvantage. It is too expensive.
Moreover, it is physically impossible to connect a large number of disk drives as one device so a hardware RAID device also has a limitation of possible applicable storage space.
To overcome the foregoing problems of the hardware RAID device, a logical volume manager that implements software RAID has been developed. The logical volume manager is an intermediate-level block device driver that implements the various RAID techniques in software, relying on the computing power of the host computer, and treats several physically separate, independent disk drives as one disk drive.
Pre-existing logical volume managers have used a fixed mapping method, in which a fixed conversion function converts the logical address used by a high-level module, such as a file system or a general data manager, to a physical address on one of several underlying physical disk drives. This method has limited flexibility since it fixes the relationship between logical address and physical address. Therefore, it has difficulty accommodating frequently requested services such as online resizing while the storage space is in use.
Some logical volume managers do not use a mapping function when a logical address is mapped to a physical address, but use a table-based method for mapping a logical address to a physical address.
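The difference between the two approaches can be sketched as follows. The function `fixed_map` and the dictionary `table` are hypothetical stand-ins, assumed here for illustration, and do not reproduce any specific volume manager.

```python
def fixed_map(logical_block, num_disks):
    """Fixed conversion function: the location depends only on the disk count."""
    return logical_block % num_disks, logical_block // num_disks

# Fixed mapping: adding a third disk changes where existing blocks are expected.
before = [fixed_map(block, 2) for block in range(4)]
after = [fixed_map(block, 3) for block in range(4)]
moved = sum(1 for old, new in zip(before, after) if old != new)

# Table-based mapping: existing entries stay valid, new entries are appended.
table = {block: fixed_map(block, 2) for block in range(4)}
snapshot = dict(table)
table.update({4: (2, 0), 5: (2, 1)})   # new blocks placed on the added disk 2
```

In this small case half of the existing blocks would have to be relocated under the fixed function, whereas the table grows without disturbing any existing entry. The cost of the table is that it must itself be stored and managed as metadata.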
However, conventional logical volume managers have problems in that metadata is too large to manage in huge storage structures and processing speed is too slow when modifying metadata. Also, for managing a logical volume, the huge size of metadata delays system booting time and uses too much memory.
Baranovsky et al. teach a logical volume manager that provides logical storage space without the limitations of a physical storage device in U.S. Pat. No. 5,897,661, entitled "Logical volume manager and method having enhanced update capability with dynamic allocation of storage and minimal storage of metadata information". However, the Baranovsky et al. patent has the problems of having too much metadata, slowing processing, delaying system booting, and using too much memory.
The Baranovsky et al. patent loads storage space information into memory when the system boots and consults that in-memory information every time a conversion between a physical address and a logical address has to be performed. Therefore, the Baranovsky et al. patent has the problems of having too much metadata, delaying system booting, and using too much memory.
It is, therefore, an object of the present invention to provide a logical volume manager for managing a logical volume using a mapping table storing a relation between a physical address and a logical address, using a minimum space for metadata and supporting online dynamic resizing.
It is another object of the present invention to provide a computer-readable recording medium storing a program or data structure for embodying the method.
In accordance with an aspect of the present invention, there is provided a method for managing a logical volume in order to support dynamic online resizing and minimizing a size of metadata, the method including the steps of: a) creating the logical volume by gathering disk partitions in response to a request for creating the logical volume in a physical storage space; b) generating the metadata including information of the logical volume and the disk partitions forming the logical volume and storing the metadata to the disk partitions forming the logical volume; c) dynamically resizing the logical volume in response to a request for resizing, and modifying the metadata on the disk partitions forming the logical volume; and d) calculating and returning a physical address corresponding to a logical address of the logical volume by using mapping information of the metadata containing information of the physical address corresponding to the logical address.
Also, the present invention, in a storage system with a processor, is directed to providing a computer-readable recording medium storing a program or data structure for embodying the method and comprises functions for: a) creating a logical volume by gathering disk partitions in response to a request for creating the logical volume in a physical storage space; b) generating the metadata including information of the logical volume and the disk partitions forming the logical volume and storing the metadata to the disk partitions forming the logical volume; c) dynamically resizing the logical volume in response to a request for resizing, and modifying the metadata on the disk partitions forming the logical volume; and d) calculating and returning a physical address corresponding to a logical address of the logical volume by using mapping information of the metadata containing information of the physical address corresponding to the logical address.
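Steps a) through d) can be illustrated with a minimal in-memory sketch. The extent size `EXTENT_BLOCKS`, the class `LogicalVolume`, its method names, and the partition names are all assumptions made for illustration; the sketch covers growth only and does not represent the actual metadata layout of the invention.

```python
EXTENT_BLOCKS = 1024  # assumed extent size in blocks (not from the source)

class LogicalVolume:
    def __init__(self, name):
        self.name = name
        self.partitions = {}   # partition name -> number of extents
        self.mapping = []      # logical extent index -> (partition, physical extent)

    @classmethod
    def create(cls, name, partitions):
        """(a) Gather partitions into one volume and build the initial mapping."""
        volume = cls(name)
        for partition, extents in partitions.items():
            volume._append(partition, extents)
        return volume

    def _append(self, partition, extents):
        self.partitions[partition] = extents
        self.mapping.extend((partition, e) for e in range(extents))

    def metadata(self):
        """(b) The record that would be stored on every member partition."""
        return {"volume": self.name,
                "partitions": dict(self.partitions),
                "mapping": list(self.mapping)}

    def grow(self, partition, extents):
        """(c) Online growth: append entries and return the updated metadata."""
        self._append(partition, extents)
        return self.metadata()

    def to_physical(self, logical_block):
        """(d) Translate a logical block to (partition, physical block)."""
        extent, offset = divmod(logical_block, EXTENT_BLOCKS)
        partition, physical_extent = self.mapping[extent]
        return partition, physical_extent * EXTENT_BLOCKS + offset

# Hypothetical partition names, used only for this example.
vol = LogicalVolume.create("vol0", {"sda1": 2, "sdb1": 1})
before = vol.to_physical(2050)
vol.grow("sdc1", 2)
after = vol.to_physical(2050)
```

Mapping at the granularity of an extent rather than a single block keeps the table small, and growing the volume leaves the translation of existing logical addresses unchanged, which is the property that makes resizing possible while the volume is in use.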
The present invention maintains a mapping table separately without using a fixed mapping function when a logical address, which is used by a file system, a database and/or a data managing system for overcoming problems of currently used RAID systems, is mapped to the physical address of a real physical disk drive corresponding to the logical address.
Because the mapping is flexible, the volume size can be increased and decreased dynamically and efficiently while the system is operating, and the RAID level of the volume can be applied to newly added storage space.
Also, operations for metadata modification can be performed effectively by minimizing necessary metadata for managing a logical volume, and memory can have more space for other operations based on minimized metadata.
The present invention manages a logical volume by using minimum space for system metadata and modifies metadata by using minimum processing overhead. The present invention can also modify metadata with simple operations, manage huge storage, and provide various functions including online resizing, in response to a user's request, during system operation.
Also, the present invention can provide information on another disk despite malfunctioning of one of the physical disks. The information includes not only normal information but also system managing information. The present invention also warns users when one of the devices is unavailable for use because of error.
The present invention provides a dynamic mapping method for modifying, as needed, the mapping between a logical address used in a high-level module and a physical address of a physical disk device. The present invention also minimizes the size of the metadata, including mapping table information, needed for managing a logical volume. Additionally, the present invention minimizes modification requests on the metadata and can therefore support, at minimum cost, various online management functions, including resizing requests that occur while the storage space is in use. By minimizing the size of the metadata held in memory while the system is operating, the present invention can manage a very large storage system.
Additionally, when one of the physical disks that are managed together as a logical unit malfunctions, operations requested for the other physical disks can continue to be serviced.