1. Field of the Invention
The present invention relates to storage systems. More particularly, the invention relates to a method and system for managing storage systems containing multiple data storage devices.
2. Background
Conventional data storage systems include one or more storage devices connected to a controller or manager. As used herein, the term "data storage device" refers to any device or apparatus utilizable for the storage of data, e.g., a disk drive. For explanatory purposes only and not as an intent to limit the scope of the invention, the term "disk drive" will be used throughout this document instead of the term "data storage device."
A logical volume manager (also called a logical disk manager) can be used to manage storage systems containing multiple disk drives. The logical volume manager configures a pool of disk drives into logical volumes (also called logical disks) so that applications and users interface with logical volumes instead of directly accessing physical disk drives. One advantage of using a logical volume manager is that a logical volume may span multiple physical disks, but is accessed transparently as if it were a single disk drive. These logical volumes appear to other components of the computer system as ordinary physical disk drives, but with performance and reliability characteristics that are different from underlying disk drives.
The logical volume manager divides a physical disk drive into one or more partitions (also known as extents or subdisks). Each logical volume is composed of one or more partitions and each partition is typically defined by an offset and length. Because of the overhead inherent in managing multiple partitions, conventional systems normally have severe limitations on the number of partitions that can be formed on a physical disk drive. The practical limit in conventional systems is normally less than 100 (and often less than 10) partitions on a single disk drive. Due to the nature of the data structures and algorithms used by conventional volume managers, the maximum number of partitions or subdisks permitted to a logical volume in conventional systems is usually much less than 5000. In the simplest case, the disk manager forms a logical volume from a single partition. In more complex cases, the disk manager may form logical volumes by concatenating multiple partitions.
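For illustration only, the offset-and-length description of a partition and the concatenation of partitions into a logical volume can be sketched as follows. This is a minimal sketch, not a description of any particular volume manager; the names `Partition`, `resolve`, and the disk identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    disk: str    # hypothetical disk identifier
    offset: int  # first block of the partition on the physical disk
    length: int  # number of blocks in the partition


def resolve(volume, logical_block):
    """Map a logical block of a concatenated volume to (disk, physical block)."""
    if logical_block < 0:
        raise IndexError("negative logical block")
    base = 0  # logical address at which the current partition begins
    for part in volume:
        if logical_block < base + part.length:
            return part.disk, part.offset + (logical_block - base)
        base += part.length
    raise IndexError("logical block beyond end of volume")


# A logical volume formed by concatenating two partitions of different lengths.
volume = [Partition("disk0", 100, 50), Partition("disk1", 0, 30)]
first = resolve(volume, 0)     # falls in the first partition
crossed = resolve(volume, 50)  # first block of the second partition
```

The caller sees one address space of 80 blocks, while the lookup walks the partition list to find the physical location.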
Each partition can, and typically does, have a different length. When a logical volume is no longer needed, its partitions are deleted so that space on the disk drives is made available for another partition to be created. However, if a new partition is larger than the available space, then the space cannot be reused for the new partition. If the new partition is smaller than the available space, then a portion of the free space will be used and an even smaller piece will remain free. Over time, this results in many small pieces of free space that cannot be reused. This problem is often referred to as "fragmentation."
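The effect can be reproduced with a small sketch. It assumes a first-fit policy over a list of free extents, which is an illustrative choice and not something the text attributes to any particular system; the function name `first_fit` and the block counts are hypothetical.

```python
def first_fit(free, size):
    """Allocate `size` blocks from the first free extent that is large enough.

    `free` is a list of (offset, length) tuples and is modified in place.
    Returns the allocated offset, or None if no single extent can hold `size`.
    """
    for i, (off, length) in enumerate(free):
        if length >= size:
            if length == size:
                free.pop(i)  # extent consumed exactly
            else:
                free[i] = (off + size, length - size)  # smaller remainder stays free
            return off
    return None


# Two holes left behind by deleted partitions.
free = [(0, 40), (100, 25)]
a = first_fit(free, 30)  # leaves a 10-block remainder
b = first_fit(free, 20)  # leaves a 5-block remainder
total_free = sum(length for _, length in free)
c = first_fit(free, 12)  # fails even though total_free exceeds 12
```

Here 15 blocks remain free in total, but they are split into remainders of 10 and 5 blocks, so a 12-block partition cannot be created.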
Traditional approaches to fragmentation problems often introduce other problems into the system. For example, one traditional solution is to move existing partitions together so that the system free space is in one piece. However, this solution could be quite expensive since a significant amount of existing data may have to be moved to place all the partitions together. Moreover, the corresponding data may have to be locked during the move to prevent data inconsistencies from occurring. As a result, this solution could reduce or prevent the availability of data to users during the data move.
Load balancing is another function that should be addressed by the logical volume manager, since the manner in which data is distributed among disk drives may cause load balancing problems. A disk drive can usually service only one I/O request at a time. Requests received at a "busy" disk drive are stored in a queue for later processing, usually in the order received. If one disk drive is accessed more than other disk drives, the queue for accessing data from the busier disk drive becomes longer, and accordingly, the wait also becomes longer. This may result in some disk drives being overloaded while others remain idle or lightly loaded.
Solutions have been proposed to solve this load balancing problem but with limited success. A heavily accessed logical volume may be striped over a number of disk drives to distribute the load. However, the number of partition concatenations to stripe across must typically be chosen when the logical volume is allocated. This requires knowing ahead of time that a set of data is going to be heavily accessed, and presumes that the access pattern will not change over time. Because of changing access patterns, it is usually very difficult to predict optimal striping patterns ahead of time.
Another solution is to gather statistics about the frequency with which different logical volumes are accessed, and then reallocate multiple logical volumes to put less frequently accessed logical volumes on the same physical disk drives as more heavily accessed logical volumes. Logical volumes may also be reallocated to be striped over more disk drives. Deciding how to reallocate, however, is usually a labor-intensive administrative task with conventional systems. Once data has been stored, it is normally quite expensive to move that data around. The data is either made unavailable or significant overhead must be incurred to coordinate normal accesses with the movement of the data. In addition, changing the number of disk drives for striping normally requires recopying of the entire logical volume.
A disk drive can be added to a system to increase the amount of available storage. Typically, new data is stored in the new disk drive, rather than moving existing data to be stored in the new disk drive. It may be necessary in some circumstances to add disk drives to support more I/O operations rather than to just provide more storage. However, adding a disk drive for this purpose raises many of the same problems associated with load balancing. For example, when first added, a new disk drive is like a device that has been misconfigured to be idle and needs data from existing logical volumes to be moved to it.
The foregoing problems of the conventional systems are further exacerbated by systems containing many disk drives (e.g., a thousand or more disk drives). This is due in large part to the amount of manual administration required in conventional systems. In conventional systems, the functions of configuring, addressing, and administering logical volumes and disk drives are normally performed manually by an administrator who must make choices as to the proper configuration to employ. When a large number of disk drives and/or logical volumes are used, this manual administration becomes more and more difficult. Thus, existing systems are prone to human error and their structures (administrative and data) do not scale well beyond a certain number of disk drives.
Thus, there is a need for a system and method to address the above described problems of the related art. There is a need for a logical volume manager which can efficiently and effectively address the problems inherent in the prior art with respect to load balancing, fragmentation, and incremental addition of disk drives, particularly in disk systems having a very large number of disk drives.
The invention is a system and method for managing and allocating logical volumes on a plurality of data storage devices. Some objects and advantages provided by the present invention include improved load balancing, reduction or elimination of fragmentation, and efficient incremental addition of disk drives.
Load balancing can be performed in parallel across available disk drives to prevent hot spots and maximize performance, even with rapidly changing data usage patterns. The present invention can be used to prevent fragmentation. The present invention performs automatic online disk drive space reorganization for the incremental addition or removal of storage capacity.
A feature of one embodiment of the invention is that each disk drive is divided into many small fixed size pieces. Each piece is small compared to the size of a logical volume or disk drive. In the contemplated normal operation of the invention, storage space on a disk drive is allocated and freed in units of the fixed size pieces. Fragmentation is reduced or eliminated because all pieces are the same size. Allocations of the pieces can be made along boundaries that correspond to the number of contiguous pieces being allocated. Multiple piece allocations are also reusable for identical allocations or smaller allocations.
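One possible reading of boundary-aligned allocation in fixed size pieces is sketched below. It is an assumption-laden illustration rather than the claimed implementation: the class `PieceMap`, its in-memory bitmap, and the linear scan are all hypothetical choices.

```python
class PieceMap:
    """Tracks which fixed size pieces of one disk drive are in use."""

    def __init__(self, num_pieces):
        self.used = [False] * num_pieces

    def allocate(self, count=1):
        """Allocate `count` contiguous pieces starting at a multiple of `count`.

        Returns the index of the first piece, or None if no aligned run is free.
        """
        if count < 1:
            raise ValueError("count must be at least 1")
        for start in range(0, len(self.used) - count + 1, count):
            if not any(self.used[start:start + count]):
                for i in range(start, start + count):
                    self.used[i] = True
                return start
        return None

    def free(self, start, count=1):
        """Return `count` pieces beginning at `start` to the free pool."""
        for i in range(start, start + count):
            self.used[i] = False


pm = PieceMap(8)
four = pm.allocate(4)   # aligned run of four pieces
two = pm.allocate(2)    # aligned run of two pieces
one = pm.allocate(1)
pm.free(four, 4)        # release the four-piece run
reuse_a = pm.allocate(2)  # the freed run is reused by smaller allocations
reuse_b = pm.allocate(2)
```

Because every allocation is a whole number of equal pieces on an aligned boundary, the freed four-piece run is fully reusable by two two-piece allocations and no unusable remainder is left behind.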
Another feature is that each logical volume is made of pieces from many disk drives. In an embodiment, there would be pieces from every storage device, but this may not be possible if there are too many devices or if some are full. The pieces of a logical volume are spread out as evenly as is practical so that two pieces on the same disk drive are far apart in the address space of the logical volume. Thus, I/O load is spread evenly over all disk drives. The address space of the logical volume can be striped across small groups of pieces to improve throughput for large I/O operations.
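A simple way to picture such a spread is a round-robin placement that skips full drives. The sketch below assumes that policy purely for illustration; the function `place_pieces` and the drive names are hypothetical, and the text does not state that round-robin is the method used.

```python
def place_pieces(num_pieces, free_per_drive):
    """Assign consecutive logical pieces to drives round-robin, skipping full drives.

    `free_per_drive` maps a drive name to its number of free pieces.
    Returns a list whose i-th entry is the drive holding logical piece i.
    """
    free = dict(free_per_drive)  # work on a copy; the caller's dict is untouched
    drives = list(free)
    layout = []
    cursor = 0  # drive to try first for the next piece
    for _ in range(num_pieces):
        for step in range(len(drives)):
            drive = drives[(cursor + step) % len(drives)]
            if free[drive] > 0:
                free[drive] -= 1
                layout.append(drive)
                cursor = (cursor + step + 1) % len(drives)
                break
        else:
            raise RuntimeError("not enough free pieces on any drive")
    return layout


even = place_pieces(6, {"d0": 3, "d1": 3, "d2": 3})
uneven = place_pieces(5, {"d0": 3, "d1": 1, "d2": 3})  # d1 fills up early
```

With three drives that all have space, two pieces on the same drive are three pieces apart in the logical volume's address space, so sequential access touches each drive in turn.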
If a new disk drive is added, pieces from all logical volumes can be migrated to the new device. Since each piece is small it can be locked briefly while being copied with little impact on availability. Since each piece can be moved independently of any other pieces, there is no need to restripe entire logical volumes. The migration can be done gradually at low priority to limit the impact on the overall system performance. As the migration proceeds the new device gradually increases its contribution to the overall I/O load of the system. If a disk drive needs to be removed, its pieces can be gradually migrated to other disk drives.
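The piece-at-a-time migration can be sketched as a generator that yields one move per step, so a caller could lock, copy, and unlock a single piece before continuing at low priority. The sketch makes several assumptions that are not in the text: the name `rebalance_steps`, the rule of taking from the currently most loaded drive, and the target of an equal share per drive are all hypothetical.

```python
from collections import Counter


def rebalance_steps(layout, new_drive):
    """Yield (piece_index, source, target) moves until the new drive holds its share.

    `layout` is a list whose i-th entry names the drive holding piece i.
    It is updated in place, one piece per step.
    """
    counts = Counter(layout)
    counts.setdefault(new_drive, 0)
    share = len(layout) // len(counts)  # assumed target: equal share per drive
    while counts[new_drive] < share:
        # Take the next piece from whichever existing drive is most loaded.
        source = max((d for d in counts if d != new_drive), key=lambda d: counts[d])
        index = layout.index(source)
        # A real system would briefly lock only this piece while copying it.
        layout[index] = new_drive
        counts[source] -= 1
        counts[new_drive] += 1
        yield index, source, new_drive


layout = ["d0", "d1", "d2"] * 4  # 12 pieces over three drives
moves = list(rebalance_steps(layout, "d3"))
final_counts = Counter(layout)
```

Only three of the twelve pieces move, and no volume is restriped. This sketch picks the first piece found on the source drive; a fuller version would also choose pieces so that the new drive's pieces stay far apart in each logical volume's address space.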
Further details of aspects, objects, and advantages of the invention are described below in the detailed description, drawings, and claims.