The rapid growth in data-intensive applications continues to fuel the demand for raw data storage capacity. As a result, there is an ongoing need to provide more storage, file servers, and storage services to an increasing number of users. To meet this growing demand, the concept of a storage area network (SAN) was introduced. A SAN is defined as a network whose primary purpose is the transfer of data between computer systems (hosts) and storage devices. In a SAN environment, storage devices and servers are generally interconnected through various switches and appliances. This structure generally allows any server on the SAN to communicate with any storage device and vice versa. It also provides alternative paths from a server to a storage device.
To increase the utilization of SANs, extend the scalability of storage devices, and increase the availability of data, the concept of storage virtualization has recently been developed. Storage virtualization offers the ability to isolate a host from the diversity of storage devices. The result is a substantial reduction in support effort and end-user impact.
FIG. 1 shows a SAN 100 that enables storage virtualization operations and includes a virtualization switch 110. Virtualization switch 110 is connected to a plurality of hosts 120 through a network 130, for example a local area network (LAN), a metro area network (MAN), or a wide area network (WAN). The connections formed between the hosts and the virtualization switch can utilize any protocol including, but not limited to, Gigabit Ethernet carrying packets in accordance with the internet small computer systems interface (iSCSI) protocol, the InfiniBand protocol, and others. Virtualization switch 110 is further connected to a plurality of storage devices 140 through an interconnect interface 150, such as SCSI, Fibre Channel (FC), Parallel SCSI (P.SCSI), iSCSI, Serial Attached SCSI (SAS), and others that would be readily apparent to those of skill in the art.
Virtualization of a SAN essentially means mapping a virtual volume address space to an address space on one or more physical storage devices 140. Specifically, a virtual volume can be anywhere on one or more physical storage devices including, but not limited to, a disk, a tape, and a redundant array of independent disks (RAID), that are connected to a virtualization switch. Furthermore, a virtual volume consists of one or more logical units (LUs), each identified by a logical unit number (LUN). LUNs are frequently used in the iSCSI and FC protocols and are generally configured by a system administrator. Each LU, and hence each virtual volume, is generally comprised of one or more contiguous partitions of storage space on one or more physical storage devices 140. Thus, a virtual volume may occupy a whole storage device 140, a part of a single storage device 140, or parts of multiple storage devices 140. The physical storage devices 140, the LUs, and their exact locations are transparent to the user. To execute the virtualization services, virtualization switch 110 maintains a mapping scheme that defines relations between the virtual volumes, the LUs, and the physical storage devices. A virtual volume may be any kind of volume including, but not limited to, a concatenation volume, a stripe volume, a mirror volume, a simple volume, a snapshot volume, or any combination thereof. A virtual volume may further include one or more volumes of the above types.
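By way of illustration only, the mapping of a virtual volume address space onto contiguous partitions of physical storage devices can be sketched as follows for the simple case of a concatenation volume. The names Extent, ConcatenationVolume, and resolve, and the device identifiers used, are hypothetical and are introduced here solely for the example; they do not describe the actual mapping scheme maintained by virtualization switch 110.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Extent:
    """One contiguous partition on a physical storage device (hypothetical)."""
    device_id: str    # which physical storage device holds the partition
    start_block: int  # first physical block of the partition
    num_blocks: int   # length of the partition in blocks


class ConcatenationVolume:
    """Virtual volume whose address space is its extents laid end to end."""

    def __init__(self, extents: Sequence[Extent]) -> None:
        if not extents:
            raise ValueError("a volume needs at least one extent")
        if any(e.num_blocks <= 0 for e in extents):
            raise ValueError("extents must have a positive length")
        self.extents: List[Extent] = list(extents)
        # Virtual block number at which each extent begins.
        self._starts: List[int] = []
        offset = 0
        for extent in self.extents:
            self._starts.append(offset)
            offset += extent.num_blocks
        self.size = offset  # total number of virtual blocks

    def resolve(self, virtual_block: int) -> Tuple[str, int]:
        """Translate a virtual block number to (device_id, physical_block)."""
        if not 0 <= virtual_block < self.size:
            raise IndexError("virtual block outside the volume")
        # Find the last extent that starts at or before virtual_block.
        index = bisect_right(self._starts, virtual_block) - 1
        extent = self.extents[index]
        return extent.device_id, extent.start_block + (virtual_block - self._starts[index])


volume = ConcatenationVolume([
    Extent("disk0", start_block=100, num_blocks=50),
    Extent("disk1", start_block=0, num_blocks=30),
])
print(volume.resolve(49))  # last block held on disk0
print(volume.resolve(50))  # first block held on disk1
```

In this sketch the host addresses only virtual block numbers, while the device identifier and physical block number remain hidden behind resolve, which mirrors the transparency to the user described above. Stripe, mirror, and snapshot volumes would require different translation logic and are not covered by the example.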
Virtual volumes are generally created by allocating storage space to LUs on physical storage devices 140. Generally, the allocation can be performed automatically or manually by a system administrator in order to achieve good performance, full control of the SAN resources, and optimal storage utilization. For example, the system administrator may create virtual volumes to achieve low latency when accessing a virtual volume. For this purpose, LUs of a virtual volume are allocated on a low-latency disk or a disk array that is physically located close to the host, to accomplish a minimal number of hops and minimal delay when reading or writing data. However, during normal operation, after many allocations and de-allocations of storage space, the virtual volumes end up fragmented, with many data blocks of their respective LUs spread over storage devices 140 with small gaps between them. In addition, fragmented volumes may be generated by an automatic process that, on one hand, may reduce the configuration time and complexity, but on the other hand may create sub-optimal fragmented volumes.
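As a purely illustrative sketch of the fragmentation described above, the following example lists the free gaps left between allocated partitions on a single storage device. The function name free_gaps and the representation of partitions as (start, length) pairs in blocks are assumptions made for the example and are not taken from any particular SAN implementation.

```python
from typing import Iterable, List, Tuple

Partition = Tuple[int, int]  # (start_block, num_blocks), a hypothetical representation


def free_gaps(allocated: Iterable[Partition], device_size: int) -> List[Partition]:
    """Return the unallocated ranges on one device as (start, length) pairs."""
    gaps: List[Partition] = []
    cursor = 0  # first block not yet accounted for
    for start, length in sorted(allocated):
        if start > cursor:
            # Blocks between the previous partition and this one are free.
            gaps.append((cursor, start - cursor))
        cursor = max(cursor, start + length)
    if cursor < device_size:
        # Free space remaining after the last partition.
        gaps.append((cursor, device_size - cursor))
    return gaps


# A 40-block device after several allocations and de-allocations.
gaps = free_gaps([(10, 5), (0, 4), (20, 10)], device_size=40)
print(gaps)  # [(4, 6), (15, 5), (30, 10)]
```

Here 21 blocks are free, yet the largest contiguous free range is only 10 blocks, so a request for a larger contiguous partition could not be satisfied on this device without first relocating data.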
To solve the problems associated with fragmented volumes, it would therefore be advantageous to provide a method that automatically defragments SANs.