1. Field of the Invention
The present invention generally relates to networked computer and storage systems, and more particularly to a system and method for implementing storage virtualization.
2. Discussion of the Related Art
Storage virtualization refers to hiding or masking a physical storage device from a host server or application. In this regard, the relationship of virtual storage to a network is conceptually similar to the relationship of virtual memory to a single system.
With virtual memory management, complex applications become easy to implement. The same holds for virtual storage management, except that the rewards of storage virtualization are potentially greater. Solutions to storage management problems have high business value. Virtualization of persistent storage gives storage management functions great flexibility in distributing data throughout a network of diverse storage devices and in reconfiguring network storage as needs change.
Some of the benefits of storage virtualization include the ability to: isolate applications from underlying physical devices, improve availability and maintainability of systems, expand storage on the fly, reduce downtime for backup and other maintenance functions, migrate data from systems and applications, support large, high performance storage applications, mix and match various storage devices for investment protection, support advanced storage applications such as replication, and support larger storage devices than are physically possible.
As is known, iSCSI is a protocol that maps the SCSI remote procedure invocation model onto the TCP protocol. Various ways have been proposed to accomplish virtualization in iSCSI, including: (1) implementing virtualization in the host, in a layer above iSCSI; (2) through an external server that interacts with the physical storage nodes; (3) through third-party add-ons or proprietary protocols; and (4) defining virtualization in the iSCSI protocol.
Implementing a virtualization layer in the host, above iSCSI, is believed to yield some benefits in single-host environments. In multi-host networks, however, it may suffer from coherency problems. An external server that manages the storage nodes can implement virtualization within the current iSCSI framework. However, this solution has disadvantages: data is transferred twice in each transaction (from node to server and from server to host), and the server thus becomes a bottleneck. Third-party or proprietary protocols can also implement virtualization. However, to maintain interoperability with the iSCSI protocol, it is preferable to use a standard protocol.
Storage virtualization through an iSCSI protocol has also been proposed. In this regard, reference is made to FIG. 1, which illustrates certain system elements in a hypothetical virtual storage system. The illustrated system includes two hosts 10 and 12, two managers 20 and 30, and two sets of stores A1-An 22 and 24, and B1-Bn 32 and 34. The various components communicate across a network 40. The illustrated system includes two storage groups: one defined by manager A 20 and stores A1-An 22 and 24, and another defined by manager B 30 and stores B1-Bn 32 and 34.
As is known, an iSCSI store is a physical storage element (e.g., disk, gateway to disks, etc.) that attaches to the network 40 with an iSCSI protocol. Such a store has linear space and is defined by a store identifier (which provides a unique identifier to the store), metadata (which describes properties of the store), a class of service (which specifies availabilities, cost, performance and security), and a storage group (which is a collection of one or more stores).
The storage manager is a software entity that is attached to the network 40 and provides data access and management control to one or more storage groups. The elements in the system connect and communicate via the iSCSI protocol. The elements in the system have the following interfaces to each other: The host has an (iSCSI) initiator interface to the manager and to the stores. The manager has a target interface to the host and an initiator interface to the stores. The stores have target interfaces toward the manager and the host.
The manager interface includes SCSI commands and data. As is known, the iSCSI protocol encapsulates SCSI commands and responses from the initiator to the target and vice versa. A host initiates SCSI commands only to the manager, and the manager replies with an iSCSI status message response that includes a header and attached data. The attached data contains the iSCSI commands that the host is to issue and the stores to which they are to be issued. At the end of each phase, the store sends the status message to the host and the manager.
Reference is made briefly to FIG. 2, which illustrates the message flow in the case of a SCSI read command. As illustrated, in a system with a host 50, a manager 60, and one or more disks (e.g., disk A, disk B, etc.) 70, the host 50 may initiate the process with a SCSI command, to which the manager 60 responds with a SCSI status message, as well as commands for the various disks. The host then individually communicates these commands to the disks. After receiving each command, each disk will provide SCSI data (if appropriate) and a SCSI status or reply message.
By way of a more definitive example, consider a system having virtual group A that is constructed from a manager and three stores: Disk A1, Disk A2 and Disk A3. Assume further that each store contains 1000 blocks. Thus, the virtual group reflected to the host contains 3000 blocks. Assume, for purposes of this example, that the virtual address space spanning addresses 500 to 600 is physically distributed as illustrated in FIG. 3A (e.g., virtual addresses 500-509 are physically located on Disk A1 from 100-109, virtual addresses 510-519 are physically located on Disk A2 from 200-209, etc.). Assume further that a host (initiator) 50 desires to read virtual addresses 500-600.
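The virtual-to-physical mapping in this example can be sketched as a small extent table. The following Python sketch is illustrative only: the names Extent, EXTENTS and resolve are hypothetical, and the table holds only the two extents spelled out in the text above; the remaining extents of FIG. 3A are not reproduced.

```python
from typing import NamedTuple, Optional, Tuple


class Extent(NamedTuple):
    virt_start: int  # first virtual block of the extent
    length: int      # number of blocks in the extent
    store: str       # store that physically holds the blocks
    phys_start: int  # first physical block on that store


# Only the two extents given in the text; the rest of FIG. 3A is omitted.
EXTENTS = [
    Extent(500, 10, "Disk A1", 100),
    Extent(510, 10, "Disk A2", 200),
]


def resolve(virt_block: int, extents=EXTENTS) -> Optional[Tuple[str, int]]:
    """Return (store, physical block) for a virtual block, or None if unmapped here."""
    for e in extents:
        if e.virt_start <= virt_block < e.virt_start + e.length:
            return e.store, e.phys_start + (virt_block - e.virt_start)
    return None


print(resolve(505))  # ('Disk A1', 105)
print(resolve(512))  # ('Disk A2', 202)
```

In the arrangement described, a table of this kind would reside with the manager, which is why the host must consult the manager before addressing any store.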
Although not specifically illustrated, the first phase of the process is the login phase. At this point, the host 50 is only aware of the manager. Thus, the host initiates the login process by sending an iSCSI login request to the manager 60, as if the manager 60 were a simple target. The host 50 and manager 60 establish a new session (negotiating parameters, authenticating each other, etc.). If the login phase ends successfully, the manager 60 sends a login response message with “login accept” as a status. This message has an attachment in the data part that includes the list of stores in the group and their IP addresses.
This ends the login phase between the host and the manager. Thereafter, the host initiates a login session with each store in the group to establish separate sessions with each. Once a session has been established between the host (initiator) and each of the stores (targets), then SCSI commands (between the host and manager) may be carried out.
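The login ordering described above can be sketched as a simple message trace. This Python sketch is a simplified illustration, not an implementation of iSCSI login: the function name login_sequence is hypothetical, the IP addresses are placeholder documentation addresses, parameter negotiation and authentication are omitted, and the login response from each store is assumed rather than stated in the text.

```python
def login_sequence(host, manager, group):
    """Build the ordered list of (sender, receiver, message) tuples.

    group maps each store name to its IP address, as carried in the data
    part of the manager's login response.
    """
    stores = sorted(group)
    msgs = [
        (host, manager, "login request"),
        (manager, host, f"login response: login accept, stores={stores}"),
    ]
    # The host learns of the stores only from the manager's response,
    # and then opens a separate session with each one.
    for store in stores:
        msgs.append((host, store, f"login request to {group[store]}"))
        msgs.append((store, host, "login response (assumed)"))
    return msgs


# Placeholder addresses from the 192.0.2.0/24 documentation range.
group_a = {"Disk A1": "192.0.2.1", "Disk A2": "192.0.2.2", "Disk A3": "192.0.2.3"}
msgs = login_sequence("host", "manager", group_a)
for sender, receiver, message in msgs:
    print(f"{sender} -> {receiver}: {message}")
```

With three stores in the group, the host completes four separate logins before any SCSI command can be issued.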
In keeping with the example in which the host 50 wishes to read 100 memory blocks (or logical units) from the virtual volume spanning virtual addresses 500-600, the host must send individual commands to each of the respective stores 74, 76, etc. A portion of this process is illustrated in FIG. 3B. In this regard, the host 50 first sends a SCSI read command to the manager 60, informing the manager that the host wishes to read 100 blocks (or logical units) beginning at virtual address 500. The manager replies to the host by informing the host of the physical address of each of the desired blocks (or logical units). Thereafter, individualized SCSI read commands are sent to the respective stores 74, 76, etc. to read these blocks.
A first such command is sent to disk A1 74, requesting to read the 10 blocks beginning at address 100. Then, the data is sent from disk A1 74 to the host 50. Then, the disk A1 74 sends a SCSI status to the host 50. Similarly, the host 50 then reads the next blocks, which are stored on disk A2 76. It does this by sending a SCSI read command to disk A2 76, requesting to read the 10 blocks beginning at address 200. Then, the data is sent from disk A2 76 to the host 50. Then, the disk A2 76 sends a SCSI status to the host 50. This process is continued until the entire 100 blocks of data have been sent to the host 50.
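The per-store exchange just described can be sketched as a message trace. In this Python sketch the name read_trace and the shape of the plan argument are hypothetical, and the plan lists only the first two physical extents named in the text; the remaining extents of the 100-block read are not reproduced.

```python
def read_trace(host, manager, virt_start, virt_count, plan):
    """Build the ordered (sender, receiver, message) trace for one virtual read.

    plan is the manager's reply: a list of (store, physical_start, block_count).
    """
    trace = [
        (host, manager, f"SCSI read {virt_count} blocks @ virtual {virt_start}"),
        (manager, host, "SCSI status + per-store commands"),
    ]
    # The host issues one command per physical extent and receives the
    # data and a status directly from the store that holds it.
    for store, start, count in plan:
        trace.append((host, store, f"SCSI read {count} blocks @ {start}"))
        trace.append((store, host, "SCSI data"))
        trace.append((store, host, "SCSI status"))
    return trace


# Only the first two extents from the example; the rest are omitted.
plan = [("Disk A1", 100, 10), ("Disk A2", 200, 10)]
trace = read_trace("host", "manager", 500, 100, plan)
for sender, receiver, message in trace:
    print(f"{sender} -> {receiver}: {message}")
```

Each additional extent adds three messages that the host must send or handle, which illustrates how the host's message count grows with the number of extents behind a single virtual read.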
It has been found that this approach results in various inefficiencies. For example, the input/output (I/O) load (interrupts, reads, writes) on the host increases exponentially with the number of managers the host is interfacing with. The I/O load also increases with the number of stores that each manager is virtualizing. Further, each manager virtualizing the storage operates more like a look-up table of storage devices than a true virtualizing entity.