The present invention relates generally to centralized storage devices, and more particularly to automated provisioning of centralized storage devices to be used in network computing applications.
Many modern-day computer systems rely heavily on networking technology. As computer networks become more prevalent, the practice of using centralized storage devices is gaining popularity. This technique ensures that all devices on the computer network have access to the same data, and allows for better maintenance and monitoring by a system administrator.
There are multiple configurations in which centralized data storage devices may be used in a computer networking environment. Some of the more commonly used configurations are implemented in what are known as storage networks. Storage networks are used to tie multiple hosts to a single storage system, and may be either a storage area network (SAN), or a network attached storage system (NAS). These two types of storage networks differ primarily in the manner in which the devices on the network are attached to the storage system. In an SAN configuration, the devices are attached to the centralized storage device by way of channels, or direct connections from the devices to the centralized storage device. In an NAS configuration, the devices of the computer network are attached to the centralized storage device by way of a network, or virtual, connection. Thus, storage devices in SANs are considered to be channel attached devices, while storage devices in NASs are known as network attached devices.
In storage networks, many different types of storage devices may be used. One common type of storage device that may be used in a central location is a redundant array of independent disks (RAID). A RAID uses a controller and two or more disk drives to store data. RAID systems have different configuration levels, such as RAID0, RAID1, RAID2, RAID3, RAID4, RAID5, RAID6, RAID7, RAID10, and RAID53. Depending upon the specific configuration of the RAID device used for centralized data storage, various advantages may be obtained. Some of the advantages generally obtained through use of a RAID device include increased input/output (I/O) performance, increased fault tolerance, and data redundancy. The degree to which each of these advantages is obtained depends upon the specific RAID configuration. A general description of each of the RAID configurations can be found on the Internet at the following URL: http://www.raid5.com.
One of the elements involved in RAID storage is data striping. Data striping is a technique whereby data elements are broken into specific blocks and written to different disks within the disk array of the RAID storage device. This improves access time as the controller, which accesses each of the disks within the disk array, may spread the load of I/O requests across many channels and many disk drives. Additionally, data may be backed up using various data redundancy algorithms, such as parity storage algorithms, or the like.
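The striping and parity ideas described above can be illustrated with a minimal Python sketch. It is not the layout used by any particular RAID level or vendor: the function names `stripe` and `xor_parity`, the round-robin placement, and the use of XOR as the parity algorithm are all assumptions chosen for illustration.

```python
from functools import reduce
from typing import List


def stripe(data: bytes, num_disks: int, block_size: int) -> List[List[bytes]]:
    """Split `data` into fixed-size blocks and place them round-robin on disks.

    Hypothetical helper: each "disk" is simply a list of blocks.
    """
    disks: List[List[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def xor_parity(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks (one simple parity scheme)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data disks, two-byte blocks: "AB" and "EF" land on disk 0,
# "CD" and "GH" land on disk 1.
disks = stripe(b"ABCDEFGH", num_disks=2, block_size=2)

# Parity for the first stripe (the first block on each disk).
parity = xor_parity([disks[0][0], disks[1][0]])

# If disk 0 is lost, its first block can be rebuilt from parity and disk 1.
recovered = xor_parity([parity, disks[1][0]])
```

Because consecutive blocks sit on different disks, a controller can service them in parallel, which is the access-time benefit the paragraph describes. The sketch also shows why a single file has no single location: its blocks are spread across every disk in the array.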
One of the leading manufacturers of RAID storage devices is EMC Corporation of Hopkinton, Mass. EMC Corporation manufactures a variety of RAID storage devices that may be used in storage networks such as SANs and NASs. Enterprise Storage Systems, or EMC devices manufactured by EMC Corporation, are generally highly scalable and provide high availability of information to network clients. While EMC devices are among the more commonly used RAID devices for large storage network applications, other systems using RAID technology are also used in many storage network applications.
While the advantages of using RAID storage devices, such as EMC devices for example, are desirable, there are also some difficulties associated with using these types of storage devices. For example, because data is written in blocks to multiple disk drives, rather than a single disk drive, it is difficult to ascertain the location of a single file, as it is distributed among the various disk drives of the RAID storage device. Nonetheless, knowledge of where data files are stored on a central storage device is important, particularly for applications wherein the hosts which store their data on the same RAID storage device are not permitted to access each other's data. This situation may occur, for example, in an Internet service hosting network environment, wherein a RAID storage device is used to centrally store data corresponding to multiple clients' accounts. In such a situation, it would be crucial that client A have access to only client A's data, and that no other clients, such as clients B and C, have access to client A's data, e.g., financial transaction information, such as credit card account information, or proprietary business operations information.
Another complicating factor involves the connective topology between the central storage device and the various hosts whose data reside on the central storage device. As shown in FIG. 1, typically a central storage device 8, e.g., an EMC frame, has a number 1-N of different ports (some of which are illustrated using reference numerals 10, 12, 14, 16) through which data can be passed via direct connections between hosts 1-N and the central storage device 8. The number of ports provided to the central storage device 8 may be very limited, e.g., twelve. Thus, using the direct attachment scheme depicted in FIG. 1 would only permit the central storage device to be connected to N hosts, where N is defined by the number of ports provided to the central storage device 8. This result is inefficient in an environment wherein the central storage device 8 is able to provide a greater data throughput to each port 10-16 than will be used by an individual host 1-N. Since the central storage device 8 is a very expensive piece of equipment, it is highly desirable to make more efficient use of its limited number of ports.
Accordingly, a switching matrix 18 can be introduced between the central storage device 8 and the hosts which store their data on the central storage device as shown in FIG. 2. The switching matrix 18 may, for example, be comprised of a number of fiber switches (not shown). By introducing the switching matrix 18 between the central storage device 8 and the hosts, a larger number of hosts 1-Y can share the limited number of ports 1-N available at the central storage device 8. This provides a more efficient usage of the capabilities of the central storage device 8, but at the expense of introducing additional complexity into the pathway between any given host and its data. This complexity makes it particularly challenging to accomplish the task of provisioning these types of central storage devices, i.e., to identify and allocate storage space to any of the hosts 1-Y on an ongoing basis.
As will be appreciated from the foregoing, it is difficult to model the data contained within such complex storage devices. Modeling the topology of the storage device and associated interfaces (e.g., switches and ports) would be useful to ensure that each client has access to only its data and that no client can access the data of another client. In providing a data model of the centralized storage device, it can be readily ascertained who has access to which data, and which ports should be connected to a specific portion of the centralized storage device in an SAN storage network, or which network address should have access to a particular portion of the storage device in an NAS storage network.
Therefore, it would be desirable to develop a data model of a centralized storage device to be used in a network, such as a storage network, to provide a better understanding of where data for particular clients is located on the storage device, and to aid in preventing access to a client's data by anyone other than that client. Additionally, it would be desirable to provide a system and method for automatically provisioning a central storage device, thereby eliminating the possibility of human error in such provisioning, and thereby providing a greater assurance of the security of a client's data contained on the central storage device.
In accordance with the present invention, these objectives are achieved by the provision of a data model for characterizing a storage device, a system and method for managing a data storage device on a network, and a system and method for modeling data of a network storage device.
An exemplary data model for characterizing a data storage device connected to a computer network according to the present invention may include a plurality of related entities. Each entity is associated with one or more entities in a one-to-many relationship or a many-to-one relationship. Each entity is also characterized by a variable set, the values of which identify physical instances of each entity in the data model, e.g., a storage device or a switch. The variable sets also include primary keys which interrelate the variable sets of interconnected entities within the model.
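A data model of this kind can be sketched in Python as follows. The sketch is illustrative only: the entity names `StorageDevice` and `Port`, their fields, and the `add_port` helper are assumptions, not the entities or variable sets actually defined by the invention. It shows one one-to-many relationship (a device has many ports) whose inverse is many-to-one, with a primary key tying the two variable sets together.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Port:
    """Hypothetical entity: one physical port on a storage device."""
    port_id: int    # primary key of this entity
    device_id: int  # primary key of the owning device (many-to-one link)
    label: str


@dataclass
class StorageDevice:
    """Hypothetical entity: one physical central storage device."""
    device_id: int  # primary key of this entity
    name: str
    ports: List[Port] = field(default_factory=list)  # one-to-many link

    def add_port(self, port_id: int, label: str) -> Port:
        """Create a port whose device_id refers back to this device."""
        port = Port(port_id=port_id, device_id=self.device_id, label=label)
        self.ports.append(port)
        return port


# The values of each variable set identify a physical instance.
frame = StorageDevice(device_id=8, name="central storage device")
port_10 = frame.add_port(port_id=10, label="port 10")
port_12 = frame.add_port(port_id=12, label="port 12")
```

A switch entity could be added in the same way, with its own primary key and links to the ports it connects.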
A method for allocating storage to a host within a central data storage device having at least one switch disposed therebetween using a data model according to the present invention may, for example, include the steps of: (a) informing the central storage device that the host is authorized to access a predetermined storage area, the predetermined storage area being a subset of unallocated storage space; (b) creating a path through the switch between the central data storage device and the host; and (c) informing the host that the predetermined storage area has been allocated thereto, wherein at least one of steps (a)-(c) is performed using information extracted from a data model associated with said storage device.
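Steps (a)-(c) can be sketched in Python under strong simplifying assumptions. In a real deployment each step would issue commands to the storage device, the switch, and the host; here each step only records its result in an in-memory model. The names `StorageModel` and `allocate`, and the identifiers such as `"vol-1"` and `"port-10"`, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class StorageModel:
    """Hypothetical in-memory stand-in for the data model of one device."""
    unallocated: Set[str]  # identifiers of unallocated storage areas
    authorized: Dict[str, Set[str]] = field(default_factory=dict)    # host -> areas
    switch_paths: Set[Tuple[str, str]] = field(default_factory=set)  # (host, port)
    notified: Dict[str, Set[str]] = field(default_factory=dict)      # host -> areas


def allocate(model: StorageModel, host: str, area: str, port: str) -> None:
    """Allocate `area` to `host`, reached through device port `port`."""
    if area not in model.unallocated:
        raise ValueError(f"{area} is not unallocated storage")

    # (a) Record that the host is authorized to access the storage area.
    model.unallocated.remove(area)
    model.authorized.setdefault(host, set()).add(area)

    # (b) Record a path through the switch between the device port and the host.
    model.switch_paths.add((host, port))

    # (c) Record that the host has been told about its new storage area.
    model.notified.setdefault(host, set()).add(area)


model = StorageModel(unallocated={"vol-1", "vol-2"})
allocate(model, host="host-A", area="vol-1", port="port-10")
```

Checking membership in `unallocated` before step (a) is one way a model-driven procedure can refuse to hand the same storage area to two hosts, which is the kind of error that manual provisioning makes possible.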