I. Technical Field
The present invention generally relates to the field of enterprise path management. More particularly, the invention relates to systems and methods for disabling and/or enabling one or more devices in a storage area network.
II. Background Information
Enterprise storage systems store data in large-scale environments and differ from consumer storage systems in both the size of the environment and the types of technologies that store and manage the data. A large-scale environment that stores data is typically referred to as a storage area network (SAN). SANs are commonly used in enterprise storage systems to transfer data between computer systems and storage devices. A typical SAN provides a communication infrastructure, including physical connections between devices, and a management layer, which organizes the connections, storage devices, and computer systems.
In a SAN environment, one or more servers provide services to other systems (e.g., clients) over the network. Servers in a SAN environment are typically referred to as hosts. Furthermore, each host connects to the SAN via one or more host bus adapters. A host bus adapter controls the transfer of data between the host and one or more target storage devices. In the case of a Fibre Channel SAN, the hosts may use special Fibre Channel host bus adapters and optical fiber for connections between devices.
SANs are frequently used in enterprise storage. A typical Fibre Channel SAN, for example, includes a number of Fibre Channel switches that are connected together to form a fabric or a network. An enterprise storage system may further include multiple disk drives that combine to form a disk array. A typical disk array includes a disk array controller, a cache, disk enclosures, and a power supply. Examples of disk arrays include the SYMMETRIX Integrated Cache Disk Array System and the CLARIION Disk Array System, both available from EMC Corporation of Hopkinton, Mass. The disk array controller is a piece of hardware that provides storage services to computer systems that access the disk array and may attach to a number of disk drives that are located in the disk enclosures. For example, the disk drives may be organized into RAID groups for efficient performance. RAID (redundant array of inexpensive disks) is a system that uses multiple disk drives that share or replicate data among the drives. Accordingly, in a RAID system, instead of identifying several different hard drives, an operating system will identify all of the disk drives as if they were a single disk drive.
Furthermore, disk array controllers connect to a SAN via a port. A port serves as an interface between the disk array controller and other devices in the SAN. Each disk array controller typically includes two or more ports. Disk array controllers may communicate with other devices using various protocols, such as the SCSI (Small Computer System Interface) command protocol over a Fibre Channel link to the SAN. In the SCSI command protocol, each device is assigned a unique numerical identifier, which is referred to as a logical unit number (LUN). Communication using the SCSI protocol is said to occur between an “initiator” (e.g., a host) and a “target” (e.g., a disk drive) via a path. For example, a path may include a host bus adapter, an associated SCSI bus or Fibre Channel cabling providing a physical link, and a single port of a disk array controller. In a fully-redundant SAN, an alternate path is available for every level of device outage. Path management software is frequently used to manage SANs and, among other things, can detect load imbalances for disk array controllers in a SAN and can select alternate paths through which to route data. An example of path management software is EMC POWERPATH by EMC Corporation of Hopkinton, Mass.
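By way of illustration only, the notion of a path and the selection of an alternate path may be sketched as follows. This is a minimal sketch under stated assumptions: the names Path and select_path, and the representation of each component as a string label, are hypothetical and do not correspond to the interface of any particular path management product.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Path:
    """Hypothetical model of one initiator-to-target path."""
    hba: str    # host bus adapter on the host
    link: str   # SCSI bus or Fibre Channel cabling
    port: str   # single port of a disk array controller
    lun: int    # logical unit reached through this path


def select_path(paths: Iterable[Path], lun: int,
                unavailable: Set[str]) -> Optional[Path]:
    """Return the first path to `lun` that avoids every unavailable component.

    Returns None when no usable path remains.
    """
    for path in paths:
        components = {path.hba, path.link, path.port}
        if path.lun == lun and not (components & unavailable):
            return path
    return None


# Two redundant paths to the same logical unit (illustrative labels).
paths = [
    Path("hba0", "link0", "portA", 5),
    Path("hba1", "link1", "portB", 5),
]
alternate = select_path(paths, 5, unavailable={"portA"})
```

In this sketch, marking portA as unavailable causes the second path, through hba1 and portB, to be chosen. An actual product would also weigh load when choosing among usable paths, which is omitted here.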
As is evident from the foregoing discussion, a SAN environment may include a variety of devices, such as disk arrays including disk array controllers, switches, ports, host bus adapters, and physical links between the devices. When a device needs maintenance (e.g., repair or replacement), the device is taken offline. Taking a device in a SAN, such as a disk array controller or port, offline will cause input/output errors and path failures across the SAN for any hosts that access logical units through the offline device. These errors are often difficult to diagnose and may cause unnecessary corrective actions to occur. It is instead preferred to take the device offline with respect to an enterprise before the device is placed in an offline state. By taking a device offline with respect to an enterprise, such errors are avoided because data is rerouted to avoid the offline device. However, typical path management software does not provide functionality for disabling devices with respect to an entire enterprise.
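The preference described above, disabling a device with respect to every host before the device is physically taken offline, may be sketched as follows. This is an illustrative sketch only: HostPaths and disable_device are hypothetical names, each path is reduced to an assumed (host bus adapter, port) pair, and nothing here reflects the behavior of existing path management software.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class HostPaths:
    """Hypothetical per-host view: (hba, port) pair -> path enabled?"""
    name: str
    paths: Dict[Tuple[str, str], bool] = field(default_factory=dict)


def disable_device(hosts: List[HostPaths], device: str) -> List[str]:
    """Disable every path through `device` on every host.

    Returns the names of hosts left with no enabled path, so that an
    operator can see which hosts would lose access before the device
    is actually placed offline.
    """
    stranded = []
    for host in hosts:
        for key in host.paths:
            if device in key:
                host.paths[key] = False  # stop routing I/O over this path
        if not any(host.paths.values()):
            stranded.append(host.name)
    return stranded


# Illustrative enterprise with two hosts; only h1 has a redundant path.
h1 = HostPaths("h1", {("hba0", "portA"): True, ("hba1", "portB"): True})
h2 = HostPaths("h2", {("hba0", "portA"): True})
stranded = disable_device([h1, h2], "portA")
```

In this sketch, host h1 continues to use its path through portB, while h2 is reported as having no remaining path. Reporting such hosts before maintenance begins is one possible design choice, assumed here for illustration.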
In addition, devices in a SAN may fail unexpectedly, causing disruptions throughout an enterprise that would be avoidable if, upon the failure, path management software could disable the failed device. As with planned outages, an unplanned device outage will cause input/output errors and path failures across a SAN for any hosts that access logical units through the failed device. It is instead preferred to detect the device failure and take the device offline with respect to an enterprise. Typical path management software, however, does not provide such functionality.
As is evident from the foregoing discussion, conventional path management software and techniques are limited and suffer from several drawbacks. Therefore, there is a need for improved systems and methods for managing paths in a SAN in response to planned or unplanned device outages.