Enterprise storage systems store data in large-scale environments and differ from consumer storage systems in both the size of the environment and the types of technologies that store and manage the data. A large-scale environment that stores data is typically referred to as a storage area network (SAN). SANs are commonly used in enterprise storage systems to transfer data between computer systems and storage devices. A typical SAN provides a communication infrastructure that includes physical connections between computer systems and storage devices, together with a management layer that organizes the connections, storage devices, and computer systems.
In a SAN environment, computer systems, typically referred to as hosts, connect to the SAN via one or more host bus adapters. The SAN itself may include thousands of different inter-related logical and physical entities. In the case of a Fibre Channel SAN, these entities, which make up the connections between hosts and storage devices, may include Fibre Channel host bus adapters, Fibre Channel switches, Fibre Channel routers, and the like. The entities may be physically connected through the use of twisted-pair copper wire, optical fiber, or any other means of signal transmission.
Storage devices may include multiple disk drives that combine to form a disk array. A typical disk array includes a disk array controller, a cache, disk enclosures, and a power supply. Examples of disk arrays include the SYMMETRIX Integrated Cache Disk Array System and the CLARIION Disk Array System, both available from EMC Corporation of Hopkinton, Mass. A disk array controller is a piece of hardware that provides storage services to computer systems that access the disk array. The disk array controller may attach to a number of disk drives that are located in the disk enclosures. For example, the disk drives may be organized into RAID groups for efficient performance. RAID (redundant array of inexpensive disks) is a scheme in which multiple disk drives share or replicate data among themselves. Accordingly, in a RAID system, instead of identifying several different hard drives, an operating system will identify all of the disk drives as if they were a single disk drive.
Disk array controllers connect to a SAN via a port. A port serves as an interface between the disk array controller and other devices, such as the hosts, in the SAN. Each disk array controller typically includes two or more ports. Disk array controllers may communicate with other devices using various protocols, such as the SCSI (Small Computer System Interface) command protocol over a Fibre Channel link to the SAN. In the SCSI command protocol, each device is assigned a unique numerical identifier, which is referred to as a logical unit number (LUN). Further, communication using the SCSI protocol is said to occur between an “initiator” (e.g., a host) and a “target” (e.g., a storage device) via a path. For example, a path may include a host bus adapter, an associated SCSI bus or Fibre Channel cabling, and a single port of a disk array controller.
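The notion of a path described above can be sketched as a small data structure. This is only an illustrative model, not part of any path management product: the class name `Path`, its fields, and the entity names such as `hba0` and `ctrl1_port0` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Path:
    """Illustrative model of one initiator-to-target route through a SAN."""
    hba: str               # host bus adapter on the initiator (host) side
    link: Tuple[str, ...]  # intermediate entities, e.g. cabling or switches
    port: str              # single port of a disk array controller (target side)

    def entities(self) -> Tuple[str, ...]:
        """Return every entity the path traverses, from host to target."""
        return (self.hba, *self.link, self.port)

    def endpoints(self) -> Tuple[str, str]:
        """Return the two end points of the path."""
        return (self.hba, self.port)


# Hypothetical path: one HBA, one Fibre Channel switch, one controller port.
path = Path(hba="hba0", link=("switch_a",), port="ctrl1_port0")
print(path.entities())
print(path.endpoints())
```

Keeping the intermediate entities separate from the two end points makes it easy to ask both "which paths share an end point?" and "which paths cross a given entity?".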
Paths are managed by path management software. An example is the EMC POWERPATH system developed by EMC Corporation of Hopkinton, Mass. Path management software is a host-based software solution that is used to manage SANs and, among other things, can detect load imbalances for disk array controllers in a SAN and can select alternate paths through which to route data. In present systems, the path management software selects an alternate path only after detecting that a first path has failed. Path failure may result, for example, from the complete or partial failure of components within the SAN. However, in a SAN that may comprise thousands of entities, the path management software is unable to detect the root cause of the path failure. Thus, in selecting an alternate path, the path management software simply avoids all paths that share an end point with the failed path. However, many paths that share no end point with the failed path may still include the failed entity, so such a path may be selected even though it is affected by the same failure. Therefore, this method of alternate path selection is inefficient.
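The limitation of end-point-based selection can be illustrated with a minimal sketch. It assumes that each path is a tuple of entity names running from host bus adapter to controller port, and that "common end point" means sharing either end with the failed path. The function names and the four-path topology are hypothetical and do not reflect how any particular path management product works.

```python
def shares_endpoint(path, failed):
    """True if the path has the same first or last entity as the failed path."""
    return path[0] == failed[0] or path[-1] == failed[-1]


def endpoint_based_alternates(paths, failed):
    """Heuristic described above: avoid every path sharing an end point."""
    return [p for p in paths if p != failed and not shares_endpoint(p, failed)]


def root_cause_alternates(paths, failed, failed_entity):
    """Selection that knows the root cause: avoid paths crossing that entity."""
    return [p for p in paths if p != failed and failed_entity not in p]


# Hypothetical topology: two HBAs, two switches, two controller ports.
paths = [
    ("hba0", "switch_a", "port0"),  # the path that failed
    ("hba0", "switch_b", "port1"),  # shares hba0, but avoids switch_a
    ("hba1", "switch_a", "port1"),  # no shared end point, but crosses switch_a
    ("hba1", "switch_b", "port1"),  # no shared end point, avoids switch_a
]
failed = paths[0]

# Assume the actual root cause is a fault in switch_a.
by_endpoint = endpoint_based_alternates(paths, failed)
by_root_cause = root_cause_alternates(paths, failed, "switch_a")
print(by_endpoint)
print(by_root_cause)
```

In this toy topology the end-point heuristic rejects a usable path (through `switch_b` from `hba0`) and still offers a path that crosses the faulty switch, whereas selection based on the root cause keeps exactly the paths that avoid it.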
In view of the foregoing, what is needed is a system and method capable of using the path configuration information of path management software, the topology of a SAN, and an identified root cause of a path failure when selecting an alternate path. The collection and coordination of path configuration information, topology information, and the detected root cause of a path failure, together with its impact on data paths, may be centralized in one entity, such as a path impact analysis server.