1. Field of the Invention
This invention relates to Storage Area Network (SAN) management and more particularly relates to determining a set of SAN components for performance monitoring.
2. Description of the Related Art
Storage performance management and analysis has led to significant improvements in computer hardware, hardware controllers, and software. Storage performance management and analysis facilitates identification of data I/O bottlenecks and points of underutilization. In particular, monitoring and analyzing hardware devices, firmware, and hardware control software has led to great technological advances. One such advance is the design, standardization, and adoption of a Storage Area Network (SAN).
SANs are often used in large organizations such as enterprise environments having many servers and many storage devices. A SAN is an inter-networked set of hardware devices that enables storage devices such as disk drives, tape drives, optical drives, and the like to exchange storage data with end-user applications and/or servers dedicated to storing and retrieving data. A typical SAN may include a complex network of Host Bus Adapters, Ports, a maze of Switches (often connected via Inter-Switch Links (ISLs)), Virtualization solutions, Storage Subsystem Ports, and Storage Subsystem Volumes between the application that originates the data and the hardware storage device actually storing the data.
Storage data travels from an application on one end of the SAN to a storage device on the other end of the SAN along a data storage path. Typically, due to the complexity of the SAN, the data storage path varies with each I/O as the data is routed across the SAN. Along the data storage path, a variety of SAN components and parameters can affect how efficiently and successfully the storage data travels through the SAN. Monitoring the performance of the SAN as a whole permits actions to be taken to avoid bottlenecks of storage data or underutilization of SAN components. Such monitoring cannot be performed at the application level because there is no single application that controls all the entry and exit points to the SAN. Consequently, the SAN components are monitored at the firmware, communication port, and hardware device levels.
Unfortunately, monitoring SAN components at such a low level becomes difficult due to the high number of SAN components and the high volumes of storage monitoring data that are generated by these SAN components. All of the monitoring data that is collected is stored so that analysis and troubleshooting queries can be performed. However, the vast majority of the monitoring data collected may not even be relevant to a particular bottleneck or storage management problem being researched. In addition, SAN component monitoring should be performed in a manner that creates minimal interference with storage I/O traveling through the SAN. The more SAN storage performance data collected, the higher the impact of the performance monitoring on overall SAN performance. Finally, SAN component performance monitoring typically requires monitoring for a period of days so that error conditions can be identified as problems requiring action rather than as anomalies.
Even if all the monitoring data produced by monitoring all the SAN components in a typical SAN could be collected and stored, analyzing such high volumes of data is difficult. The complexity of the SAN and its constituent components makes it difficult for analysts to determine cause-and-effect relationships such that action can be taken to remedy a problem. Part of the difficulty lies in distinguishing normal performance data from abnormal performance data. Often monitoring thresholds are set and crossed while the context of the operation indicates that the activity crossing the threshold is normal. Monitoring at such a low level often means that contextual information relating to a monitored event is lost. This further complicates the performance monitoring task on a SAN.
Therefore, operators and managers of the SAN must be selective in determining which SAN components to monitor. The results of such manually defined SAN component monitoring are consequently suspect, because some SAN component that played a role in the performance results may have been missed when the set of SAN components to be monitored was defined.
In addition, SAN configurations are typically very dynamic. Hardware devices and software that are connected to, or are members of, the SAN may change rapidly from day to day. Such a dynamic environment requires that a manually defined set of SAN components for monitoring be constantly updated. Even storage management systems that automate detection of SAN components suffer from an inability to collect enough data, from enough SAN components, for a sufficient time period, to make analysis and problem resolution feasible.
From the foregoing discussion, it should be apparent that a need exists for an apparatus, system, and method that dynamically determines a set of storage area network components to be included in storage performance monitoring. Beneficially, such an apparatus, system, and method would dynamically adjust the members of the set of SAN components being monitored and/or the monitoring attributes associated with each SAN component in the set. Such an apparatus, system, and method would determine, based on historical monitoring information, which members of the set merit a closer analysis to identify problem areas.
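By way of illustration only, the idea of using historical monitoring information to decide which SAN components merit closer analysis can be sketched as follows. This is a minimal sketch under stated assumptions and does not describe any particular embodiment: the function name select_monitored_set, the component identifiers, the utilization samples, and the threshold and min_fraction parameters are all hypothetical and do not appear in the foregoing discussion.

```python
from collections import defaultdict


def select_monitored_set(history, threshold=0.8, min_fraction=0.5):
    """Return the components whose historical samples suggest a closer look.

    history      -- iterable of (component_id, utilization) samples, where
                    utilization is an assumed value between 0.0 and 1.0
    threshold    -- hypothetical utilization level treated as "high"
    min_fraction -- hypothetical fraction of a component's samples that must
                    exceed the threshold, so that a single spike (an anomaly)
                    does not by itself select the component
    """
    totals = defaultdict(int)  # samples seen per component
    over = defaultdict(int)    # samples above the threshold per component
    for component_id, utilization in history:
        totals[component_id] += 1
        if utilization > threshold:
            over[component_id] += 1
    return {c for c in totals if over[c] / totals[c] >= min_fraction}


# Hypothetical samples collected over some monitoring period.
samples = [
    ("switch-1", 0.95), ("switch-1", 0.90), ("switch-1", 0.40),
    ("hba-7", 0.10), ("hba-7", 0.85), ("hba-7", 0.20),
    ("volume-3", 0.30),
]
monitored = select_monitored_set(samples)
```

In this sketch, a component is selected only when a sufficient share of its samples exceeds the threshold, which reflects the point made above that sustained conditions should be distinguished from isolated anomalies. Re-running the selection as new samples arrive would adjust the membership of the set over time.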