Organizations record incidents in which a device or a service becomes unresponsive, inhibited, unreachable, or the like, which can hamper the operations of the organization. In most cases, actions such as root cause determination, rebooting the computer system, and/or collecting system dump data are taken after an incident has occurred.
In a storage area network (SAN), a storage resource management (SRM) program collects data from various devices and/or components of a computer system. The collected data includes current and historical performance metrics and device details. The SRM program raises alerts based on threshold values set for a particular metric for a particular device. An alert is raised once the actual value of the metric for the particular device or system component meets the threshold condition set by a user.
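The threshold-based alerting described above can be illustrated with a minimal sketch. It is not taken from any particular SRM product: the names `Threshold`, `Sample`, and `raise_alerts`, the comparator strings, and the device and metric names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
import operator

# Hypothetical set of comparison operators a user might pick for a threshold.
_COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass(frozen=True)
class Threshold:
    """A user-defined threshold for one metric on one device (assumed shape)."""
    device: str
    metric: str
    comparator: str
    limit: float


@dataclass(frozen=True)
class Sample:
    """One collected value of a metric for a device (assumed shape)."""
    device: str
    metric: str
    value: float


def raise_alerts(samples, thresholds):
    """Return one alert message for each sample that meets a matching threshold."""
    alerts = []
    for sample in samples:
        for threshold in thresholds:
            # A threshold applies only to its own device and metric.
            if (sample.device, sample.metric) != (threshold.device, threshold.metric):
                continue
            compare = _COMPARATORS[threshold.comparator]
            if compare(sample.value, threshold.limit):
                alerts.append(
                    f"ALERT {sample.device} {sample.metric}={sample.value} "
                    f"{threshold.comparator} {threshold.limit}"
                )
    return alerts


# Illustrative data only: device and metric names are made up.
thresholds = [Threshold("switch-01", "port_utilization_pct", ">=", 90.0)]
samples = [
    Sample("switch-01", "port_utilization_pct", 72.5),
    Sample("switch-01", "port_utilization_pct", 90.0),
    Sample("array-07", "port_utilization_pct", 95.0),
]
alerts = raise_alerts(samples, thresholds)
```

In this sketch only the second sample produces an alert, because it is the only one whose device and metric match the threshold and whose value meets the condition. The alert is produced at the moment the condition is met, not before, which is the behavior the passage describes.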