1. Field of the Invention
This invention relates to storage management and, more particularly, to verification of data and storage resources in a multi-server networked environment.
2. Description of the Related Art
Enterprise computing environments are increasingly using configurations such as clustering, Storage Area Networks (SANs), and other centralized storage mechanisms to simplify storage, improve availability, and handle escalating demands for data and applications. Clustering may be defined as the use of multiple computers (e.g., PCs or UNIX workstations), multiple storage devices, and redundant interconnections to form what appears to external users as a single and highly available system. Clustering may be used for load balancing and parallel processing as well as for high availability.
The storage area network (SAN) model places storage on its own dedicated network. This dedicated network most commonly uses Fibre Channel or IP technology as a versatile, high-speed transport. The SAN includes one or more hosts that provide a point of interface with LAN users, as well as (in the case of large SANs) one or more fabric switches, SAN hubs, and other devices to accommodate a large number of storage devices. The hardware (e.g., fabric switches, hubs, bridges, routers, and cables) that connects workstations and servers to storage devices in a SAN is referred to as a “fabric.” The SAN fabric may enable connectivity between a wide range of servers and storage devices.
Of course, all storage approaches, including cluster-based and SAN-based solutions, are susceptible to failure. When a host crashes, minimizing downtime is critical. Although failover techniques may permit a second host to assume the duties of the crashed host, the storage resources previously associated with the crashed host may not be immediately available due to the need for verification and/or recovery. Using existing approaches, large storage resources may demand unacceptably long times for verification and recovery operations. In some cases, for example, an entire file system may be offline for hours while undergoing a full file system consistency check (“fsck”) operation, and the host performing the operation may likewise be limited in its availability or capabilities. It is therefore desirable to improve the performance and availability of storage resources and servers associated with verification and/or recovery operations.