1. Field of the Invention
The present invention generally relates to disk units for storage pools. More particularly, the present invention relates to configuration and accessibility of disk units for switchable storage pools.
2. Description of the Related Art
In this new era of electronic commerce, server/systems must be continuously available to the thousands of unknown and unforgiving Internet users. Even short periods of server/system unavailability give potential customers the excuse to point and click elsewhere. In the past, most disaster recovery focused on unscheduled downtime due to, for example, power outages, natural disasters, site disasters, system hardware or software errors, application malfunctions and deliberate acts of sabotage. Typically, the solution to unscheduled downtime is to stop the business and utilize backup systems from a remote recovery site. The business interruption may be many hours or even days.
The emerging requirement in electronic commerce businesses today is continuous system availability and protection from scheduled downtimes. Scheduled downtimes are becoming more problematic than the remote chance of a disaster. During a scheduled downtime or outage, the system (e.g., server) is deliberately made unavailable to users (e.g., clients). Examples of scheduled downtimes or outages include installation of new operating system or application software releases; system hardware upgrades, additions, removals, and maintenance; system backups or saves; site maintenance; and application of program temporary fixes (PTFs). A system that is said to have “continuous availability” is defined as a system having no scheduled or unscheduled outages.
One method for improving and enhancing system availability utilizes a clustered system. A cluster is a collection of complete systems that cooperate and interoperate to provide a single, unified computing capability. A clustered system provides failover and switchover capabilities for systems that are used as database servers or application servers. If a system outage or a site loss occurs, the functions that are provided on a clustered primary server system can be switched over (or failed over) to one or more designated backup systems that contain a current copy (replica) of the resources. The failover can be automatic for unscheduled outages. For scheduled outages, a switchover may be scheduled with the scheduled outage or manually initiated.
In the event of a failover or a switchover, Cluster Resource Services (CRS), which may be part of the server operating system and running on all systems, provides a switchover from the primary system to the backup system. This switchover causes minimal impact to the end user or applications that are running on a server system. Data requests are automatically rerouted to the backup (i.e., new primary) system. Cluster Resource Services also provides the means to automatically re-introduce or rejoin systems to the cluster, and restore the operational capabilities of the rejoined systems.
Data may be stored in disk pools connected to one or more server systems. A disk pool is a set of disk units, such as a tower of disk units or a redundant array of independent disks (RAID). A disk pool is switched from a primary system to a backup system by switching ownership of the hardware entity containing the disk units of the disk pool from the primary system to the backup system. However, the disk units in the disk pool must be physically located in correct hardware entities (e.g., a tower which both the primary and backup systems can access), and must follow many configuration and hardware placement rules. A user must follow these configuration and hardware placement rules when selecting disk units for the disk pool and when selecting primary and backup systems for accessing the disk pool. Otherwise, the disk pool may not be available to the primary system and/or the backup system when a switchover is attempted or when a failover occurs. The user must also follow these rules when changing the hardware configuration. The user thus bears the responsibility to understand and follow the configuration and hardware placement rules in order to correctly configure the disk units and the clustered system. However, due to the complexity of these rules, the user may be forced into trial and error, resulting in unavailable disk units when a switchover occurs.
Therefore, there is a need for a system and method for ensuring that a set of disks (i.e., a disk pool) is accessible to a primary system and one or more backup systems for the disk pool. Furthermore, there is a need for ensuring that valid disk units are selected for configuration in a disk pool.
Embodiments of the invention generally provide methods and apparatuses for ensuring that a set of disks (i.e., a disk pool) is accessible to a primary system and one or more backup systems for the disk pool. Also, embodiments of the invention provide methods and apparatuses for ensuring that valid disk units are selected for configuration in a disk pool.
One embodiment provides a method for ensuring accessibility of one or more disk units by a system, comprising: configuring a storage pool for the system; validating availability of the one or more disk units for the storage pool; and selecting one or more valid disk units for the storage pool. The method may further comprise ranking availability of each disk unit for the storage pool and selecting one or more valid disk units for the storage pool according to availability ranking. Another embodiment provides a signal bearing medium, comprising a program which, when executed by a processor, implements the foregoing method.
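By way of illustration only, the validating, ranking, and selecting steps recited above might be sketched in Python as follows. The `DiskUnit` fields, the two-level ranking scheme, and all function names are assumptions introduced for this sketch; the embodiments do not prescribe particular validity criteria or a particular ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskUnit:
    name: str
    tower: str        # hardware entity holding the unit (hypothetical field)
    configured: bool  # True if the unit already belongs to some pool


def availability_rank(unit, reachable_towers, pool_towers):
    """Return a rank (lower is better), or None if the unit is invalid.

    Assumed rules, for illustration only:
      - a unit already configured elsewhere is invalid;
      - a unit in a tower the system cannot reach is invalid;
      - rank 1: unit sits in a tower the pool already uses;
      - rank 2: unit sits in another reachable tower.
    """
    if unit.configured or unit.tower not in reachable_towers:
        return None
    return 1 if unit.tower in pool_towers else 2


def select_disk_units(candidates, reachable_towers, pool_towers, count):
    """Validate each candidate, rank the valid ones, and pick the best `count`."""
    ranked = []
    for unit in candidates:
        rank = availability_rank(unit, reachable_towers, pool_towers)
        if rank is not None:
            ranked.append((rank, unit.name, unit))
    # Sort by rank first, then by name so the result is deterministic.
    ranked.sort(key=lambda entry: (entry[0], entry[1]))
    return [unit for _, _, unit in ranked[:count]]


units = [
    DiskUnit("d1", "T1", False),
    DiskUnit("d2", "T2", False),
    DiskUnit("d3", "T1", True),   # already configured: invalid
    DiskUnit("d4", "T9", False),  # unreachable tower: invalid
]
selected = select_disk_units(units, {"T1", "T2"}, {"T2"}, 2)
print([u.name for u in selected])
```

In this sketch, invalid units are filtered out before ranking, so a unit that cannot be reached is never offered for selection, regardless of how many units are requested.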
In another embodiment, the method further comprises validating accessibility of disk units in the storage pool when adding a node to a clustered system. In yet another embodiment, when adding a switchable storage pool to the clustered system, the method further comprises verifying accessibility of each disk unit in the switchable storage pool by one or more nodes in the clustered system. In yet another embodiment, the method further comprises verifying that a switchable entity containing the switchable storage pool is not included in another clustered system. In yet another embodiment, the method further comprises validating switchability of the switchable storage pool when starting clustering.
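The accessibility and exclusivity checks described above can likewise be sketched, under stated assumptions, as simple set-membership tests. The `Node` and `SwitchablePool` types, the `entity_owner` mapping from a switchable entity to the cluster that includes it, and the function names are all hypothetical and serve only to illustrate the checks.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    accessible_entities: set  # switchable entities this node can reach


@dataclass
class SwitchablePool:
    name: str
    disk_entities: dict  # disk unit name -> switchable entity holding it


def inaccessible_units(pool, nodes):
    """Map each node name to the disk units of the pool it cannot reach."""
    problems = {}
    for node in nodes:
        missing = sorted(
            disk
            for disk, entity in pool.disk_entities.items()
            if entity not in node.accessible_entities
        )
        if missing:
            problems[node.name] = missing
    return problems


def entity_conflicts(pool, entity_owner, cluster_name):
    """List the pool's entities that are already included in another cluster."""
    entities = set(pool.disk_entities.values())
    return sorted(
        e for e in entities if entity_owner.get(e, cluster_name) != cluster_name
    )


def can_add_pool(pool, nodes, entity_owner, cluster_name):
    """True only if every node reaches every unit and no entity is claimed elsewhere."""
    return not inaccessible_units(pool, nodes) and not entity_conflicts(
        pool, entity_owner, cluster_name
    )


pool = SwitchablePool("pool1", {"d1": "T1", "d2": "T2"})
nodes = [Node("A", {"T1", "T2"}), Node("B", {"T1"})]
print(inaccessible_units(pool, nodes))
print(entity_conflicts(pool, {"T1": "other_cluster"}, "my_cluster"))
```

The same `inaccessible_units` check could be run when a node is added to the clustered system or when clustering is started, since in each case the question is whether every disk unit of the pool is reachable by the nodes concerned.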
Yet another embodiment provides a system, comprising: a primary system; a storage pool connected to the primary system; and a processor configured to validate availability of one or more disk units for the storage pool and select one or more valid disk units for the storage pool. The processor may be further configured to rank availability of each disk unit for the storage pool and select one or more valid disk units for the storage pool according to availability ranking. The system may be a clustered system, and the storage pool may be a switchable storage pool.