1. Field of the Invention
The invention relates to data storage computer systems. Specifically, the invention relates to apparatus, systems, and methods for facilitating management of logical nodes through a single management module.
2. Description of the Related Art
Computer and information technology continues to progress and grow in its capabilities and complexity. In particular, data storage systems continue to evolve to meet the increasing demands for reliability, availability, and serviceability of the physical data storage system and its hardware, software, and various other components. Data storage systems often handle mission critical data. Consequently, data storage systems are expected to remain on-line and available according to a 24/7 schedule. Furthermore, data storage systems are expected to handle power and service outages, hardware and software failures, and even routine system maintenance without significantly compromising the reliability and availability to handle data Input/Output (I/O) from hosts.
FIG. 1 illustrates a conventional data storage system 100. The system 100 includes one or more hosts 102 connected to a storage subsystem 104 by a network 106 such as a Storage Area Network (SAN) 106. The host 102 communicates data I/O to the storage subsystem 104. Hosts 102 are well known in the art and comprise any computer system configured to communicate data I/O to the storage subsystem 104.
One example of a storage subsystem 104 suitable for use with the present invention is an IBM Enterprise Storage Server® available from International Business Machines Corporation (IBM) of Armonk, N.Y. To provide reliability, availability, and redundancy, the storage subsystem 104 includes a plurality of host adapters (not shown) that connect to the SAN 106 over separate channels. The host adapters may support high speed communication protocols such as Fibre Channel. Of course, various other host adapters may be used to support other protocols including, but not limited to, Internet Small Computer System Interface (iSCSI), Fibre Channel over IP (FCIP), Enterprise Systems Connection (ESCON), InfiniBand, and Ethernet. The storage subsystem 104 stores and retrieves data using one or more mass storage devices 108 such as, but not limited to, Direct Access Storage Devices, tape storage devices, and the like.
As hardware costs have gone down, data storage systems 100 have become more complex due to inclusion of redundant hardware and hardware subsystems. Often, the hardware components are highly susceptible to failure. Consequently, the storage subsystem 104 may include redundant processors, electronic memory devices, host adapters, and the like.
Typically, to make most productive use of the redundant hardware, the hardware is specifically allocated or shared between a plurality of logical nodes 110. A logical node 110 represents an allocation of the computing hardware resources of the storage subsystem 104 such that each logical node 110 is capable of executing an Operating System (OS) 112 independent of another logical node 110. In addition, each logical node 110 operates an independent set of applications 114. The logical nodes 110 appear as separate physical computing systems to the host 102.
A coordination module 116, also known as a Hypervisor (PHYP) 116, coordinates use of dedicated and shared hardware resources between two or more defined logical nodes 110. The PHYP 116 may be implemented in firmware on a dedicated processor. Typically, the logical nodes 110 share memory. The PHYP 116 may ensure that logical nodes 110 do not access inappropriate sections of memory.
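The memory check described above can be pictured with a minimal sketch. It assumes that each logical node is assigned one or more address ranges and that any access outside those ranges is refused. The class name `Coordinator`, its methods, and the node names are hypothetical illustrations and do not describe the actual PHYP 116 firmware.

```python
class Coordinator:
    """Hypothetical stand-in for a coordination module that tracks memory ranges."""

    def __init__(self):
        # Maps a node name to the list of address ranges assigned to it.
        self._ranges = {}

    def assign(self, node, start, length):
        """Assign the addresses [start, start + length) to the given node."""
        self._ranges.setdefault(node, []).append(range(start, start + length))

    def may_access(self, node, address):
        """Return True only if the address lies in a range assigned to the node."""
        return any(address in r for r in self._ranges.get(node, []))


coordinator = Coordinator()
coordinator.assign("node1", 0, 1024)
coordinator.assign("node2", 1024, 1024)
```

In this sketch an unknown node has no ranges and is therefore refused every access.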
Separating the storage subsystem 104 into a plurality of logical nodes 110 allows for higher reliability. If one logical node 110 crashes/fails due to a software or hardware problem, one or more other logical nodes 110 may be used to continue or restart the tasks that were being performed by the crashed logical node 110.
Management and control of the plurality of logical nodes 110 is a challenge. Any management, control, maintenance, monitoring, troubleshooting or service operation should be coordinated with the constant I/O processing so that the 24/7 availability of the storage subsystem 104 is not compromised. Typically, a management console 118 manages the storage subsystem 104 via control communications (referred to herein as “out-of-band communication”) separate from the I/O channels.
The storage subsystem 104 may include a network adapter, such as an Ethernet card, for out-of-band communications. The management console 118 may comprise a separate computer system such as a workstation executing a separate OS and set of management applications. The management console 118 allows an administrator to interface with the PHYP 116 to start (create), stop, and configure logical nodes 110.
Unfortunately, the management capabilities of the management console 118 are severely limited. In particular, the logical nodes 110 are completely independent and unrelated. Consequently, to manage a plurality of logical nodes 110, for example to set a storage space quota, an administrator must log in to each node 110 separately, make the change, and then log out. This process is very tedious and can lead to errors as the number of logical nodes 110 involved in the operation increases. Such management tasks are complicated by the fact that different OSes 112 and/or storage applications 114 may reside on each node 110. Consequently, administrators may have to use different command sets and different parameters for each node 110.
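The repetitive procedure described above can be illustrated with a short sketch. It assumes two hypothetical operating systems, "os_a" and "os_b", whose quota command syntax is invented here purely for illustration; the `LogicalNode` class and the `set_quota_manually` function are likewise hypothetical and do not correspond to any real command set.

```python
from dataclasses import dataclass


@dataclass
class LogicalNode:
    """Hypothetical model of one independently managed logical node."""
    name: str
    os_name: str
    quota_gb: int = 0
    session_open: bool = False

    def login(self):
        self.session_open = True

    def logout(self):
        self.session_open = False


# Invented per-OS command syntax: each OS expects a different command and parameters.
QUOTA_COMMANDS = {
    "os_a": "setquota --size {gb}G",
    "os_b": "quota /limit:{gb}",
}


def set_quota_manually(nodes, gb):
    """Repeat the log in, change, log out cycle once for every node."""
    issued = []
    for node in nodes:
        node.login()
        command = QUOTA_COMMANDS[node.os_name].format(gb=gb)
        node.quota_gb = gb
        issued.append((node.name, command))
        node.logout()
    return issued


nodes = [LogicalNode("node1", "os_a"), LogicalNode("node2", "os_b")]
issued = set_quota_manually(nodes, 500)
```

The number of manual steps grows linearly with the number of nodes, and the administrator must remember a different command form for each OS.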
The repetitive nature of such a change is exacerbated in a storage subsystem 104 where nodes 110 may be highly uniform and may differ in configuration by something as minor as a name. Managing the nodes 110 separately may require significant time and expense. In addition, an administrator may be the only one who knows that two similar nodes 110 are to be similarly configured because there is no internal relationship between the nodes 110.
Furthermore, the management console 118 provides very few management commands. Typically, the management console 118 is limited to commands that start (create), stop, and configure logical nodes 110 themselves. The management console 118 fails to allow an administrator to send management commands to the OS 112 or applications 114 of one or more logical nodes 110. Instead, the administrator must log in to each node 110 and manually shut down the applications 114 and then the OS 112. Then, the administrator can stop the node 110 in order to perform some maintenance operation. The management console 118 also fails to send management commands to more than one node 110 at a time regardless of whether two or more nodes 110 share a relationship.
The management console 118 conventionally only controls nodes of a single storage subsystem 104. To control multiple storage subsystems 104, which is common in modern enterprise systems, the administrator must log in to each node 110 separately and may have to physically move to a different management console 118 machine to complete the management operations. The high number of nodes 110 that must each be individually managed limits the administrator's effectiveness. In addition, the independent nodes 110 make automated tools for management more difficult to implement and configure.
From the foregoing discussion, it should be apparent that a need exists for an apparatus, system, and method for facilitating management of logical nodes through a single management module. Beneficially, such an apparatus, system, and method would provide a common command set for management and control of disparate nodes 110 as well as the OSes 112 and applications 114 of the nodes 110. In addition, the apparatus, system, and method would support relationships between nodes 110 such that management commands sent to one node 110 are automatically implemented on all nodes sharing that relationship. Furthermore, the apparatus, system, and method would support management of a plurality of hardware platforms, such as, for example, storage subsystems 104, from a single management module. Each platform may include one or more logical nodes 110.
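The desired behavior can be pictured with a minimal sketch, which is an illustration under stated assumptions and not the claimed implementation. It assumes that each node registers a translator that converts a common command into that node's native form, and that related nodes are kept in a shared group so a command sent to one member reaches every member. The class `ManagementModule`, its methods, the "platform/node" naming, and the native command strings are all hypothetical.

```python
class ManagementModule:
    """Hypothetical single management module with a common command set."""

    def __init__(self):
        self._groups = {}       # node name -> set of related node names
        self._translators = {}  # node name -> function(command, params) -> native form
        self.log = []           # (node name, native command) pairs, in order issued

    def register(self, name, translate):
        """Add a node; initially it is related only to itself."""
        self._groups[name] = {name}
        self._translators[name] = translate

    def relate(self, *names):
        """Merge the groups of the named nodes into one shared relationship."""
        group = set()
        for name in names:
            group |= self._groups[name]
        for name in group:
            self._groups[name] = group

    def send(self, name, command, **params):
        """Apply a common command to the named node and every related node."""
        targets = sorted(self._groups[name])
        for target in targets:
            native = self._translators[target](command, params)
            self.log.append((target, native))
        return targets


module = ManagementModule()
module.register("a/node1", lambda c, p: f"os_a:{c}:{p['gb']}")
module.register("a/node2", lambda c, p: f"os_b:{c}:{p['gb']}")
module.register("b/node1", lambda c, p: f"os_a:{c}:{p['gb']}")
module.relate("a/node1", "a/node2")
targets = module.send("a/node1", "set_quota", gb=500)
```

Here nodes on two hypothetical platforms, "a" and "b", are registered with the same module, and a single `send` call replaces the separate log in, change, and log out cycle on each related node.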