Servers are computers that provide services to other computers. At a minimum, a server includes a data processor, memory, and input/output (I/O) devices. However, to meet the demand for services effectively, some servers include many instances of each. For convenient upgrading and repair, the multiple processors can be arranged in cells, e.g., physical modules with their own processor(s), memory, and I/O devices. To reduce downtime due to repair or upgrading, the cells can be grouped into hardware partitions, each of which can be repaired or upgraded without disturbing the functioning of the other hardware partitions.
To minimize downtime when a partition is repaired or upgraded, hardware partitions can be arranged in clusters, so that the functions of a partition being repaired or upgraded can be transferred to another partition in the cluster, at least until the repair or upgrade is complete. Clusters can include partitions of different servers so that even if one server fails completely, a partition's functions can be continued by other partitions in the cluster. Hardware partitions can also be arranged in resource domains, which permit computing resources to be shifted among included hardware partitions.
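The transfer of functions within a cluster can be illustrated with a minimal sketch. The `Partition` class, the `fail_over` function, and the partition and server names below are hypothetical and do not correspond to any particular cluster management tool; the sketch only assumes that a cluster is a list of partitions, each residing on a named server, and that a target on a different server is preferred.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    # Hypothetical model of a hardware partition in a cluster.
    name: str
    server: str
    functions: list = field(default_factory=list)
    online: bool = True


def fail_over(cluster, failing_name):
    """Move the functions of one partition to another online partition.

    A target on a different server is preferred, so the functions
    survive even if the failing partition's whole server goes down.
    """
    failing = next(p for p in cluster if p.name == failing_name)
    candidates = [p for p in cluster if p is not failing and p.online]
    if not candidates:
        raise RuntimeError("no online partition available in the cluster")
    # False sorts before True, so partitions on a different server come first.
    target = sorted(candidates, key=lambda p: p.server == failing.server)[0]
    target.functions.extend(failing.functions)
    failing.functions = []
    failing.online = False
    return target


cluster = [
    Partition("hp1", "server-a", ["database"]),
    Partition("hp2", "server-a", ["web"]),
    Partition("hp3", "server-b", ["batch"]),
]
target = fail_over(cluster, "hp1")
```

In this sketch, taking "hp1" offline moves its functions to "hp3" rather than "hp2", because "hp3" resides on a different server.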
Hardware partitions can be further divided into virtual partitions, each of which runs software independently of other virtual partitions in a hardware partition. However, the hardware on which that software runs cannot be changed without affecting the software on the virtual partition. Hardware and virtual partitions can be further divided into resource partitions. For example, one application program may be assigned to a first resource partition that is allotted 60% of the computing resources assigned to the incorporating virtual partition, while another application or process may be assigned to a second resource partition that is allotted the remaining 40% of the computing resources assigned to the incorporating virtual partition. There are also virtual machines, which are software that allows one operating system to run on top of another operating system (instead of directly on hardware).
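The 60%/40% division among resource partitions described above can be sketched as a simple proportional allocation. The `allocate` function, the partition names, and the figure of 10 resource units are assumptions made for illustration only.

```python
def allocate(total_units, shares):
    """Split a virtual partition's resources by percentage share.

    total_units: resources assigned to the incorporating virtual partition
                 (hypothetical unit, e.g., processor cores).
    shares:      mapping of resource partition name to its percentage.
    """
    if sum(shares.values()) != 100:
        raise ValueError("shares must sum to 100 percent")
    return {name: total_units * pct / 100 for name, pct in shares.items()}


allocation = allocate(
    10,
    {"resource_partition_1": 60, "resource_partition_2": 40},
)
```

With 10 units assigned to the virtual partition, the first resource partition receives 6 units and the second receives 4.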
Each server entity, e.g., hard partition, virtual partition, cluster, etc., corresponds to a “technology” for which there is a dedicated management tool, e.g., a hard partition management tool, a virtual partition management tool, a cluster management tool, etc. Administrators dealing with complex systems employing multiple technologies often go beyond these tools to see how the technologies interact. For example, some administrators employ spreadsheets or other forms of documentation to piece together the disparate technology elements of their system.
For an example of the peer-to-peer aspect of coordinating technologies, consider Serviceguard (available from Hewlett-Packard Company), which allows multiple servers to be joined together in a cluster to provide highly available applications. In this case, two hardware partitions can be joined in a cluster; for applications to be highly available, the partitions should be on different servers, preferably located remotely from each other, so that a single cause of failure (e.g., a data-center power or cooling failure) is unlikely to shut both partitions down.
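The placement consideration above, i.e., different servers and preferably different locations, is the kind of cross-technology check that an administrator might otherwise track by hand. A minimal sketch follows; it does not use or represent any Serviceguard interface, and the `placement_report` function, the partition names, and the site names are hypothetical.

```python
def placement_report(placements):
    """Check whether clustered partitions share a server or a site.

    placements: mapping of partition name to a (server, site) tuple.
    Returns a dict of booleans; True means no two partitions share
    that kind of location.
    """
    servers = [server for server, _site in placements.values()]
    sites = [site for _server, site in placements.values()]
    return {
        "distinct_servers": len(set(servers)) == len(servers),
        "distinct_sites": len(set(sites)) == len(sites),
    }


report = placement_report(
    {
        "hp1": ("server-a", "datacenter-1"),
        "hp2": ("server-b", "datacenter-1"),
    }
)
```

Here the two partitions are on different servers but in the same data center, so a single power or cooling failure at that site could still shut both down.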