The present disclosure relates generally to computer systems and, in particular, to simulating a failure in a virtualization environment.
A shared computer system often concurrently supports a number of different guest operating systems by using virtual machines. Virtual machines can be in the form of virtual machine guests, logical partitions (LPARs), or other isolation techniques.
Virtual machines (VMs) fall into two major categories based on their use and degree of correspondence to any real machine. A system virtual machine provides a complete system platform which supports the execution of a complete operating system (OS). In contrast, a process virtual machine is designed to run a single program, which means that it supports a single process. An essential characteristic of a virtual machine is that the software running inside is limited to the resources and abstractions provided by the virtual machine; it cannot break out of its virtual world.
System virtual machines (sometimes called hardware virtual machines) allow multiplexing the underlying physical machine between different virtual machines, each running its own operating system. The software layer providing the virtualization is called a virtual machine monitor or hypervisor. A hypervisor can run on bare hardware (Type 1 or native VM) or on top of an operating system (Type 2 or hosted VM). The main advantages of system VMs are that multiple OS environments can co-exist on the same computer, in strong isolation from each other, and the virtual machine can provide an instruction set architecture (ISA) that is somewhat different from that of the real machine.
Multiple VMs, each running its own operating system (called a guest operating system), are frequently used in server consolidation, where different services that formerly ran on individual machines in order to avoid interference are instead run in separate VMs on the same physical machine. This use is frequently called quality-of-service isolation (QoS isolation). The desire to run multiple operating systems was the original motivation for virtual machines, as it allowed time-sharing a single computer between several single-tasking operating systems.
A shared computer system may also employ other containers executing discrete and unrelated tasks. In such a collaborative environment, where physical resources are shared, testing and workloads under development can be disrupted in non-obvious ways.
In some instances, two mainframes or other computers may monitor one another. System designers have attempted to develop elegant load-shifting techniques so that, if one mainframe determines that it or the other mainframe is about to crash, processing is not too adversely affected when that mainframe goes down. Like any development, these techniques have created additional challenges.
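By way of illustration only, the mutual monitoring and load shifting described above might be sketched as follows. This is a minimal sketch and not a description of any particular system: the heartbeat-timeout detection rule, the `Node` class, the helper functions, and the 3-second default timeout are all hypothetical and are introduced here solely for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one of two computers that monitor each other."""
    name: str
    workloads: list = field(default_factory=list)
    last_heartbeat: float = 0.0

    def beat(self, now: float) -> None:
        # Record the time at which this node last reported itself healthy.
        self.last_heartbeat = now


def peer_is_failing(peer: Node, now: float, timeout: float = 3.0) -> bool:
    # Assumed detection rule: a peer is treated as failing once its most
    # recent heartbeat is older than `timeout` seconds.
    return (now - peer.last_heartbeat) > timeout


def shift_load(failing: Node, survivor: Node) -> list:
    # Move every workload from the failing node to the surviving node.
    moved = list(failing.workloads)
    survivor.workloads.extend(moved)
    failing.workloads.clear()
    return moved


def monitor_step(observer: Node, peer: Node, now: float, timeout: float = 3.0) -> list:
    # One monitoring pass: if the peer looks unhealthy, take over its work.
    if peer_is_failing(peer, now, timeout):
        return shift_load(peer, observer)
    return []
```

For example, if node A last reported at time 10.0 and node B at time 5.0, a call to `monitor_step(a, b, now=10.0)` would treat B as failing and move B's workloads onto A. A sketch of this kind also suggests why such techniques are difficult to exercise during development: the shift occurs only when a failure, or something resembling one, is actually observed.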