When a host executes a typical computer program (also known as an application), the application starts one or more computing tasks. At times, it is desirable to migrate such computing tasks to another host for any of a number of reasons. One such reason is that the source host (otherwise known as a first host, i.e., the host from which the computing tasks are migrated) may currently be overburdened with too many computing tasks. Another such reason is that the source host may be overburdened by even a few computing tasks that consume substantial resources. Yet another such reason is that it may be desirable to shut down the source host, either for maintenance or because it is only lightly used at the moment, in which case power can be saved by consolidating the current computing workload on fewer hosts.
Blade servers are examples of systems in which a number of servers (also known as blades or hosts) share resources including disk storage systems, network and input/output (IO) access, power, and cooling. The processors and main memory within each blade may be largely or totally interchangeable with those on the other blades. Blade servers are currently popular due to their cost-effectiveness for a variety of computing tasks applied to many types of applications. The current popularity of blade servers is only one of the reasons to provide effective mechanisms to migrate computing tasks among compatible hosts.
A manual approach for migrating a computing task includes: i) the user of the program stopping its execution on the source host; ii) the user, or the program automatically, saving the current program execution state to one or more files on a disk shared by both the source host and the destination host; and iii) starting execution of the program on the destination host from the saved state. One drawback of this approach is that each migration requires manual intervention. Another drawback is that not all programs include features that allow the user to stop execution and save enough information on the state of the program to disk files.
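The three manual steps above can be sketched as follows. This is only an illustrative sketch, not a method described in the source: the toy task (a running sum), the JSON state file, the file name `task_state.json`, and the function names are all hypothetical, and a temporary directory stands in for the disk shared by the source host and the destination host.

```python
import json
import os
import tempfile


def run_task(state, steps):
    """Advance a toy computing task (a running sum) by the given number of steps."""
    for _ in range(steps):
        state["i"] += 1
        state["total"] += state["i"]
    return state


def save_state(state, path):
    """Step ii: write the execution state to a file on the shared disk."""
    with open(path, "w") as f:
        json.dump(state, f)


def load_state(path):
    """Read the saved execution state back from the shared disk."""
    with open(path) as f:
        return json.load(f)


def migrate_manually(shared_dir):
    """Run half of the task on the 'source host', then finish it on the 'destination host'."""
    path = os.path.join(shared_dir, "task_state.json")  # hypothetical file name
    # Step i: the task runs on the source host and is then stopped.
    state = run_task({"i": 0, "total": 0}, steps=5)
    # Step ii: the execution state is saved to the shared disk.
    save_state(state, path)
    # Step iii: the destination host loads the state and resumes execution.
    resumed = load_state(path)
    return run_task(resumed, steps=5)


if __name__ == "__main__":
    # A temporary directory plays the role of the shared disk (an assumption).
    with tempfile.TemporaryDirectory() as shared:
        print(migrate_manually(shared))
```

The sketch works only because the toy task exposes its entire state as a small dictionary, which reflects the second drawback noted above: a program without such a feature cannot be migrated this way.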
Automatic approaches for migrating a computing task include, but are not limited to, virtualization. Virtualization has become popular for a variety of reasons. One common use of virtualization is for host consolidation. Virtualization allows the workloads of underutilized hosts (i.e., physical machines) to be consolidated onto a single host (i.e., a single physical machine).
In a typical virtualization scheme, a particular instance of an operating system and all of the applications that operating system executes form a virtual machine (VM). Thus, computing tasks may be encapsulated as part of a VM. A single host may execute multiple VMs. Typically, each VM is unaware of the other VMs on the same host, each VM usually has no access to information about other VMs on the same host, and no VM can affect the operation of any other VM on the same host.
A VM can be migrated from a source host to a destination host automatically. While a VM may not need to be halted to migrate, its performance may be reduced for the period of time during which the migration is in progress. Further, the performance of other computing tasks on both the source host and the destination host may be adversely impacted, particularly in the case where the decision was made to migrate the VM because the source host currently has a high computing workload. Migrating a VM can require half a minute to several minutes to complete. See M. Nelson, B-H Lim, and G. Hutchins, “Fast Transparent Migration for Virtual Machines,” Proceedings of USENIX '05 (General Track), Apr. 10-15, 2005.
FIG. 1 is a plot illustrating how computing workload changes over time, as known in the prior art, for a source host with computing tasks that can be migrated to another host. The computing workload is represented on the Y axis and ranges from 0% to 100% of the computing capacity of the source host. Time is represented on the X axis. Time ranges through a 24 hour period, that is, midnight to midnight.
Plot line 110 shows the computing workload on the source host assuming that no computing tasks are migrated away from the source host. As plot line 110 shows, there is essentially no computing workload on this host for the first few hours of the day. Starting around mid-morning, the computing workload exceeds 80%. By late morning, the workload has peaked at around 100%. In the later part of the afternoon, the workload declines to below 80%, and in the late evening it declines to below 15%.
Plot line 120 shows the computing workload on the source host assuming that a number of computing tasks are migrated away from the source host to a destination host when the computing workload exceeds a threshold of 80%. This migration can be done using VM technology.
Plot line 120 assumes that the source host initiates computing task migration as soon as its measured computing workload exceeds 80%. The migration process forces the computing workload on the source host to go even higher due to the resources the migration consumes on the source host. That is, there is a period of time during which peak 130a of plot line 120 is higher than plot line 110.
When a first set of computing tasks has been migrated away from the source host, then plot line 120 lowers, peak 130a ends, and plot line 120 forms trough 130b. However, in trough 130b, the computing workload measured on the source host is still above 80%. Thus, the source host decides to migrate a second set of computing tasks.
This second migration results in peak 130c in plot line 120. When the second set of computing tasks has been migrated away from the source host, then plot line 120 again lowers, peak 130c ends, and plot line 120 forms plateau 130d. In plateau 130d, the measured workload is below 80%, and thus no further migrations are needed.
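The behavior of plot line 120 can be reproduced with a small simulation. This is a minimal sketch under stated assumptions: only the 80% threshold comes from the text, while the sample values in `BASELINE`, the size of each migrated task set, the migration overhead, the migration duration, and the rule that the source host decides only from measurements taken while no migration is running are hypothetical choices made for illustration.

```python
THRESHOLD = 80.0  # percent; the migration threshold used in the text

# Hypothetical samples of plot line 110 (workload with no migration), in percent.
BASELINE = [60, 70, 78, 84, 88, 90, 90, 90, 90, 90, 90, 90, 90]


def simulate_threshold_migration(baseline, task_set_size=7.0, overhead=6.0, duration=2):
    """Return (measured_load, migration_start_steps) for the source host.

    task_set_size: workload (percent) removed by each completed migration (assumed).
    overhead:      extra workload (percent) on the source host while migrating (assumed).
    duration:      number of time steps one migration takes (assumed).
    """
    migrated = 0.0   # workload already moved to the destination host
    remaining = 0    # time steps left in the migration currently running
    measured = []
    starts = []
    for step, base in enumerate(baseline):
        load = base - migrated
        migrating = remaining > 0
        if migrating:
            load += overhead              # migration consumes source resources
            remaining -= 1
            if remaining == 0:
                migrated += task_set_size  # the task set has left the source host
        measured.append(load)
        # Decide only from a measurement taken while no migration is running.
        if not migrating and load > THRESHOLD:
            remaining = duration
            starts.append(step + 1)       # migration begins at the next step
    return measured, starts


if __name__ == "__main__":
    measured, starts = simulate_threshold_migration(BASELINE)
    print("migration start steps:", starts)
    print("measured workload:", measured)
```

With these assumed numbers the simulated workload rises above the baseline during the first migration (corresponding to peak 130a), settles at 83% (trough 130b, still above the threshold), rises again during the second migration (peak 130c), and ends at 76% (plateau 130d).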
FIG. 1 illustrates a limitation of the above-mentioned prior art approaches. There may be a period of time during which the computing workload is above the maximum utilization target of 80%. For example, as shown in FIG. 1, plot line 120 exceeds 80% during peak 130a, trough 130b, and peak 130c.
Another limitation of the above-mentioned prior art approaches is that they may trigger unnecessary migrations. An actual computing workload is very unlikely to follow a smooth curve such as those shown by plot lines 110 and 120. Rather, an actual computing workload would likely include a jagged, random offset from the plot lines as shown. Because of this, a temporary spike in computing workload may trigger a migration of computing tasks; however, if the spike is short enough, more computing workload may be consumed by the migration process executing on the source host and the destination host than would be consumed by the temporary spike itself.
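The trade-off described above can be made concrete with a simple cost comparison. The following is a rough sketch only: the unit (percent of host capacity multiplied by minutes), the function names, and all numeric values are hypothetical and are not taken from the source.

```python
def spike_cost(spike_percent, spike_minutes):
    """Workload consumed by a temporary spike, in percent-minutes (assumed unit)."""
    return spike_percent * spike_minutes


def migration_cost(source_overhead, dest_overhead, migration_minutes):
    """Workload consumed by the migration process on both hosts, in percent-minutes."""
    return (source_overhead + dest_overhead) * migration_minutes


def migration_is_wasteful(spike_percent, spike_minutes,
                          source_overhead, dest_overhead, migration_minutes):
    """True if migrating costs more than the spike that triggered it."""
    return (migration_cost(source_overhead, dest_overhead, migration_minutes)
            > spike_cost(spike_percent, spike_minutes))


if __name__ == "__main__":
    # Hypothetical short spike: 10% extra workload for 1 minute, versus a
    # migration that adds 5% to each host for 2 minutes.
    print(migration_is_wasteful(10, 1, 5, 5, 2))
    # Hypothetical sustained rise: the same 10% lasting 30 minutes.
    print(migration_is_wasteful(10, 30, 5, 5, 2))
```

Under these assumed numbers, the one-minute spike consumes 10 percent-minutes while the migration it triggers consumes 20, so the migration is the more expensive of the two; for the thirty-minute rise, the comparison reverses.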