Load balancing is a computer networking technique for distributing workload across multiple resources, such as computing devices, a computer cluster, network links, central processing units, or disk drives. Its goals are typically to optimize resource utilization, maximize throughput, minimize response time, and avoid overloading any single resource. Using multiple load-balanced components instead of a single component often increases reliability through redundancy. A load balancing service is usually provided by dedicated software or hardware, such as a multilayer switch or a Domain Name System server.
Often, when provisioning or assigning resources for new services or workloads within a cloud infrastructure or network system, the management software needs to ensure that there is enough capacity (e.g., processing capacity, memory capacity, storage capacity, or bandwidth capacity) and to find a suitable computing device on which to place each new workload.
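As a purely illustrative sketch of such a capacity check, the following Python fragment filters target computing devices by whether a workload's demand fits within each device's free capacity. The names `Resources`, `fits_within`, and `candidate_hosts`, as well as the choice of four capacity dimensions, are hypothetical and are not drawn from any particular management software.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical capacity vector covering the dimensions named in the text."""
    cpu_cores: float
    memory_gb: float
    storage_gb: float
    bandwidth_mbps: float

    def fits_within(self, free: Resources) -> bool:
        # A demand fits only if every dimension is within the free capacity.
        return (
            self.cpu_cores <= free.cpu_cores
            and self.memory_gb <= free.memory_gb
            and self.storage_gb <= free.storage_gb
            and self.bandwidth_mbps <= free.bandwidth_mbps
        )


def candidate_hosts(demand: Resources, free_by_host: dict[str, Resources]) -> list[str]:
    """Return the names of hosts whose free capacity can hold the demand."""
    return sorted(
        name for name, free in free_by_host.items() if demand.fits_within(free)
    )


# Example data (assumed values, for illustration only).
free_by_host = {
    "host-a": Resources(cpu_cores=8, memory_gb=32, storage_gb=500, bandwidth_mbps=1000),
    "host-b": Resources(cpu_cores=2, memory_gb=64, storage_gb=500, bandwidth_mbps=1000),
    "host-c": Resources(cpu_cores=16, memory_gb=8, storage_gb=500, bandwidth_mbps=1000),
}
demand = Resources(cpu_cores=4, memory_gb=16, storage_gb=100, bandwidth_mbps=200)
suitable = candidate_hosts(demand, free_by_host)
```

In this example only `host-a` qualifies: `host-b` lacks processing capacity and `host-c` lacks memory capacity, which shows why every capacity dimension must be checked rather than a single aggregate figure.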
Generally, network engineers are concerned with how to take into account the capacity of system resources (e.g., processing capacity or memory capacity) in the “target hosts” or target computing devices in the infrastructure. Often one must estimate whether there is enough capacity in each of the given target computing devices and locate the best placement for candidate workloads. Often a “placement engine” cannot provide accurate advice because of, for example, uncertainty about the actual current performance of the target hosts. This uncertainty arises in a dynamic environment where many workloads are being added and removed and where the demand for system resources changes with business or network activity. Frequently, performance data is collected about the hosts or target computing devices, but there is a time lag between the last data collection and the time at which the placement advice is needed. During this period, numerous workloads may already have been provisioned or assigned in a dynamic cloud environment, causing the actual performance of the target hosts to change. These and other difficulties complicate the proper placement or assignment of a new workload to a target computing device.
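The effect of the time lag can be illustrated with a small Python sketch. It compares the free capacity reported by the last data collection with the capacity that likely remains once workloads provisioned since that collection are accounted for. This is only an illustration of the problem under simplifying assumptions (a single processing-capacity dimension and workloads that consume exactly what they request); the names `Snapshot`, `Placement`, and `apparent_vs_likely_free` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical record of the last performance data collection for a host."""
    collected_at: float      # time of collection, in seconds
    free_cpu_cores: float    # free processing capacity observed at that time


@dataclass
class Placement:
    """Hypothetical record of a workload provisioned on the host."""
    placed_at: float         # time of provisioning, in seconds
    cpu_cores: float         # processing capacity the workload requested


def apparent_vs_likely_free(snapshot, placements, now):
    """Return (free capacity per the stale snapshot, likely free capacity now)."""
    # Only workloads provisioned after the snapshot are missing from its data.
    consumed = sum(
        p.cpu_cores
        for p in placements
        if snapshot.collected_at < p.placed_at <= now
    )
    likely = max(0.0, snapshot.free_cpu_cores - consumed)
    return snapshot.free_cpu_cores, likely


# Example data (assumed values, for illustration only).
snapshot = Snapshot(collected_at=0.0, free_cpu_cores=8.0)
placements = [
    Placement(placed_at=-5.0, cpu_cores=2.0),  # already reflected in the snapshot
    Placement(placed_at=10.0, cpu_cores=4.0),  # provisioned during the lag
    Placement(placed_at=20.0, cpu_cores=3.0),  # provisioned during the lag
]
apparent, likely = apparent_vs_likely_free(snapshot, placements, now=30.0)
```

With these assumed values, the stale snapshot suggests that a new workload requesting two cores would fit, whereas the workloads provisioned during the lag have likely left too little capacity for it.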