When processing large amounts of data, business applications may process the data in batch jobs. The business applications can, for example, implement batch processing by grouping a sequence of commands to be executed within a single file or unit and executing the commands at a particular time within a given time period. Examples of batch processing include print jobs for accounting or processing pay slips at the end of a month. Batch jobs can be time-critical and may need to be finished in short time frames. Accordingly, the batch jobs may need to be implemented efficiently across multiple data processors, or work processes. For example, multiple computer processors or multiple servers can be used in parallel to efficiently execute one or more processing jobs. In some instances, a load balancing mechanism can be used to distribute the workload of the processing jobs evenly across the different processors.
Load balancing can be implemented statically at a centralized load balancer, which can divide the total workload into smaller work packages of fixed size before assigning the work packages to the available work processes. The static distribution of workload, however, may result in inefficient processing and utilization of hardware. For example, work packages of the same size do not necessarily have the same computational complexity. A centralized load balancer may not recognize the varying degrees of complexity within an overall workload or the computational resources required to process a particular work package. For example, a load balancer may be used to distribute processing of invoices of a large number of customers. The total number of invoices may be divided by the load balancer into smaller, equally sized packages of invoices for processing. Each individual invoice, however, may contain a different number of items and may require a different processing time. Accordingly, the load balancer may divide the invoices into work packages of which some require significantly more computational resources than others. When work packages of varying complexity are processed by several work processes, some work processes may complete their tasks sooner than others and remain idle while the rest are still running, so that the work processes are utilized inefficiently.
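The invoice example can be illustrated with a minimal sketch. It assumes, purely for illustration, that each invoice is represented by its number of items and that processing cost is proportional to that number; the function names and the sample data are hypothetical and do not come from any particular system.

```python
from typing import List


def split_fixed_size(item_counts: List[int], package_size: int) -> List[List[int]]:
    """Divide invoices into packages holding the same number of invoices."""
    return [
        item_counts[i:i + package_size]
        for i in range(0, len(item_counts), package_size)
    ]


def package_costs(packages: List[List[int]]) -> List[int]:
    """Assumed cost model: a package costs the sum of its invoices' item counts."""
    return [sum(package) for package in packages]


# Hypothetical workload: twelve invoices, four of which are much larger.
invoice_item_counts = [1, 2, 1, 2, 40, 35, 50, 45, 3, 2, 1, 4]

packages = split_fixed_size(invoice_item_counts, package_size=4)
costs = package_costs(packages)

# Every package holds four invoices, yet the costs differ widely.
print(costs)       # [6, 170, 10]
print(max(costs))  # 170: the batch finishes only when the slowest package does
```

Under these assumptions, three work processes would each receive four invoices, but two of them would finish after 6 and 10 cost units and then wait while the third works through 170 units.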
In some implementations, load balancers can account for varying complexities in work packages by analyzing the workload and estimating the processing requirements for each work package. The load balancers can then distribute work packages of different sizes to compensate for the differences in complexity of the work packages. Load balancers that analyze a workload, however, may require additional processing time and may become a bottleneck in a batch process. Further, load balancers may require specific knowledge of a particular application or domain in order to estimate the complexity of individual work items.
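One way such an analyzing load balancer could work is sketched below. The sketch is an assumption, not a method described above: it uses the greedy "longest processing time first" heuristic and takes an invoice's item count as the complexity estimate. The function name and the sample data are hypothetical.

```python
import heapq
from typing import List, Tuple


def balance_by_estimate(
    estimates: List[int], n_packages: int
) -> Tuple[List[List[int]], List[int]]:
    """Assign work items to packages by estimated cost.

    Items are visited from largest to smallest estimate, and each one is
    placed in the package with the lowest accumulated estimate so far.
    Returns the item indices per package and the estimated load per package.
    """
    # Min-heap of (accumulated load, package index).
    heap = [(0, p) for p in range(n_packages)]
    heapq.heapify(heap)
    packages: List[List[int]] = [[] for _ in range(n_packages)]

    # The whole workload must be analyzed and sorted before any work starts.
    order = sorted(range(len(estimates)), key=lambda i: estimates[i], reverse=True)
    for idx in order:
        load, p = heapq.heappop(heap)
        packages[p].append(idx)
        heapq.heappush(heap, (load + estimates[idx], p))

    loads = [sum(estimates[i] for i in package) for package in packages]
    return packages, loads


# Hypothetical workload: item counts of twelve invoices.
invoice_item_counts = [1, 2, 1, 2, 40, 35, 50, 45, 3, 2, 1, 4]
packages, loads = balance_by_estimate(invoice_item_counts, n_packages=3)
print(loads)  # [56, 55, 75]
```

The resulting packages contain different numbers of invoices but have much closer estimated loads than equally sized packages would. The sketch also shows both drawbacks noted above: every work item has to be inspected and sorted centrally before distribution begins, and the estimate relies on domain knowledge (here, that item count predicts processing time).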