Throughput is a crucial metric for measuring system performance in numerous areas of endeavor, such as banking, databases, and searching. Throughput is generally expressed as the number of operations or transactions performed per unit of time, such as queries per second. Optimizing throughput is important for several reasons. First, empirically, an average human user can perceive a response delay longer than three tenths of a second. Thus, throughput directly affects the ability of a server to minimize such human-perceivable delays.
Throughput also directly affects the ability of a server to keep pace with operation or transaction processing volume. For example, Web content search engines often process in excess of several thousand queries per second over several billion pages of Web content. This processing load exceeds the capabilities of most current monolithic computer system architectures. Consequently, search engines, as well as most other forms of operation and transaction processing systems, have trended towards system components that combine loosely- and tightly-coupled multiprocessing architectures, which offer higher overall processing capabilities and favorable scalability.
Nevertheless, although an effective alternative to monolithic architectures, multiprocessing architectures have limitations, which can often be alleviated through load balancing. For instance, multiprocessing overhead in an untuned system can potentially hinder throughput. Without effective load balancing, merely increasing the number of individual systems utilized within a multiprocessing architecture can fail to satisfactorily increase throughput due to the increased complexity required to coordinate and synchronize operation or transaction processing. Load balancing attempts to avoid overhead problems and works to distribute the processing load over each server for effective utilization.
Independent of system architecture, throughput can be affected by the nature of the operations or transactions performed during execution. For instance, comprehensively searching or evaluating as many available Web pages as possible is an important part of providing the highest quality search results for Web content search engines. Each Web page must be evaluated or referenced as part of a query execution. As a result, access to each Web page becomes crucial: a bottleneck that restricts access to a required Web page causes query execution to become data-bound. The data bottleneck problem is pervasive throughout other areas of endeavor, and effectively laying out data for access by multiple systems is a critical part of load balancing.
One conventional approach to load balancing distributes target files over a set of multiprocessing systems with one target file per system. This approach, though, can create data bottlenecks, which hinder throughput when multiple systems attempt to access the same file. In addition, this approach provides only static load balancing that cannot be adjusted to the current actual work load. Dynamic load balancing is possible by introducing a centralized work load manager, but latencies increase and the data bottleneck problem remains.
Another conventional approach to load balancing measures throughput on a file-by-file basis and attempts to normalize the number of files assigned to each system to thereby improve the average time per operation or transaction. However, this approach relies on the assumption that all operations or transactions require the same amount of processing time and fails to provide improved throughput when individual operations or transactions vary in terms of processing times and file accesses.
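The weakness of normalizing file counts can be illustrated with a small arithmetic sketch. The file names, per-operation costs, and server names below are hypothetical values chosen only for illustration and do not come from any measured system. Each server holds the same number of files, yet the work load differs widely because the per-file processing times differ.

```python
# Hypothetical cost, in milliseconds, of one operation against each file.
file_costs_ms = {"a": 5, "b": 5, "c": 50, "d": 60}

# A layout normalized by file count: two files per server.
layout = {"server1": ["a", "b"], "server2": ["c", "d"]}


def server_load(files):
    """Total cost of running one operation against every file on a server."""
    return sum(file_costs_ms[name] for name in files)


# Work load per server under the count-normalized layout.
loads = {server: server_load(files) for server, files in layout.items()}

# Ratio of the busiest server to the least busy server.
imbalance = max(loads.values()) / min(loads.values())

print(loads)      # {'server1': 10, 'server2': 110}
print(imbalance)  # 11.0
```

Under these assumed costs, both servers hold two files, but one server carries eleven times the work load of the other, so equalizing file counts alone does not equalize processing time.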
Accordingly, there is a need for providing an effective layout of files for use in processing operations in a multiprocessing architecture, whereby each operation requires access to at least one file. Preferably, one or more of the files are duplicated and distributed over multiple servers by specifying a layout arrangement.
There is a further need for providing effective scheduling of operation execution in a multiprocessing architecture. Preferably, those servers having a substantially minimal work load would be favored and outstanding operations would be tracked as an indication of actual overall system work load.
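The scheduling need described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the class name `LeastOutstandingScheduler`, its methods, and the server names are hypothetical, and the sketch assumes that the caller already knows which servers hold the file an operation requires. Each operation is dispatched to the candidate server with the fewest outstanding operations, and the outstanding counts serve as the indication of overall system work load.

```python
class LeastOutstandingScheduler:
    """Hypothetical scheduler that favors the least-loaded candidate server."""

    def __init__(self, servers):
        # Number of dispatched but not yet completed operations per server.
        self.outstanding = {server: 0 for server in servers}

    def dispatch(self, candidates):
        """Pick the candidate with the fewest outstanding operations.

        `candidates` lists the servers able to execute the operation,
        for example those holding a copy of the required file.
        Ties go to the candidate listed first.
        """
        chosen = min(candidates, key=lambda server: self.outstanding[server])
        self.outstanding[chosen] += 1
        return chosen

    def complete(self, server):
        """Record that one operation on `server` has finished."""
        if self.outstanding[server] == 0:
            raise ValueError(f"no outstanding operations on {server}")
        self.outstanding[server] -= 1

    def total_outstanding(self):
        """Overall system work load, measured in outstanding operations."""
        return sum(self.outstanding.values())


scheduler = LeastOutstandingScheduler(["s1", "s2", "s3"])
first = scheduler.dispatch(["s1", "s2"])   # both idle, so the first listed wins
second = scheduler.dispatch(["s1", "s2"])  # s1 is busy, so s2 is favored
third = scheduler.dispatch(["s1", "s3"])   # s3 is idle, so s3 is favored
scheduler.complete(first)
remaining = scheduler.total_outstanding()
```

Tracking outstanding operations rather than file counts lets the choice reflect the actual current work load, including operations whose processing times vary.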