Computing devices are commonplace in a growing number of fields. For example, computing devices are now generally employed to provide services in office environments, research, engineering and development environments, scientific environments and private environments. Services range from simple data handling operations to highly complex scientific, research, engineering and development or other applications requiring extensive computing resources.
With increasing computing capabilities, more complex applications may be handled by a single computing device. However, applications or programs may become so complex that it is not economical, or not possible within a reasonable time frame, to handle the amount of data or computations required to solve a computing problem with a single computing device. A straightforward solution for handling ever more complex applications is to construct larger computing devices, for example with multiple processors for increased computing speed or with larger memories for improved data access and handling. However, computing devices are generally not arbitrarily scalable and, most importantly, not at reasonable cost.
In parallel with the availability of computing devices with increased processing capabilities, more and more networks interconnecting large numbers of computing devices are emerging.
For example, local area networks such as company-wide networks interconnect the computing devices of a company, and wide area networks such as the Internet connect computing devices virtually all over the world. These networks are increasingly used and have improved dramatically in the recent past. Improved communication schemes allow more efficient communication and collaboration between computing devices interconnected over a computer network.
Thus, as an alternative to constructing a larger stand-alone computing device as outlined above, for example with a larger number of processors, a plurality of computing devices may be used to solve a single computing task, also termed a computing job. Such an interconnection of a plurality of computing devices over a computing network may be termed a computing grid or data grid. A computing grid is a hardware and software infrastructure serving to handle computing jobs submitted by a user. The computing grid may interconnect distributed computers, storage devices, mobile devices, instruments, sensors, databases and/or software applications. Generally, a computing grid may comprise virtually any kind of computing device and includes a grid infrastructure to handle the distribution of computing jobs.
A system for handling computing jobs in a computing grid is described in “Sun grid engine 5.3 administration and users' guide”, Sun Microsystems, Inc. part no. 816-2077-11, April 2002, revision 01. This document describes a grid with a central grid infrastructure for handling the distribution of computing jobs in the computing grid. Upon receiving an instruction to distribute a computing job, the grid infrastructure selects a suitable computing device and transfers the computing job to the selected computing device. The computing device then performs the necessary computational operations and returns the computational results to the source of the computing job. Accordingly, a user or application at a client device may issue an instruction to execute a computing job to the grid infrastructure, which in turn selects a suitable processing element, and the processing results are ultimately returned to the client.
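The centralized flow described above may be illustrated by the following minimal sketch. It is not taken from the cited guide and does not reflect any actual Sun Grid Engine interface; the names ComputingDevice, CentralGridInfrastructure, select_device and submit, as well as the selection policy (the device with the fewest completed jobs), are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ComputingDevice:
    """Hypothetical grid node that executes the jobs transferred to it."""
    name: str
    completed: List[str] = field(default_factory=list)

    def execute(self, job_name: str, job: Callable[[], int]) -> int:
        # Perform the computation and record that this device handled it.
        result = job()
        self.completed.append(job_name)
        return result


class CentralGridInfrastructure:
    """Hypothetical central dispatcher; every job passes through submit()."""

    def __init__(self, devices: List[ComputingDevice]) -> None:
        self.devices = list(devices)
        self.jobs_dispatched = 0

    def select_device(self) -> ComputingDevice:
        # Assumed policy: pick the device with the fewest completed jobs.
        # On a tie, min() returns the first such device in the list.
        return min(self.devices, key=lambda d: len(d.completed))

    def submit(self, job_name: str, job: Callable[[], int]) -> Tuple[str, int]:
        # Select a device, transfer the job, and return the result
        # to the source of the job (here, the caller).
        device = self.select_device()
        self.jobs_dispatched += 1
        return device.name, device.execute(job_name, job)


grid = CentralGridInfrastructure(
    [ComputingDevice("node-a"), ComputingDevice("node-b")]
)
# Four client jobs, each computing the square of its index.
results = [grid.submit(f"job-{i}", lambda i=i: i * i) for i in range(4)]
print(results)
print(grid.jobs_dispatched)
```

In this sketch every call to submit() goes through the single CentralGridInfrastructure object, so the counter jobs_dispatched grows with every job in the grid regardless of how many devices are available.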
The above computing grid with a central grid infrastructure centralizes all intelligence for distributing computing jobs. However, this creates a potential bottleneck when a very large number of computing jobs is to be handled, as every computing job must be distributed by the grid infrastructure.
Accordingly, a computing grid with a centralized grid infrastructure may not always be able to handle all computing jobs submitted in a computing grid.