1. Field
The present invention lies in the field of computing in multi-resource computing environments, and in particular relates to workload handling and scheduling in cloud computing.
2. Description of the Related Art
There is a growing trend toward using software installed and running on internal or remote data centre facilities rather than having software installed locally on each and every machine which may need to utilize the functionality provided by the software. This trend toward ‘on-demand’ or ‘cloud’ computing services can provide a platform for a more efficient use of resources than is achieved by the local installation and execution of software. Advantageously, economies of scale can be realized when managing large sets of computers in a centralized manner as opposed to working on a per-machine basis. Cloud computing and software as a service (SaaS) are two widely known and adopted manifestations of this model. While cloud computing is a computing paradigm which is utilized and implemented in many forms, including providing infrastructure as a service, and is enabled by different platforms, the generally accepted definition is the delivery of computing power as a service rather than as a product. Accordingly, pricing models for cloud computing infrastructure services can be structured in a similar manner to those of a utility. The same model applies in the case of SaaS, where what is being delivered to the user is the functionality of the software rather than just raw computing power. Both cloud computing and SaaS can be utilized to reduce total IT costs (in terms of finances and/or CO2 generation) and in particular to lower the barriers to entry, as the upfront and maintenance costs for both software and infrastructure are very small or even non-existent.
Cloud computing services may be provided by data centres, in which a large number of networked computing devices work in cooperation with one another to collectively handle the workload placed on the data centre by requests from clients/users. The computing devices in the data centre may not all be identical, and some may have components (computing resources) particularly suited to executing certain applications rather than others. Furthermore, using certain of the computing resources in the data centre may have different cost implications from using certain other resources. Therefore, some degrees of freedom exist in allocating tasks (which for the purposes of this document shall be considered to be executing an application) to different configurations of computing resources.
Previous work done towards optimizing the execution of applications can be divided into two categories:
    - Cloud applications which scale up/down as a function of the number of users/tasks. Such optimization is generally only applicable to task-parallel applications. However, applications of increased complexity, like the ones used in the fields of scientific and technical computing, do not benefit from such optimization techniques.
    - Self-optimizing middleware systems. Such systems make use of monitoring tools (e.g. performance counters) in order to produce a model of the execution of the applications, which is then used to manage the application in a way which maximizes a specified utility function. Such systems do not measure the scalability of an application.
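By way of illustration only, the general idea of managing an application so as to maximize a specified utility function may be sketched as follows. The sketch is not taken from any particular prior system: the names Configuration, utility and select_configuration, the weighting parameters, and the form of the utility function (a weighted sum of predicted run time and monetary cost) are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Configuration:
    """Hypothetical candidate configuration of computing resources."""
    name: str
    predicted_runtime_s: float  # run time predicted by an execution model (assumed given)
    cost_per_hour: float        # price of using this configuration (assumed given)


def utility(cfg: Configuration, time_weight: float = 1.0, cost_weight: float = 1.0) -> float:
    """Assumed utility: higher is better, so both time and cost are penalized."""
    total_cost = cfg.cost_per_hour * cfg.predicted_runtime_s / 3600.0
    return -(time_weight * cfg.predicted_runtime_s + cost_weight * total_cost)


def select_configuration(candidates, time_weight: float = 1.0, cost_weight: float = 1.0) -> Configuration:
    """Return the candidate that maximizes the assumed utility function."""
    if not candidates:
        raise ValueError("at least one candidate configuration is required")
    return max(candidates, key=lambda c: utility(c, time_weight, cost_weight))


# Illustrative values only; they are not measurements.
candidates = [
    Configuration("small", predicted_runtime_s=7200.0, cost_per_hour=1.0),
    Configuration("large", predicted_runtime_s=1800.0, cost_per_hour=8.0),
]
fastest = select_configuration(candidates)
cheapest = select_configuration(candidates, time_weight=0.0, cost_weight=1.0)
```

In this sketch, changing the weights changes which configuration is selected, which reflects the degrees of freedom in task allocation noted above. The sketch takes the predicted run times as given and, like the systems described, does not itself measure how the application scales.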