The present invention generally relates to cloud computing, and more specifically to managing the provisioning of new service requests in computer systems, especially cloud-based systems.
In general, cloud computing refers to Internet-based computing where shared resources, software, and information are provided to users of computer systems and other electronic devices (e.g., mobile phones) on demand. Adoption of cloud computing has been aided by the widespread adoption of virtualization, which is the creation of a virtual (rather than actual) version of something, e.g., an operating system, a server, a storage device, network resources, etc. Cloud computing provides a consumption and delivery model for information technology (IT) services based on the Internet and involves over-the-Internet provisioning of dynamically scalable and usually virtualized resources.
Cloud computing is facilitated by ease-of-access to remote computing websites (via the Internet) and frequently takes the form of web-based tools or applications that a cloud consumer can access and use through a web browser, as if the tools or applications were a local program installed on a computer system of the cloud consumer. Commercial cloud implementations are generally expected to meet quality of service (QoS) requirements of consumers and typically include service level agreements (SLAs). Cloud consumers avoid capital expenditures by renting usage from a cloud vendor (i.e., a third-party provider). In a typical cloud implementation, cloud consumers consume resources as a service and pay only for resources used.
Cloud services, especially platform as a service (PaaS) offerings, are becoming commonplace. Some platforms offer free services, low-price services, and paid services. A mix of such services is also possible. This creates an environment that may be prone to attacks as well as to unintentional overload. A key to managing such a complex environment is the ability to protect against misuse and abuse.
When operating a provisioning system on shared cloud resources, certain service requests may be problematic in terms of the resources they require. For example, an analysis request submitted to a graph analysis service may run for a long time or produce many results and slow down the system as a whole. Ideally, the resources needed for a service request are known when the request is submitted. For example, for a relational database (RDB) queried with SQL, one can obtain a total run-time estimate from the query optimizer. This may also be the case for other services that perform a sequence or collection of actions whose individual costs are known in advance. However, for many services, the duration and resources required may become apparent only as the service request is being fulfilled. This introduces an element of uncertainty and may result in inefficient provisioning, especially when some requests are malicious or inadvertently turn out to be high resource consumers. Computer time is not the only resource of interest. Space for storing intermediate results is also an important parameter, as is the number of processors required (in a multi-core or GPU-based system). There may be other restrictions on the kind of resources utilized.
The problem of provisioning cloud services is particularly challenging when computing resources are shared among many users from various organizations. In addition, not all of these users pay for the service (for example, some may use promotions), and hence it is not always possible to limit their usage strictly by financial means.
Practically, there are several reasons why the provisioning of cloud services is important:
    Expected resource usage may dictate the amount of resources to be allocated to a task. This is especially important in a cloud-based environment where additional resources may be dynamically allocated to the task. It is also especially important in a multi-processor environment, in which running out of time may put the job at the back of the queue for a long period of time, often while waiting for a sufficient number of processors to become available.
    Very long-running service requests may be used to overload a system, maliciously denying service to others.
    The costs of provisioning graph queries and many analytic tasks are difficult to predict. These costs include time, processor usage, memory space, and expense.
Accordingly, a need exists to overcome the problems of provisioning service requests, such as cloud service requests, in a computing system as described above.