1. Field of the Invention
The present invention relates to execution of applications provided in a communication system, and in particular, but not exclusively, to apparatus such as application servers handling execution of applications.
2. Description of Related Art
Various applications may be provided for a user provided with an appropriate user device for communication via a communication system. A communication system is a facility which enables communication between two or more entities such as user terminal devices or other communication devices, and network entities such as servers and other nodes. A communication system is typically adapted to operate in accordance with at least one communication protocol.
The applications are provided by data processing apparatuses such as application servers connected to the communication system and configured for execution of at least one appropriate software code. The applications may be any appropriate applications that may be provided via a communication system. Non-limiting examples of the applications include applications enabling communication of voice or video, electronic mail (email) applications, data downloading applications, multimedia applications and so on.
Typically an application is provided by a data processing device such as an application server provided with appropriate hardware for execution of appropriate application software. The application may be executed in response to any appropriate event. In accordance with a non-limiting example an application is executed in response to a request for the application. For example, a user may request a voice call or a video download from an application server, whereafter the server responds to the request by providing the requested application.
Data processing devices such as application servers typically handle vast amounts of events. Due to the lack of any human intervention, the data processing devices execute the applications and respond to the events triggering an execution in a quick and predictable manner in most instances. The handling of an event such as a request for an application is typically split into stages. For example, an application server may handle a message requesting an application such that it first reads header information in the message and deals with other issues relating to the protocol used for the communication. Thereafter the server can process the actual message part, i.e. the payload that may need to be handled by the application.
The latter stages of handling typically comprise attaching the payload to an appropriate worker thread. After this the thread can enter an application/service providing function of the server, for example a function known as the “execution pipeline” of the server. The application providing function provides the resources for the actual execution of the application code and thus provides the requestor with the application.
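By way of illustration only, the staged handling described above may be sketched as follows. The sketch assumes a hypothetical text message format (header lines, a blank line, then the payload), and the names `parse_message`, `execution_pipeline` and `handle` are illustrative assumptions rather than functions of any particular application server.

```python
import queue
import threading


def parse_message(raw):
    """Stage 1: separate protocol headers from the payload (hypothetical format)."""
    head, _, payload = raw.partition("\n\n")
    headers = {}
    for line in head.splitlines():
        key, _, value = line.partition(":")
        headers[key.strip().lower()] = value.strip()
    return headers, payload


def execution_pipeline(headers, payload):
    """Stage 3: stand-in for the application code executed by the server."""
    return f"{headers.get('app', 'unknown')}:{payload.upper()}"


def handle(raw):
    """Read the headers, attach the payload to a worker thread, run the pipeline."""
    headers, payload = parse_message(raw)
    results = queue.Queue()
    # Stage 2: attach the payload to a worker thread.
    worker = threading.Thread(
        target=lambda: results.put(execution_pipeline(headers, payload))
    )
    worker.start()
    worker.join()
    return results.get()
```

In this sketch a call such as `handle("App: echo\n\nhello")` passes through all three stages before the result is returned to the requestor.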
Application servers may need to be capable of handling a great number of requests and other events. The performance criterion most commonly applied to application servers is the throughput. Throughput, however, is often not too problematic in high performance applications such as telecom-grade applications, in particular since the throughput is relatively easy to benchmark and regulate. The term telecom-grade application is commonly understood to refer to an application provided in a telecommunication system where higher performance in terms of throughput and latency may be demanded than what is commonly acceptable in traditional information technology and/or so-called web service applications. For example, a telecommunication operator may require that almost all requests are handled within a set time limit, whereas the corresponding requirement in providing web services or similar may be lower. As regards throughput, an adequate cluster and a smart load-balancer can solve most of the possible throughput problems in nodes of a telecommunication system.
However, the situation is different if the traffic is bursty in nature. Although load-balancers and clustering may also be used in handling traffic bursts and related problems, by regulating the load of a single application server, they may not offer a proper working model for tackling bursts.
Also, although the throughput is generally accepted as the measure of performance, other factors such as the response time may also be important from the point of view of efficient operation. This may in particular be the case in telecom-grade applications, where there typically are tight, typically operator-specific, limits for maximum latencies in responses. Many of the less sophisticated systems may thus be forced to run at relatively low loads to ensure that no peak in the traffic and no internal event, for example a garbage collector in a Java virtual machine (VM), can delay a single request too much.
A way to solve latency issues is to increase the maximum throughput of an application server relative to the normal or average load. The thinking behind this is that if the load is kept below a certain level, then the extra capacity of the server should be capable of handling each message within the set time limits. A result, however, is that the maximum throughput is increased to unnecessarily high levels. While this may provide a working model for most applications, the increase in the throughput capacity may also lead to excessive and possibly very costly hardware that is underused most if not all of the time.
The response time is becoming an increasingly important factor due to the increasing number of time-critical applications. In a time-critical application a user, a network element or another requesting entity may expect that an application is executed without any unnecessary delays. Non-limiting examples of time-critical applications include video streaming and packet switched voice calls.
Various models have been proposed for handling incoming messages in time-critical applications. In periodic handling of requests and other events, a fixed interval is determined for the handling of an event. The periodic model is relatively deterministic in its nature because the rate of execution is regulated by the period. However, the periodic model does not fit well into all applications, for example because the events may be created by different users who are using different terminals. The events can therefore be very asynchronous by their nature. Also, the length of the period often has to be set based on the worst-case scenario, which increases its length. This, in turn, may introduce extra delays and waste of resources.
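The effect of the period length on asynchronously arriving events may be illustrated with a minimal simulation. The sketch assumes, as a simplification not stated above, that each event is picked up at the first period boundary at or after its arrival and that the handling itself takes no time; the function name `periodic_latencies` and all numeric values are hypothetical.

```python
def periodic_latencies(arrivals_ms, period_ms):
    """Delay of each event when events are handled only at period boundaries.

    Times are integer milliseconds. Each event is assumed to be picked up at
    the first boundary at or after its arrival.
    """
    latencies = []
    for t in arrivals_ms:
        # Integer ceiling division gives the index of the next boundary.
        next_tick = -(-t // period_ms) * period_ms
        latencies.append(next_tick - t)
    return latencies


arrivals = [0, 1, 9, 10, 11]                    # asynchronous arrivals
short_period = periodic_latencies(arrivals, 10)
long_period = periodic_latencies(arrivals, 50)  # period sized for a worst case
```

With these hypothetical arrivals the longest delay grows from 9 ms to 49 ms when the period is lengthened from 10 ms to 50 ms, which corresponds to the extra delay attributed above to a period dimensioned for the worst case.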
Asynchronous handling is another way of handling incoming messages. Usually an asynchronous message handler does some fast processing and then hands the message forward to a “worker thread”. The worker threads then generally dictate the behaviour of the asynchronous handling. A simple way is just to start a new thread for each received message. However, the asynchronous model does not always behave satisfactorily in bursts. For example, in a sudden burst the system may get very busy creating and/or starting threads. Most of those threads will then wait for a critical system resource to become available. In a worst-case scenario the first thread will be run as the last one of the threads in a burst. Individual messages can thus experience huge latencies even when the average response time may appear to be satisfactory.
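The difference between the average response time and the latency of an individual message may be illustrated with a small calculation. The sketch assumes a burst in which all messages arrive at the same instant, one thread per message, and a single critical resource held for a fixed time by each thread; the names and the figures (100 messages, 2 ms each) are hypothetical.

```python
def burst_latencies(order, service_ms):
    """Completion latency per message for a burst that arrived at time 0.

    `order` lists message ids in the order in which their threads obtain the
    single shared resource; each thread holds it for `service_ms`.
    """
    latency = {}
    for position, msg in enumerate(order):
        latency[msg] = (position + 1) * service_ms
    return latency


def mean_latency(latency):
    """Average latency over all messages of the burst."""
    return sum(latency.values()) / len(latency)


burst = list(range(100))                          # message ids 0..99
fifo = burst_latencies(burst, service_ms=2)       # first thread runs first
lifo = burst_latencies(burst[::-1], service_ms=2) # first thread runs last
```

Under these assumptions the mean latency is the same for both orderings, while the latency of the first message rises from 2 ms to 200 ms when its thread happens to be run last.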
Thread pools have been introduced to prevent thread creation delay. In a burst a thread pool may, however, run out of threads, and messages will then have to wait until reserved threads are freed up. Thus, even with thread pools the first message of a burst may end up waiting while the whole thread pool is emptied. Also, in a burst, a simple thread pool and asynchronous handler may cause the system to leap suddenly from a situation where almost no threads are in use to a situation where all threads are in use.
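The exhaustion of a thread pool in a burst may be sketched as follows. This is a simplified simulation rather than an actual pool implementation; it assumes a fixed service time per message, arrival times given in ascending order, and a hypothetical function name `pool_wait_times`.

```python
import heapq


def pool_wait_times(arrivals_ms, service_ms, pool_size):
    """Time each message waits for a free thread in a fixed-size pool.

    `arrivals_ms` must be sorted in ascending order.
    """
    free_at = [0] * pool_size       # instant at which each thread is free
    heapq.heapify(free_at)
    waits = []
    for t in arrivals_ms:
        thread_free = heapq.heappop(free_at)   # earliest available thread
        start = max(t, thread_free)
        waits.append(start - t)
        heapq.heappush(free_at, start + service_ms)
    return waits


burst_waits = pool_wait_times([0] * 6, service_ms=10, pool_size=2)
spaced_waits = pool_wait_times([0, 20, 40], service_ms=10, pool_size=2)
```

In the hypothetical burst of six messages, only the first two find a free thread; the remaining messages wait 10 ms or 20 ms, whereas evenly spaced arrivals do not wait at all.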
Sporadic handling is yet another model that has been proposed for handling of messages received at an application server. The sporadic handling is named after the behaviour of spores of a plant. The sporadic handling is based on the use of a predefined minimum delay that is always allowed to elapse after the processing of each message before the processing of the next message begins. When compared to the asynchronous handling, the sporadic handling has the advantage of giving each message a little more time to execute before the handling of the next message is to begin. Combined with thread pools this is believed to behave and perform better than a handler that is based on asynchronous handling.
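The spacing enforced by sporadic handling may be sketched as follows. The sketch assumes that the minimum delay is counted from the instant at which a message is handed to its worker, which is one possible reading of the description above; the function name `sporadic_dispatch_times` and the numeric values are hypothetical.

```python
def sporadic_dispatch_times(arrivals_ms, min_delay_ms):
    """Dispatch instant of each message under a fixed minimum delay.

    `arrivals_ms` must be sorted in ascending order. A message is dispatched
    on arrival unless the minimum delay since the previous dispatch has not
    yet elapsed, in which case it is held back until it has.
    """
    dispatch = []
    earliest = None                 # earliest instant for the next dispatch
    for t in arrivals_ms:
        start = t if earliest is None else max(t, earliest)
        dispatch.append(start)
        earliest = start + min_delay_ms
    return dispatch


dispatch = sporadic_dispatch_times([0, 0, 0, 30], min_delay_ms=5)
```

Here three messages arriving together are spread over 0 ms, 5 ms and 10 ms, while the message arriving at 30 ms is dispatched immediately because the minimum delay has already elapsed.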
A downside of the sporadic handling is that the fixed delay is not always needed. Thus, when the fixed delay is not needed, it only introduces unnecessary latency into the message handling. Also, in some applications and on some occasions the fixed delay may not even be set to be long enough. For example, two or more messages may arrive at the same time and the first one handled may become blocked by an input/output (I/O) operation right at the beginning. The other messages will then have to wait out the sporadic delay even though the system itself could well execute them while waiting for the I/O operation to complete.
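The unnecessary latency in the I/O example above may be expressed as a simple calculation. The sketch assumes two messages arriving at the same instant and a single processor; the function name and the figures are hypothetical.

```python
def idle_cpu_before_second_message(cpu_before_io_ms, sporadic_delay_ms):
    """Processor time left idle while the second message waits out the delay.

    Both messages arrive at t = 0. The first runs for `cpu_before_io_ms` and
    then blocks on I/O; the second may start only at `sporadic_delay_ms`.
    """
    return max(0, sporadic_delay_ms - cpu_before_io_ms)


wasted_ms = idle_cpu_before_second_message(cpu_before_io_ms=1,
                                           sporadic_delay_ms=10)
```

With these hypothetical figures the processor is idle for 9 ms during which the second message could already have been executed.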