1. Field of the Invention
The present invention relates to an improved data processing system and, in particular, to a method and apparatus for multiple process coordination. Still more particularly, the present invention provides a method and apparatus for process scheduling.
2. Description of Related Art
Modern operating systems support multiprogramming, whereby multiple programs appear to execute concurrently on a single computational device with a single central processing unit (CPU) or possibly multiple CPUs in a symmetric multiprocessor (SMP) machine. The appearance of concurrent execution is achieved through the use of serialized execution, also known as “time slicing”: the operating system of a device allows one of the multiple programs to execute exclusively for some limited period of time, i.e., a time slice, which is then followed by a period of time for the exclusive execution of a different one of the multiple programs. Because the switching between programs occurs so quickly, it appears that the programs are executing concurrently even though they are actually executing serially. When the time slice for one program is concluded, that program is put into a suspended or “sleep” state, and another program “awakes” and begins to execute.
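By way of illustration only, the serialized execution described above can be sketched as a simple round-robin simulation. The function name `round_robin`, the program names, and the time-slice length are hypothetical and chosen for the example; an actual operating system scheduler is considerably more elaborate.

```python
from collections import deque


def round_robin(work, time_slice):
    """Simulate time slicing and return the order in which programs hold the CPU.

    `work` maps a (hypothetical) program name to the number of time units
    that the program still needs. Each entry in the returned list is a
    (program name, time units run) pair for one time slice.
    """
    ready = deque(work.items())
    schedule = []
    while ready:
        # The next program "awakes" and gets exclusive use of the CPU.
        name, remaining = ready.popleft()
        ran = min(time_slice, remaining)
        schedule.append((name, ran))
        remaining -= ran
        if remaining > 0:
            # The time slice has ended; the program "sleeps" until its next turn.
            ready.append((name, remaining))
    return schedule


schedule = round_robin({"A": 3, "B": 5, "C": 2}, time_slice=2)
print(schedule)
```

Although each program runs alone during its slice, the interleaved schedule gives the appearance that programs A, B, and C are progressing concurrently.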
One way of improving the performance of a single program or a single process is to divide the program or the process into paths of execution, often termed “threads”, that appear to execute concurrently. Such a program or process is typically described as “multitasking” or “multithreaded”; the operating system provides each thread with a time slice during which it has exclusive use of the CPU. Operating systems typically provide built-in mechanisms for switching between concurrent programs and/or threads in a very quick and efficient manner; some types of CPUs provide direct hardware support to an operating system for multithreading.
As threads execute, they invariably need to access resources within a data processing system, such as memory, data structures, files, or other resources. Resources that are intended to be shared by multiple threads must be shared in such a way as to protect the integrity of the data that is contained within the resource or that passes through the resource; one way of effecting this is by serializing the execution of threads that are competing for a shared resource. When a first thread is already using a resource, a second thread that requires the resource must wait until the resource is no longer being used, which would typically occur as a consequence of the first thread having successfully completed its use of the resource. Hence, an operating system allocates time slices to threads in accordance with their needs and their competition for resources rather than through the use of strictly periodic time slices.
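As a minimal sketch of serializing access to a shared resource, the following example uses a mutual exclusion lock from the Python standard library. The class `SharedCounter` and the function `run_workers` are hypothetical names introduced only for this illustration; the lock is one common serialization mechanism and is not asserted to be the only one.

```python
import threading


class SharedCounter:
    """A shared resource whose integrity is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may hold the lock; any other thread
        # that needs the resource waits here until the holder is finished.
        with self._lock:
            current = self.value
            self.value = current + 1


def run_workers(num_threads, increments_per_thread):
    """Start several threads that compete for one SharedCounter."""
    counter = SharedCounter()

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


total = run_workers(num_threads=4, increments_per_thread=1000)
print(total)
```

Because every read-modify-write of the counter occurs while the lock is held, no update is lost; the cost is that a competing thread must wait while another thread is using the resource.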
In some situations, threads that are competing for shared resources may begin to require more time per work unit because the threads spend more time waiting for other threads to finish using a shared resource that is required to complete the processing of a work unit. On a server, for example, threads may respond more slowly to incoming requests as they wait for a resource that is required to respond to those incoming requests. In the worst situations, the threads may become deadlocked on a shared resource, in which case the threads may completely stop responding to incoming requests.
During these types of situations, there may be some actions that a system administrator can take to help alleviate the situation. For example, the system administrator could dynamically reconfigure a system in a certain manner by entering commands through a system administration tool or utility. Due to the software architecture of the data processing system, though, an action by the system administrator may be placed into a queue behind many other previously pending work units. Hence, the alleviating action of the system administrator is delayed because the threads are slowly processing the work units. In the case in which the threads have become deadlocked, the alleviating action of the system administrator becomes trapped in the queue; since the threads are no longer removing and processing work units from the queue, the alleviating action of the system administrator is never processed.
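The queuing behavior described above may be sketched, under simplifying assumptions, as a single first-in, first-out queue that is drained at a fixed rate. The function `ticks_until_processed`, the notion of a "tick", and the item labels are hypothetical and serve only to illustrate the delay; a rate of zero models deadlocked threads.

```python
from collections import deque


def ticks_until_processed(pending, admin_action, units_per_tick):
    """Return the tick at which `admin_action` is dequeued, or None if never.

    `pending` holds the previously queued work units, `admin_action` is
    appended behind them, and `units_per_tick` is how many work units the
    threads remove from the queue per tick (0 models deadlocked threads).
    """
    queue = deque(pending)
    queue.append(admin_action)  # the administrator's action waits its turn
    if units_per_tick <= 0:
        return None  # nothing is dequeued, so the action is never processed
    ticks = 0
    while queue:
        ticks += 1
        for _ in range(min(units_per_tick, len(queue))):
            if queue.popleft() == admin_action:
                return ticks
    return None


backlog = ["work-unit"] * 1000
fast = ticks_until_processed(backlog, "admin-reconfigure", units_per_tick=100)
slow = ticks_until_processed(backlog, "admin-reconfigure", units_per_tick=10)
stuck = ticks_until_processed(backlog, "admin-reconfigure", units_per_tick=0)
print(fast, slow, stuck)
```

In this sketch, slowing the threads by a factor of ten delays the administrator's action by roughly the same factor, and the action is never reached once the threads stop dequeuing work units.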
A particularly serious scenario in which a server's threads may become deadlocked is a denial-of-service attack, a type of attack that is increasingly common. The connectivity of the Internet provides malicious users with the ability to probe data processing systems and to launch attacks against computer networks around the world. Many computer security tools are commercially available that provide defensive mechanisms for limiting the ability of malicious users to cause harm to a computer system. An intrusion detection system can alert an administrator to suspicious activity so that the administrator can take actions to track the suspicious activity and to modify systems and networks to prevent security breaches. These intrusion detection systems typically only gather information about possible security incidents, and a system administrator must manually take certain actions to protect networks and resources from the malicious user during an intrusion or attack.
When a server experiences a denial-of-service attack, a large number of requests are directed at the server in a short period of time, e.g., requests for web pages from a web server, and the large number of pending requests overwhelms the server because the requests are received much faster than they can be answered by the worker threads on the server. Depending on various factors, the server may crash, or the server may continue to respond to incoming requests, albeit much more slowly, because all of the server's worker threads are already busy with pending requests. In some cases, the threads may become deadlocked on a shared resource.
During a denial-of-service attack, there may be some actions that a system administrator can take to help alleviate the situation. For example, the system administrator could configure the server to stop servicing requests from particular clients as identified at particular addresses or to refuse certain authentication credentials. However, many data processing systems have a distributed architecture, and the system administration tools communicate with servers using the same request and response mechanism as is used by the denial-of-service attack. Hence, an action by the system administrator through an administrative application may generate a request to the server, and the system administrator's request is processed in the same manner as the requests from the denial-of-service attack. In that case, the system administrator's request may be placed into the same queue as previously pending requests, and the system administrator's request becomes trapped in the queue, either severely delaying the processing of the system administrator's request or, in the case of deadlocked threads, ensuring that the system administrator's request is never processed.
Therefore, it would be advantageous to have a technique for alleviating denial-of-service conditions in a server. It would be particularly advantageous to alleviate these conditions in a manner such that the processing architecture can be extended to provide for the solution without disrupting the basic processing architecture.