This invention relates to processor-based systems and, in particular, to an overload control process that adapts to a detected overload condition in the processor and adjusts the operating parameters of the processor to process service requests without endangering the continuing operation of the processor. This overload control is suitable for systems such as telephone switching systems.
It is a problem in the field of processor-based systems that the plurality of peripheral devices in communication with the processor generate service requests in a manner that can be highly variable. Existing processor overload management systems typically rely on hard-coded parameters to guide the operation of the processor. These parameters render the processor overload management system immutable in operation and, as the nature of the peripherals changes, the parameters become mismatched with the operation of the processor. If the offered traffic load changes, then the operation of the processor is tuned for the wrong environment. Existing processor overload management systems can also shut down the processor in severe overload conditions and typically do not address the load presented to the processor by processes other than the primary service processes. These existing processor overload management systems are typically relatively slow to respond to overloads and/or of limited operational range, since they depend on "smoothed" estimates of the parameters used to gauge load on the processor. For example, occupancy (utilization) estimates may require several samples, each of which can be several seconds long, with the utilization estimated by maintaining a running average of these samples. As a result, these processor overload management systems are slow to react to rapid traffic changes (such as surges) and cannot clamp the overload before the adverse effects caused by the overload impact the processor.
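The lag inherent in such smoothed estimates can be sketched as follows. This is an illustrative example only; the class name, window size, and sample values are hypothetical and not taken from any particular system.

```python
from collections import deque

class SmoothedOccupancyEstimator:
    """Estimates processor occupancy as a running average of long samples,
    as in the prior-art approach described above."""

    def __init__(self, window=4):
        # Each sample may itself cover several seconds of observation.
        self.samples = deque(maxlen=window)

    def add_sample(self, occupancy):
        self.samples.append(occupancy)

    def estimate(self):
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)

est = SmoothedOccupancyEstimator(window=4)
# Steady low load for several sample periods...
for _ in range(4):
    est.add_sample(0.30)
# ...then a sudden surge to 95% occupancy:
est.add_sample(0.95)
# The smoothed estimate still reads far below the true load,
# so a control driven by it reacts to the surge only after the fact.
print(est.estimate())  # averages 0.30, 0.30, 0.30, 0.95 -> 0.4625
```

The sketch shows why a surge can damage the processor before a running-average-based control ever registers it: several more multi-second samples must accumulate before the estimate approaches the true occupancy.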
Overload controls in telephone switching systems, such as the #5ESS switching system manufactured by Lucent Technologies, attempt to keep the processor that manages the call processing (Switching Module Processor in the #5ESS) running at a predetermined utilization. Since the amount of processing time needed to accept or throw out a work request is substantially less than the amount of processing time needed to process the work request (for instance, setting up a telephone call), existing overload controls usually do not control the amount of work accepted from peripherals. Typically, peripherals (such as line units) are polled to see if any work (such as callers going off-hook or on-hook) exists, and the work requests are time-stamped and queued in temporary storage queues. As stated above, the time required to poll the peripherals and queue the requests is a small fraction of the time required to process the desired work. This polling is usually done at a high priority and is done periodically. Once polling is halted, the work requests are "metered" out of the temporary storage and "real" work commences. The amount of requested work removed from temporary storage is determined by the overload control. This "real" work is usually done at a lower priority than the polling. Periodically, a scan is made of the queued work to see if any requests are queued too long; if such excessively delayed work is found, it is removed from temporary storage and discarded (this is sometimes called "cleanup activity"). If the cleanup activity takes too long, maintenance work may be scheduled to determine why a given type of cleanup is taking so long (for instance, a peripheral may be malfunctioning and generating false work requests). Since cleanup is believed to be a rare occurrence, maintenance work runs at a very low priority.
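The poll/meter/cleanup cycle described above can be sketched as follows. All names, the queue-age limit, and the timing scheme are hypothetical illustrations, not details of the #5ESS implementation.

```python
import time
from collections import deque

MAX_QUEUE_AGE = 5.0   # hypothetical: seconds before a queued request is stale

# Temporary storage queue of time-stamped work requests.
work_queue = deque()

def poll_peripherals(requests):
    """High-priority periodic pass: accept work requests from peripherals
    and time-stamp them into temporary storage."""
    now = time.monotonic()
    for request in requests:
        work_queue.append((now, request))

def meter_out(limit):
    """Lower-priority pass: unload at most `limit` requests for real work.
    The limit is the only knob the prior-art overload control adjusts."""
    served = []
    for _ in range(min(limit, len(work_queue))):
        _, request = work_queue.popleft()
        served.append(request)
    return served

def cleanup(now):
    """Periodic scan: discard requests that have been queued too long."""
    discarded = 0
    while work_queue and now - work_queue[0][0] > MAX_QUEUE_AGE:
        work_queue.popleft()
        discarded += 1
    return discarded
```

Note that `meter_out` is the sole point of control: polling always accepts everything offered, which is precisely why a surge can saturate the processor with acceptance and cleanup work before any real work is done.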
Under normal call processing conditions, equilibrium exists: calls are set up, calls are torn down, call processing operates normally, and there is no cleanup work or maintenance work initiated by excessively delayed cleanup work. The utilization of the processor is dominated by the real time expended in setting up and tearing down calls. In the case where the offered work load increases, there is an increase in queue loading work, which runs at a high priority and which slightly raises the processor's utilization. The existing overload control reduces the rate at which the temporary storage queues are unloaded to compensate for this activity, thereby reducing call processing activity. This results in a commensurate increase in call setup delays. A new equilibrium is typically reached where the incoming call request rate is equal to the call setup rate, although the setup delays increase as a result. If the increase in offered load continues, then the time spent by each work request in queue becomes excessive and canceled work request cleanup activity is initiated. This cleanup work is counterproductive in that it represents extra work for the processor and increases utilization of the processor but does not result in more call completions.
In this processor overload management system paradigm, if the offered load increases rapidly (a "surge"), then call processing can be momentarily terminated since the real-time capability of the processor is dedicated to inputting call setup requests. In addition, the overload control is not in control of the processor, since it is capable only of determining how many work requests are to be removed from the temporary storage queues. The processor is now operating at an extremely high utilization, with much of the work load consisting of high-priority work (polling peripherals and moving work requests to temporary storage), some lower-priority cleanup work that is processed very slowly, and some call processing work. If the cleanup activity is successful in removing user requests for service, and if the users are impatient to obtain service, then they try again to obtain service (for instance, using automatic redialers). The processor now sees not only "useful" work, but also user retries, cleanup work, and, possibly, maintenance work. Very few work requests are removed from the temporary storage queues since the majority of the processor's time is spent accepting work requests, moving them to temporary storage, throwing these requests away, and running cleanup work. Thus, existing processor overload management systems can substantially reduce system performance in severe overload conditions. These existing processor overload management systems are typically relatively slow to respond to sudden overloads and/or of limited operational range. As a result, these processor overload management systems cannot clamp the overload before the adverse effects caused by the overload impact the processor.
The above described problems are solved and a technical advance achieved by the self-adaptive processor overload control system, which provides real-time overload control and is fast to respond to processing overload conditions. The self-adaptive processor overload control system can detect surges and also has a dynamic range that can address overloads of significant size. It matches software operation to the CPU instruction cache operation, thereby increasing processor efficiency by reducing the average real time needed to process call activity.
The self-adaptive processor overload control system maintains a counter for each peripheral and sets a threshold value for each peripheral. The self-adaptive processor overload control system completely empties each temporary storage queue to obtain a higher cache hit ratio, since the code that serves each request resides in cache memory and successive requests of the same nature save on code retrieval time. That is, all the work associated with a given peripheral class ("class" being, for instance, line units, trunk units, and so forth) is processed before the next peripheral class's work is done. The self-adaptive processor overload control system dynamically adjusts the maximum number of work requests unloaded from a peripheral by starting low and then, if the processor's occupancy is low, rapidly increasing this maximum value. If an overload condition is detected, then the self-adaptive processor overload control system significantly reduces the maximum number of work requests that can be unloaded from a peripheral, to protect the processor. Once the overload condition has cleared, the self-adaptive processor overload control system resumes increasing this maximum unloading value. The overload is delegated outboard to the peripherals generating the overload of service requests rather than being concentrated at the processor.
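The adjustment of the per-peripheral unloading limit described above can be sketched as follows. The class name, constants, and ramp/cutback values are hypothetical choices for illustration; the source specifies only that the limit starts low, increases rapidly while occupancy is low, and is cut back significantly on overload.

```python
RAMP_STEP = 4          # hypothetical: how fast the limit grows at low occupancy
OVERLOAD_CUTBACK = 4   # hypothetical: divisor applied to the limit on overload
LIMIT_FLOOR = 1
LIMIT_CEILING = 256

class PeripheralControl:
    """Per-peripheral counter and unloading threshold, adjusted each cycle."""

    def __init__(self):
        self.unloaded = 0              # counter of requests unloaded this cycle
        self.max_unload = LIMIT_FLOOR  # threshold: starts low

    def adjust(self, occupancy_low, overload):
        if overload:
            # Overload detected: cut the limit back sharply to protect
            # the processor.
            self.max_unload = max(LIMIT_FLOOR,
                                  self.max_unload // OVERLOAD_CUTBACK)
        elif occupancy_low:
            # Occupancy is low (or the overload has cleared): ramp the
            # limit back up rapidly.
            self.max_unload = min(LIMIT_CEILING, self.max_unload + RAMP_STEP)

ctl = PeripheralControl()
for _ in range(10):
    ctl.adjust(occupancy_low=True, overload=False)   # limit climbs 1 -> 41
ctl.adjust(occupancy_low=False, overload=True)       # surge: 41 // 4 -> 10
```

The asymmetry (gradual increase, sharp decrease) is the point: a surge is clamped within a single adjustment cycle, while capacity is restored incrementally once the overload clears, pushing the backlog outboard to the peripherals instead of the processor.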