With any computer system, whether multi-user or single user, it is necessary for the central processing unit to issue commands to sub-systems of the computer and to wait while these sub-systems accomplish their designated function and return control to the CPU. Efficient computer management requires that this waiting time, during which the central processing unit is essentially idle, be kept to a minimum. For example, in a single user environment the central processor might be waiting while a peripheral device such as a disk drive accesses its stored information. This entails steps that are repeated for each access. Since the instruction cycle time of the central processing unit is much less than the access time of the various peripheral devices, substantial time is saved if the number of calls on the device is reduced. Therefore, some mechanism is needed to minimize the number of independent calls upon the peripheral device for information. This may be accomplished, for example, by employing caches, which are simply portions of memory set aside to hold more information from the peripheral device than is immediately required. During a single access, an intelligent guess is made as to what information may be needed in a subsequent access, and that information is stored in the cache. A subsequent access may then be served quickly from the cache rather than by returning to the peripheral device.
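The caching idea described above can be illustrated with a minimal sketch. The class name `ReadAheadCache`, the `read_ahead` parameter, and the list standing in for a disk are all hypothetical; the "intelligent guess" is assumed here to be simple sequential read-ahead, which is only one of many possible policies.

```python
class ReadAheadCache:
    """Serve block reads from memory, fetching extra blocks on each device access."""

    def __init__(self, device, read_ahead=4):
        self.device = device          # hypothetical: any sequence of blocks
        self.read_ahead = read_ahead  # extra blocks fetched per device access
        self.cache = {}
        self.device_calls = 0         # counts accesses to the slow device

    def read(self, block_no):
        if block_no not in self.cache:
            # One device access fetches the requested block plus the blocks
            # that follow it, guessing that later reads will be sequential.
            self.device_calls += 1
            end = min(block_no + 1 + self.read_ahead, len(self.device))
            for n in range(block_no, end):
                self.cache[n] = self.device[n]
        return self.cache[block_no]


disk = [f"block-{n}" for n in range(20)]
cache = ReadAheadCache(disk, read_ahead=4)
data = [cache.read(n) for n in range(10)]
print(cache.device_calls)  # 2 device accesses instead of 10
```

Under this assumed access pattern, ten sequential reads cost two device accesses, since each access brings in the requested block and the four that follow it.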
In mainframe computers servicing multi-user environments, the opportunity exists for analogous savings at the task dispatch level. In this context a task is a request for immediate service by a user. Such a task may entail interaction with data base services of the computer. In particular, there are typically applications running on the mainframe, such as data processing applications and data base access applications in which control is transferred by the operating system to underlying routines in response to user requests. In each instance a certain amount of time is required to initialize routines and to access related peripheral equipment such as data storage devices. When these routines have several tasks to perform there is significant CPU overhead involved in passing control from the operating system and returning control. It is therefore advantageous that they execute these tasks at one time. On the other hand it takes time to wait for the multiple tasks to be assembled since they typically arrive at random times. This wait can impact response time. The ideal solution would be to batch as much work as possible in some minimal time frame that does not impact response time.
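The trade-off described above can be made concrete with a small worked calculation. The function names and all figures below (200 microseconds of control-transfer overhead, 50 microseconds of work per task, one arrival every 5 milliseconds) are illustrative assumptions and are not taken from any measured system.

```python
def cpu_cost_per_task_us(batch_size, transfer_overhead_us, work_us):
    """CPU cost per task when one control transfer is shared by a whole batch."""
    return transfer_overhead_us / batch_size + work_us


def first_task_wait_ms(batch_size, interarrival_ms):
    """Time the first task waits for the batch to fill, assuming evenly spaced arrivals."""
    return (batch_size - 1) * interarrival_ms


for size in (1, 4, 16):
    cost = cpu_cost_per_task_us(size, transfer_overhead_us=200, work_us=50)
    wait = first_task_wait_ms(size, interarrival_ms=5)
    print(size, cost, wait)
# 1 250.0 0
# 4 100.0 15
# 16 62.5 75
```

With these assumed figures, larger batches reduce the CPU cost per task with diminishing returns, while the delay imposed on the first task in each batch grows linearly.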
Systems have been developed wherein groups of similar requests are batched together before control is transferred to the underlying application. An example is the Multiple Region Batching Option Priority Dispatching Mechanism employed in connection with the IBM Customer Information Control System (CICS). A critical parameter for any such computer system is the number of calls for which the system will pause and wait for an accumulation before beginning to execute the calls in a batch. If this number is too small, little savings are achieved. If the number is too large, the system spends time waiting for the batch to become filled and the response time deteriorates. What is needed is a system which can dynamically respond to the on-going activity in the computer in order to select an appropriate number of requests to be stored in a single batch.
Attempts to carry out this feature have, to date, been unsatisfactory. IBM has tried to employ such an improvement in CICS, its on-line transaction processing system, which batches work before the transaction processing system is dispatched. This system took the most straightforward approach: it simply set a fixed batch size. The problem with this system was that it took into account neither the amount of work outstanding nor the amount of time necessary to assemble a batch. Where the batch size was high compared to the amount of work that could arrive, a timer became the dispatch mechanism, defeating the advantage of batching by causing response time to degrade. The timer could only be set to a minimum of 100 milliseconds and defaulted to 3 seconds. Even a 100-millisecond timer was too long and took a great deal of CPU resources to support. Where the batch size was set to a minimum, most if not all of the savings were eliminated.
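The failure mode of a fixed batch size can be sketched with a small simulation. This is not IBM's implementation; the function names are hypothetical, and the sketch assumes that the timer starts when the first task of a batch arrives and that a batch is dispatched when it fills or when the timer expires, whichever comes first.

```python
def dispatch_times(arrivals_ms, batch_size, timer_ms):
    """Return a list of (dispatch_time, tasks) pairs, one per batch."""
    batches, current = [], []
    for t in sorted(arrivals_ms):
        if current and t > current[0] + timer_ms:
            # The timer expired before the batch filled.
            batches.append((current[0] + timer_ms, current))
            current = []
        current.append(t)
        if len(current) == batch_size:
            # The batch filled; dispatch as the last task arrives.
            batches.append((t, current))
            current = []
    if current:
        # Leftover tasks are dispatched only when the timer expires.
        batches.append((current[0] + timer_ms, current))
    return batches


def worst_wait_ms(batches):
    """Longest time any task waited between arrival and dispatch."""
    return max(when - tasks[0] for when, tasks in batches)


arrivals = [0, 10, 20, 30]  # four tasks, one every 10 ms (assumed workload)
print(worst_wait_ms(dispatch_times(arrivals, batch_size=8, timer_ms=100)))  # 100
print(worst_wait_ms(dispatch_times(arrivals, batch_size=2, timer_ms=100)))  # 10
```

With a batch size of 8 and only four tasks arriving, the batch never fills and the 100-millisecond timer alone determines when work is dispatched. With a batch size of 2, the wait is short, but each control transfer is shared by only two tasks.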
IBM also provided an implementation where the batch size was set at a particular fraction of that user's outstanding file requests. This was, however, unrelated to any time constraint or to the nature of the events waited on. This system proved completely inadequate where a large number of events were outstanding, because the fraction then required too many events to be completed before dispatch. It was dropped by IBM in its latest release.
Each of these prior art systems degraded response time excessively, either by setting batch sizes without regard to outstanding work or by relying on slow timers. The use of the standard timer mechanism in the IBM implementation also consumed excessive resources. In fact, setting the timer at the interval required for good response time, 20-40 milliseconds, would use up an order of magnitude more resources than the system could ever save. IBM therefore used a default which, however, was many times too large to produce subsecond response times for typical applications. In current IBM technology the minimum timer length is on the order of 100 milliseconds, with a default of 3 seconds, and both values are much too large.