1. Field of the Invention
This invention relates to the sampling of digital counters and more specifically to the dynamic adjustment of sampling times to achieve improved statistical accuracy.
2. Description of the Prior Art
The use of counters to collect various data concerning the operation of communication networks has become important in the management of such networks. For example, such counters monitor the operation of the communication function in order to detect degradation, impending failure, and actual failure of the various network components and links.
As implied in the references mentioned below, many computer systems collect statistical data on job interarrival times, response times, disc and RAM accesses, CPU and other resource utilizations, job categories, packet sizes, etc. The data can be collected through the use of counters that are periodically sampled. When full probability distributions of such measures of system performance are required, a set of counters that are incremented in a cyclic or round-robin fashion may be constructed. These counters are then sampled at a frequency that is at least as high as that of the incrementing cycle, so that no data is lost.
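The relationship between the incrementing cycle and the sampling frequency can be illustrated with a minimal sketch. It assumes one reading of the arrangement described above, in which each counter of the set accumulates events for one interval before the next counter takes over; the class `CyclicCounters`, the function `run`, and all parameter names are hypothetical and are not taken from any of the references.

```python
class CyclicCounters:
    """Hypothetical set of counters incremented in round-robin fashion."""

    def __init__(self, n_slots):
        self.counts = [0] * n_slots
        self.pending = [False] * n_slots  # True if a slot holds unsampled data
        self.current = 0
        self.lost = 0  # events overwritten before they were sampled

    def record(self, events=1):
        """Count events in the counter that is currently active."""
        self.counts[self.current] += events
        self.pending[self.current] = True

    def advance(self):
        """Move to the next counter; unsampled data found there is lost."""
        self.current = (self.current + 1) % len(self.counts)
        if self.pending[self.current]:
            self.lost += self.counts[self.current]
        self.counts[self.current] = 0
        self.pending[self.current] = False

    def sample(self):
        """Read and clear every counter that holds unsampled data."""
        total = 0
        for i in range(len(self.counts)):
            if self.pending[i]:
                total += self.counts[i]
                self.counts[i] = 0
                self.pending[i] = False
        return total


def run(n_slots, intervals, sample_every):
    """Record one event per interval and sample every `sample_every` intervals."""
    ring = CyclicCounters(n_slots)
    collected = 0
    for t in range(1, intervals + 1):
        ring.record(1)
        if t % sample_every == 0:
            collected += ring.sample()
        ring.advance()
    collected += ring.sample()
    return collected, ring.lost


# Sampling once per full cycle of four counters loses nothing;
# sampling less often than the cycle overwrites unsampled counts.
print(run(n_slots=4, intervals=12, sample_every=4))
print(run(n_slots=4, intervals=12, sample_every=6))
```

In this sketch, events are lost only when the sampler visits the counters less often than the set wraps around, which is the condition the sampling frequency is chosen to avoid.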
IBM Technical Disclosure Bulletin, Volume 33, Number 6B, November 1990, pages 72-75, teaches a technique for counting error events over a fixed interval of time T. The count n is compared to a threshold N in order to manage a network by issuing alerts or alarms when certain criteria are met. The article does not discuss whether the error event sampling is to be in hardware or microcode, nor does it address resource utilization.
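The fixed-interval threshold comparison can be sketched as follows. This is only an illustration under stated assumptions: the article's actual alert criteria are not reproduced here, so the sketch assumes that an alert is raised when the count n in a window of length T reaches or exceeds N. The function name `threshold_alerts` and its parameters are hypothetical.

```python
def threshold_alerts(event_times, T, N):
    """Return the start times of fixed windows of length T whose count n >= N.

    Assumption: the alert criterion is n >= N; the referenced article may
    use different criteria.
    """
    counts = {}
    for t in event_times:
        window = int(t // T)  # index of the fixed interval containing t
        counts[window] = counts.get(window, 0) + 1
    return [w * T for w in sorted(counts) if counts[w] >= N]


# Three events fall in [0, 10), one in [10, 20), and four in [20, 30).
errors = [0.5, 1.0, 1.5, 12.0, 21, 22, 23, 24]
print(threshold_alerts(errors, T=10, N=3))
```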
IBM Technical Disclosure Bulletin, Volume 34, Number 4A, September 1991, pages 51-52, teaches a hybrid hardware/software implementation having a limited set of internal hardware counters which actually count the error events. Microcode is used to programmably connect the counters and to accumulate the counts into main storage. When a counter is half full, and/or when the event being counted is to be changed, the microcode adds the count to the appropriate field in main storage and possibly switches the event being counted. The microcode can thereby use the same counter in timeslice mode to monitor more than one event. The timeslice period is also programmable, but there is no mention of dynamically varying sample periods.
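The accumulation scheme described in this reference can be modeled in software as a rough sketch. The class `TimeslicedCounter`, the counter width, and the event names are hypothetical; the sketch only mirrors the two flush conditions stated above (counter half full, or event being switched) and does not represent the actual microcode.

```python
class TimeslicedCounter:
    """Hypothetical model of one narrow hardware counter shared by several events."""

    def __init__(self, bits=8):
        self.capacity = (1 << bits) - 1          # largest value the counter holds
        self.half = (self.capacity + 1) // 2     # flush threshold: half full
        self.count = 0
        self.event = None                        # event currently being counted
        self.main_store = {}                     # accumulated totals per event

    def _flush(self):
        """Add the hardware count to the event's field in main storage."""
        if self.event is not None and self.count:
            self.main_store[self.event] = (
                self.main_store.get(self.event, 0) + self.count
            )
        self.count = 0

    def select(self, event):
        """Switch the counter to a different event, saving the old count first."""
        self._flush()
        self.event = event

    def increment(self):
        """Count one occurrence; flush when the counter becomes half full."""
        self.count += 1
        if self.count >= self.half:
            self._flush()

    def totals(self):
        """Flush any residual count and return the accumulated totals."""
        self._flush()
        return dict(self.main_store)


counter = TimeslicedCounter(bits=4)   # 4-bit counter, flushed at a count of 8
counter.select("crc_error")
for _ in range(20):
    counter.increment()
counter.select("timeout")
for _ in range(3):
    counter.increment()
print(counter.totals())
```

Flushing at the half-full point leaves headroom so that the narrow counter does not overflow before the accumulated value is moved to main storage.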
IBM Technical Disclosure Bulletin, Volume 35, Number 7, December 1992, pages 103-107, teaches a two-stage method for managing link performance counters. The first stage compares a count of errored seconds kept over a short period of time with a threshold value, as described in the first reference mentioned above. The second stage accumulates and transforms data gathered over a long period of time by repeated operation of the first stage in order to detect the onset of subtle long-term transmission impairments without giving false alarms. There is no mention of dynamically varied sampling periods or statistical accuracy.
U.S. Pat. No. 4,996,871 teaches re-adjusting the frequency at which signals are sampled in response to any change in the phase of the velocity signal so that the sampling frequency remains an integral multiple of the fundamental frequency of the signals being sampled. Although this reference changes the frequency dynamically, the signals being sampled are not stochastic and therefore the method being used will not function properly in the applications wherein the invention finds utility.
A problem that exists with such monitoring systems is that the arrival of data is stochastic. If the system designer chooses a sampling period long enough that the sampling does not adversely affect system operation, there is a risk that the sampling will be too slow during busy periods, thereby seriously reducing the statistical accuracy of the distribution estimate. If the counters are instead sampled at a faster rate, the sampling itself begins to degrade the performance of necessary operations that take place while data is not arriving at a fast rate. Because these operations may well be critical to the job being measured, the accuracy of the measurements is again reduced.