This invention relates to the field of digital data sorting circuits and techniques, and more particularly to the field of data sorting circuits used to accomplish software performance analysis by counting the occurrence of address calls within a large number of address ranges of varying sizes.
In analyzing the performance of a computer system as it executes a program, it is frequently desirable to be able to count the occurrence of each of a variety of different digital patterns as rapidly as possible with a minimum of hardware. For example, it may be desirable to be able to determine the amount of time that the program spends within different address ranges as it is executed. With such a capability, it may be possible to improve the efficiency of a program by analyzing how much time the program spends within each subroutine as the program is executed and making appropriate enhancements to eliminate inefficiencies in those parts of the program that are most frequently invoked.
Prior art efforts to provide this sort of resource have included a variety of approaches. If the results do not need to be immediately available or perfectly accurate, a statistical, non-real-time approach can be effectively used. The address (or other) data present on the bus of a microprocessor can be stored at regular sample intervals. During the time between samples, the last sample data element can be compared by software with the address ranges of interest and a counter incremented to reflect the result. Such an approach is described in co-pending U.S. patent application by Clark et al., Ser. No. 06/812,085, abandoned. While the method described in that application will work for a large number of ranges and requires only a minimum of dedicated hardware, the non-real-time, statistical nature of the result is unsuitable for many applications.
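The sampling approach described above can be sketched in software as follows. This is a minimal illustration only, not the method of the cited application: the function name `sample_histogram`, the list-based address trace, and the inclusive `(low, high)` range tuples are assumptions introduced here.

```python
def sample_histogram(address_trace, ranges, sample_interval):
    """Count sampled addresses against inclusive (low, high) ranges.

    Only every sample_interval-th bus address is examined, so the
    result is a statistical estimate rather than an exact count.
    """
    counts = [0] * len(ranges)
    for address in address_trace[::sample_interval]:
        # Software comparison of the stored sample with each range of interest.
        for index, (low, high) in enumerate(ranges):
            if low <= address <= high:
                counts[index] += 1
                break
    return counts


trace = [0x100, 0x104, 0x108, 0x200, 0x204, 0x100, 0x104, 0x300]
ranges = [(0x100, 0x1FF), (0x200, 0x2FF)]
exact = sample_histogram(trace, ranges, 1)    # every address examined
sampled = sample_histogram(trace, ranges, 2)  # every second address examined
```

With this trace, examining every address gives counts of 5 and 2, while sampling every second address gives 3 and 1, which illustrates why the sampled result is only approximate.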
Another approach, but one which is much more hardware intensive, is to dedicate a word recognizer and a counter to the task of recognizing each individual pattern and counting its occurrences. However, for a large number of such patterns, this is prohibitively expensive, since additional circuitry is required for each pattern of interest. Moreover, because each address or data pattern is monitored separately, the counts for each address in a range of addresses have to be added together to produce a total for the whole range.
This approach can be made somewhat more efficient by using only one counter to do all of the counting but storing separate sums for each different pattern. Such a technique is described in a paper by Steven Kerman, entitled "A Facility for Analyzing Microprocessor System Performance", published in the Digest of Papers, IEEE Compcon, 1979. In this system, a large number of counters are simulated by one counter and a random access memory. During the occurrence of each event the counter counts clock pulses and its concluding count is added to a stored value in memory. The same adder is successively employed to update many different memory locations. While this approach is more effective than a multiplicity of counters, it is still unnecessarily hardware intensive and inefficient because individual addresses are being monitored and the results summed to produce results for a range of addresses.
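The single-counter-plus-memory arrangement can be modeled roughly as shown below. This is a simplified software sketch under stated assumptions, not a description of the actual circuit in the cited paper: the names `accumulate` and `range_total`, and the representation of each event as an `(address, clock_pulses)` pair, are hypothetical.

```python
def accumulate(events, memory_size):
    """Model one shared counter and one adder updating many memory sums."""
    memory = [0] * memory_size
    for address, clock_pulses in events:
        counter = 0
        for _ in range(clock_pulses):
            counter += 1              # the single counter counts clock pulses
        memory[address] += counter    # the adder updates the stored sum
    return memory


def range_total(memory, low, high):
    """Sum the per-address results to obtain a total for a whole range."""
    return sum(memory[low:high + 1])


events = [(2, 5), (3, 1), (2, 4), (7, 2)]
memory = accumulate(events, 8)
total = range_total(memory, 2, 3)
```

The separate `range_total` step corresponds to the drawback noted above: totals for a range are available only after the individual address sums have been added together.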
An improvement to this approach is disclosed in U.S. Pat. No. 4,774,681 to Frish for a "Method and Apparatus for Providing a Histogram", Sep. 27, 1988. Frish improved on the method of Kerman by eliminating the adder and substituting a linear feedback shift register for the conventional counter. By eliminating the adder, significant time savings are made possible, increasing the maximum speed of operation attainable. Similarly, the substitution of the linear feedback shift register for the conventional counter also produces some time savings. However, this is still the same basic approach, and it still suffers from the problem of having to sum the individual address counts in order to obtain a count of the occurrence of data elements over an entire range of addresses or other data elements.
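The general idea of counting with a linear feedback shift register instead of a binary counter can be illustrated as follows. This sketch is not taken from the Frish patent. The 4-bit register width, the feedback taps (polynomial x^4 + x^3 + 1), the seed value, and all function names are assumptions chosen for illustration. Each stored value is advanced by one shift-and-XOR step per event, so no adder is involved, and the resulting register states are translated into ordinary counts afterward.

```python
def lfsr_step(state):
    """Advance a 4-bit LFSR one step (taps at bits 3 and 2)."""
    feedback = ((state >> 3) ^ (state >> 2)) & 1
    return ((state << 1) | feedback) & 0xF


def build_decode_table(seed):
    """Map each LFSR state to the number of steps taken from the seed."""
    table = {}
    state, count = seed, 0
    while state not in table:
        table[state] = count
        state = lfsr_step(state)
        count += 1
    return table


SEED = 1  # assumed starting state; the all-zero state must be avoided
DECODE = build_decode_table(SEED)


def count_events(addresses, memory_size):
    """Count events per address by stepping a stored LFSR state, then decode."""
    memory = [SEED] * memory_size
    for address in addresses:
        memory[address] = lfsr_step(memory[address])  # no carry chain, no adder
    return [DECODE[state] for state in memory]


counts = count_events([1, 1, 3, 1], 4)
```

This 4-bit register passes through 15 distinct states before repeating, so the sketch can only distinguish counts from 0 to 14; a practical register would be wider.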
If only a coarse binning of data is required, the approaches just discussed can be applied to only the higher order bits of the data and results obtained for the symmetrical ranges of lower order data that each of these then represents. It is, however, frequently desirable, especially when monitoring program execution during software performance analysis, to be able to define non-symmetrical data ranges. Such non-symmetrical definitions are necessary, for example, to monitor the time spent within different subroutines of varying sizes. One way to monitor non-symmetrical address ranges is to dedicate a number of programmable range recognizers, such as those disclosed in "Programmable Range Recognizer for a Logic Analyzer", U.S. Pat. No. 4,475,237 to Glasby, and count the occurrences of their output signals. A somewhat similar approach to range recognition is shown in U.S. Pat. No. 4,692,897 for an "Arrangement for Dynamic Range Checking or Matching for Digital Values in a Software System". Both of these approaches, however, require proportionally increased hardware resources to monitor a large number of address ranges.
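The difference between coarse, equal-sized binning and ranges of arbitrary size can be shown with a short software sketch. The names `coarse_bin` and `find_range`, the example addresses, and the use of a binary search over sorted, non-overlapping ranges are assumptions made for illustration; they do not represent the hardware range recognizers of the cited patents.

```python
import bisect


def coarse_bin(address, low_bits):
    """Bin by high-order bits: every bin spans exactly 2**low_bits addresses."""
    return address >> low_bits


def find_range(address, starts, ends):
    """Return the index of the inclusive range holding address, or None.

    starts and ends describe sorted, non-overlapping ranges of any size.
    """
    index = bisect.bisect_right(starts, address) - 1
    if index >= 0 and address <= ends[index]:
        return index
    return None


# Three hypothetical subroutines of very different sizes.
starts = [0x1000, 0x1040, 0x2000]
ends = [0x103F, 0x17FF, 0x2003]

same_bin = coarse_bin(0x1000, 12) == coarse_bin(0x1040, 12)
first = find_range(0x1000, starts, ends)
second = find_range(0x1040, starts, ends)
gap = find_range(0x1800, starts, ends)
```

Here coarse binning on the upper bits places the first two subroutines in the same bin, while the arbitrary-range lookup distinguishes them and reports addresses that fall in no range of interest.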
What is desired is a way of accomplishing software performance analysis that operates rapidly to count each occurrence of data elements within each of a large number of data ranges, including data ranges of arbitrary size, and that requires a proportionally decreasing amount of dedicated hardware resources as the number of data ranges to be monitored is increased.