This invention relates generally to a method and apparatus for performing pattern recognition analysis and, more particularly, is directed to a method and apparatus for performing pattern recognition analysis on substantially periodic and/or recurring signals.
In various technologies, it is necessary and desirable to monitor and detect abnormalities in periodic and/or recurring signals. For example, in cardiology, a Holter type tape recording device has been used to record electrocardiographic (EKG) data over a 24 hour time span. During this time span, there may be 100,000 heartbeats. Since approximately 98% of all heartbeats are normal, diagnosis is often based on the existence of only one or two abnormal beat types, occurring perhaps only a total of a half-dozen times on the tape.
Conventionally, recognition of a periodic waveform, such as an EKG waveform, has been performed by comparing the EKG waveform to a fixed or standard waveform, for example, as disclosed in U.S. Pat. No. 4,023,276. In order to determine the fixed waveform, U.S. Pat. No. 4,124,894 teaches the detection of EKG signals from a patient to generate a model waveform for that patient which is resolved into contour intervals, for example, PR, QR, ST and T intervals, to establish a set of interval limits for each patient to which samples of subsequently monitored waveforms are compared. The waveforms may be aligned with their fiducial or starting points located at the centroids of the EKG waveform, for example, as described in U.S. Pat. No. 4,170,992. U.S. Pat. No. 3,874,370 discloses the continuous recirculation of existing EKG waveforms to update the reference waveform.
After the reference or standard waveform is fixed, known devices utilize complex mathematical analysis in comparing subsequent EKG waveforms to the standard waveform. This includes, for example, an amplitude analysis whereby correlation is performed by determining the normalized area of non-overlap, as disclosed in U.S. Pat. No. 4,170,992. See also U.S. Pat. Nos. 4,211,237 and 4,456,959.
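As an illustration of the kind of amplitude comparison described above, the following is a minimal sketch of one plausible "normalized area of non-overlap" measure. The function name, the normalization by the combined area under both waveforms, and the sample values are assumptions made for illustration only; they are not taken from the cited patents.

```python
def normalized_nonoverlap_area(beat, reference):
    """Area between two equal-length sampled waveforms, divided by the
    combined area under both (0.0 means the waveforms are identical).

    Assumed definition for illustration; not the cited patents' formula.
    """
    if len(beat) != len(reference):
        raise ValueError("waveforms must have the same number of samples")
    # Area where the two curves do not overlap (sample-wise absolute gap).
    nonoverlap = sum(abs(b - r) for b, r in zip(beat, reference))
    # Combined area under both curves, used as the normalizing term.
    total = sum(abs(b) + abs(r) for b, r in zip(beat, reference))
    return nonoverlap / total if total else 0.0


# Hypothetical sample values, already aligned at a common fiducial point.
reference = [0.0, 0.1, 1.0, 0.2, 0.0]
normal_beat = [0.0, 0.1, 0.9, 0.2, 0.0]
abnormal_beat = [0.0, 0.6, 0.3, -0.4, 0.0]

normal_score = normalized_nonoverlap_area(normal_beat, reference)
abnormal_score = normalized_nonoverlap_area(abnormal_beat, reference)
```

Under this assumed definition, a beat that closely follows the reference yields a score near zero, while a beat of different shape yields a larger score. Every sample of every beat must be processed, which suggests why such comparisons were costly over a 24 hour recording.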
Such complex mathematical analysis may require the use of a mainframe or mini-computer, as disclosed in U.S. Pat. No. 3,504,164, and has typically been prohibitively expensive in terms of execution time and hardware costs, generally using a large amount of memory. A further deficiency with such mathematical analysis is that the computations are generally not performed in a real-time mode. In other words, the computations occur at a time much later than the occurrence of the event, that is, well after the EKG waveform has occurred. When dealing with periodic signals, such as EKG waveforms, where it is imperative to have up-to-date data, such mathematical analysis may not be satisfactory. For example, in a critical care situation involving a patient continuously connected to an EKG machine, certain types of abnormal EKG waveforms may immediately precede massive heart damage or death. Quick recognition of such waveforms, embedded among long sequences of normal or non-threatening waveforms, is therefore desirable.
In addition, and related to the problem of real-time analysis, it would also be desirable to characterize the different waveforms, so that easy and ready recognition thereof can be achieved. For example, if one deviated EKG waveform is generated more often than it should be, this could be an indication of a heart problem. With such complex mathematical analysis, such characterization becomes practically impossible.
Other applications in the EKG and other pattern recognition fields are disclosed in U.S. Pat. Nos. 2,648,822; 3,616,791; 3,654,916; 3,755,783; 3,779,237; 3,821,948; 3,878,832; 3,903,874; 3,940,692; 4,211,238; 4,417,306; 4,446,872; 4,453,551; and 4,466,440.
Another field where it is desirable to monitor periodic waveforms is with rotating machinery, such as gas turbines and high speed motors, where catastrophic component failure is typically preceded for a short time by slight changes in the shape of the characteristic acoustical waveform of the machine. In such case, the detection of failure-indicative acoustical patterns could be used to shut down the affected machine in an orderly manner while the failure was minor or limited.
Still another field where it is desirable to monitor periodic waveforms is with oil well dipmeter logs, where it is necessary to interpret readings of physical properties of the ground surrounding a bore hole as a function of distance down the bore hole, to detect whether hydrocarbons are present, whether there is a change in tilt of underground formations, and the like. In some cases, miles of readings may be taken from several sensors at once at intervals of an inch or less. Such logs must then be analyzed.
The same problems of monitoring and detection discussed above with respect to EKG waveforms apply with equal force to rotating machinery, oil well dipmeter logs and the like. Thus, in each field, it is necessary to recognize problems from raw data which contains relatively small but significant amounts of diagnostic or predictive information of interest.
The use of pattern recognition techniques to detect similar waveforms is also known. Generally, with pattern recognition techniques, the first step is to detect and establish the patterns and pattern boundaries (and possibly the fiducial or starting points). As a common alternative, Fourier transform snapshots are periodically derived from the input signal, which implicitly provide pattern boundaries consisting of frequency limits and a fiducial point consisting of the zero or center frequency. Such location of pattern boundaries may be performed interactively and iteratively with known pattern recognition algorithms.
However, use of a comparison operation in the first step is typically prohibitively expensive in terms of the execution time and hardware cost. Thus, continuous, simultaneous cross-correlation between an incoming signal and many possible matching patterns, for example, using a least squares difference operation, is conceptually straightforward, but involves a great deal of computational expense. Furthermore, cross-correlation does not necessarily yield useful comparison indices between patterns. This is true, for example, when patterns are shifted with respect to each other by a large amount without altering the general shape of the pattern. In such case, prior to detecting the pattern features, it is necessary to first detect such shifting of the pattern. Because of these difficulties, pattern boundary detection is most often performed by some type of simple threshold-crossing detection, rather than by pattern matching. As a result, such detection is affected by random noise and the like.
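The cost and the shift sensitivity described above can be seen in a minimal sketch of a least squares difference comparison. The function names and the sample values are hypothetical and serve only to illustrate the general technique.

```python
def squared_difference(a, b):
    """Sum of squared sample-wise differences between two sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_alignment(signal, template):
    """Slide the template across the signal and return (offset, score)
    for the position with the smallest squared difference.

    Each offset costs len(template) operations, so the total work grows
    with signal length times template length, per template tested.
    """
    best = None
    for offset in range(len(signal) - len(template) + 1):
        window = signal[offset:offset + len(template)]
        score = squared_difference(window, template)
        if best is None or score < best[1]:
            best = (offset, score)
    return best


template = [0, 0, 1, 4, 1, 0, 0]
shifted = [0, 0, 0, 0, 1, 4, 1]  # same shape, moved two samples to the right

# Compared in place, the shifted copy looks very different from the template.
in_place_score = squared_difference(template, shifted)

# Only an exhaustive search over offsets recovers the match.
signal = [0, 0, 0] + template + [0, 0]
offset, score = best_alignment(signal, template)
```

In this sketch the shifted copy has exactly the same shape as the template, yet the in-place comparison reports a large difference. The match is recovered only by testing every offset, and that search must be repeated for each candidate pattern.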
The second step in pattern recognition involves the derivation of sets of indicative parameters or features of the isolated pattern to be detected, generally by computational and comparative hardware and software. A feature of a pattern can be considered as a connected subset of the pattern, for example, the T wave in an EKG pattern, or a notch in a turbine's acoustical signal. The features or parameters of a pattern combine to form a "signature" of the pattern. Generally, however, the types of signatures that can be created have been constrained by the comparison methods available or specially designed for the third step.
The third step in conventional pattern recognition techniques is the step of generating an index of similarity between the isolated pattern to be detected and previously defined patterns. However, conventional techniques are still relatively primitive. Generally, a specialized cross-correlation or comparison algorithm must be developed for the particular application. This has been detrimental due to the difficulty in developing such algorithms and dissatisfaction with the end performance of the pattern matching system.
Although generalized comparison algorithms have been generated for the third step, known as "rule-based" algorithms, such algorithms employ too many voting ("if" statements) and branching schemes to detect whether the parameters or features derived in the second step are defined by certain limits. Specifically, this type of comparison algorithm examines target patterns according to certain predefined operations to match up the new pattern, that is, ballots are cast for appropriate patterns, or program branches must be followed before continuing evaluation. In sum, "rule-based" pattern matching systems are notoriously difficult to create, maintain and modify, and are non-generic by definition, such that rules from one application cannot be carried over to another application.
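The voting and branching structure of a "rule-based" comparison can be sketched as follows. The feature names, the class labels and the numeric limits are hypothetical values chosen purely for illustration; they are not clinical criteria and are not drawn from any cited reference.

```python
def classify_beat_by_rules(features):
    """Cast ballots for candidate beat classes using hard-coded limits.

    All names and thresholds are illustrative assumptions only.
    """
    votes = {"normal": 0, "premature": 0, "wide_complex": 0}

    # Rule 1: is the QRS width within an assumed limit?
    if features["qrs_width_ms"] > 120:
        votes["wide_complex"] += 1
    else:
        votes["normal"] += 1

    # Rule 2: did the beat arrive earlier than an assumed limit?
    if features["rr_interval_ms"] < 600:
        votes["premature"] += 1
    else:
        votes["normal"] += 1

    # Rule 3: a missing P wave counts against a normal beat.
    if not features["has_p_wave"]:
        votes["wide_complex"] += 1

    # The class with the most ballots is reported.
    return max(votes, key=votes.get)


typical = {"qrs_width_ms": 90, "rr_interval_ms": 800, "has_p_wave": True}
suspect = {"qrs_width_ms": 150, "rr_interval_ms": 800, "has_p_wave": False}

typical_label = classify_beat_by_rules(typical)
suspect_label = classify_beat_by_rules(suspect)
```

Every limit and every class label in such a scheme is specific to the application. Adding a new beat type means revisiting each branch, and none of the rules would carry over to, for example, a turbine's acoustical signal.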
An overall shortcoming of all pattern recognition systems is the absence of a coherent method of representing continuous features when detecting discrete features, and vice versa, in such a way as to allow construction of a fast and simple way of comparing patterns represented as groups of such features. When discussed herein, a discrete feature indicates the existence or absence of a parameter in a pattern, such as the T wave in an EKG pattern. A continuous feature, on the other hand, is a quantifiable aspect of the pattern, for example, the width of the T wave in an EKG pattern.
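The distinction between discrete and continuous features can be illustrated with a small record type. This is only a sketch: the class name, the field names and the millisecond values are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TWaveFeature:
    """Hypothetical record combining both kinds of feature for one beat."""
    present: bool                      # discrete: does the T wave exist?
    width_ms: Optional[float] = None   # continuous: its width, if present


def width_difference(a, b):
    """Compare the continuous feature of two beats.

    Returns None when either T wave is absent, because a numeric width
    comparison is undefined when the discrete feature is missing.
    """
    if not (a.present and b.present):
        return None
    return abs(a.width_ms - b.width_ms)


beat_a = TWaveFeature(present=True, width_ms=160.0)
beat_b = TWaveFeature(present=True, width_ms=200.0)
beat_c = TWaveFeature(present=False)

same_kind = width_difference(beat_a, beat_b)   # both T waves present
mixed_kind = width_difference(beat_a, beat_c)  # one T wave absent
```

The second comparison has no numeric answer, which is the difficulty noted above: a purely continuous comparison cannot express that a feature is missing, and a purely discrete one cannot express how much two present features differ.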
As an example, in pattern recognition applied to EKG waveforms, it may be desirable to represent the presence or absence of specific discrete features, such as the Q wave. However, discrete information of this type is not preserved by continuous pattern comparison algorithms, such as those that use cross-correlation. In like manner, for "rule-based" algorithms used to detect discrete features, it may be desirable to detect certain continuous features, such as the duration of the discrete feature or the like. However, continuous information of this type is not easily handled by "rule-based" algorithms. Because of such considerations, awkward and hard-to-manage marriages of "rule-based" algorithms and analytical algorithms have generally been employed. Such systems, however, are easily fooled by new pattern variations, artifacts or noise.