1. Field of the Invention
This invention relates to pattern matching and, more specifically, to a method for recognizing and classifying a data pattern in a data stream by combining distance-based functions and fuzzy logic techniques.
2. Related Art
Automation is becoming increasingly important in today's world and lifestyle. This is evidenced by the growth of computer networks, automatic transaction and service machines, and the vast number of daily business transactions handled electronically. These business-related events are typically monitored for a range of reasons, such as accuracy, security, marketing, inventory, scheduling, and the like. Therefore, there is a need for computer software that quickly, efficiently, and correctly identifies patterns in data streams.
There are two principal methods for recognizing and classifying a data pattern in a single data stream. These methods are Fuzzy Adaptive Resonance Theory (Fuzzy ART) and Feature Mapping. Both methods are well known and widely published in the relevant arts. In summary, Fuzzy ART determines how closely two patterns match each other by calculating the closeness, or fuzziness, of the fit, e.g., two patterns are a seventy-five percent (75%) match. With Fuzzy ART, a user can set the acceptable value, or degree, of fuzziness for determining a match. Thus, Fuzzy ART monitors a data stream for patterns and groups them together based on the percentage of similarity.
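As a minimal sketch of the percentage match described above, the following Python assumes the match function commonly given in the Fuzzy ART literature: the component-wise minimum of a new pattern and a stored pattern, summed and divided by the sum of the new pattern. The names match_degree, is_match, and vigilance are illustrative only, and the attribute values are assumed to be scaled to the range 0 to 1.

```python
def match_degree(pattern, stored):
    """Return the fraction (0.0 to 1.0) of `pattern` covered by `stored`."""
    if len(pattern) != len(stored):
        raise ValueError("patterns must have the same length")
    total = sum(pattern)
    if total == 0:
        return 0.0  # an all-zero pattern carries nothing to match
    # Fuzzy AND: component-wise minimum of the two patterns.
    overlap = sum(min(p, s) for p, s in zip(pattern, stored))
    return overlap / total


def is_match(pattern, stored, vigilance):
    """`vigilance` is the user-set acceptable degree of fuzziness (assumed name)."""
    return match_degree(pattern, stored) >= vigilance


new_pattern = [1.0, 0.5, 0.25, 0.25]
known_pattern = [0.5, 0.5, 0.25, 0.25]
print(match_degree(new_pattern, known_pattern))  # 0.75, i.e. a 75% match
```

With a threshold of 0.7 the two patterns above would be grouped together; with a threshold of 0.8 they would not.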
One disadvantage of Fuzzy ART is that data patterns degrade over time. Conventional pattern matching systems, including Fuzzy ART, represent known data patterns as organized nodes, wherein each organized node maintains a set of attribute coefficients defining a specific known data pattern. When a new data pattern is identified in a new data stream, the new data pattern is compared against the known data patterns as represented by the organized nodes. If the new data pattern matches an organized node, the attribute coefficients of the matching organized node are updated to reflect the new data pattern. Because data patterns degrade over time, the attribute coefficients of the organized node corresponding to the data pattern also degrade over time until the organized node no longer accurately represents the data pattern. Eventually, the system must create a new organized node to represent the data pattern. Therefore, there is a need for a computer-based system for identifying and classifying data patterns that minimizes the re-creation of organized nodes.
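The text above does not state how the attribute coefficients are updated. As an illustration only, the sketch below assumes the learning rule commonly given in the Fuzzy ART literature, in which each coefficient moves toward the component-wise minimum of the matching pattern and the current coefficient. The names update_node and rate are hypothetical.

```python
def update_node(weights, pattern, rate=0.5):
    """Blend a node's coefficients toward min(pattern, weights).

    rate=1.0 replaces each coefficient with the minimum outright;
    smaller rates move it only part of the way.
    """
    return [rate * min(p, w) + (1.0 - rate) * w
            for p, w in zip(pattern, weights)]


node = [1.0, 1.0]
for _ in range(3):
    # The same slightly weaker pattern keeps matching this node.
    node = update_node(node, [0.5, 1.0])
print(node)  # [0.5625, 1.0]
```

Under this assumed rule a coefficient can only fall or stay the same, so a node that repeatedly matches a weakening pattern drifts away from the pattern it originally represented.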
In contrast to fuzzy logic techniques, Feature Mapping is based on distance measurements. When a first pattern is identified, the pattern matching system of Feature Mapping assigns the pattern to a point in N-dimensional space. A user then defines a radius around that point, thereby defining the perimeter of a cluster that corresponds to a specific data pattern, wherein the first data pattern is the centroid of the cluster. If a second pattern falls within the cluster as defined by the first pattern, then the second pattern matches the first pattern and belongs to the same cluster. If a point defining another pattern falls outside of the cluster, then a new pattern is detected, resulting in a new cluster being formed. Because a cluster is defined by the various points falling within the set radius, the detail of each data pattern is not lost: each pattern is maintained as a separate point in the cluster. This method also stabilizes the pattern identified by the cluster by moving the centroid of the cluster according to the points defining the cluster. Thus, Feature Mapping monitors a data stream for patterns and groups them together based on the distance from the centroid of the cluster.
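The clustering procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: Euclidean distance is assumed as the distance measure, the centroid is assumed to be the mean of a cluster's points, and a point within the radius of several clusters is assigned to the nearest one. The names centroid and assign are hypothetical.

```python
import math


def centroid(members):
    """Mean of the points in a cluster, one coordinate per dimension."""
    return tuple(sum(axis) / len(members) for axis in zip(*members))


def assign(point, clusters, radius):
    """Add `point` to the nearest cluster within `radius`, else start a new one.

    `clusters` is a list of lists of points; every point is kept, so the
    detail of each pattern is preserved. Returns the cluster index.
    """
    best_index, best_distance = None, radius
    for index, members in enumerate(clusters):
        distance = math.dist(point, centroid(members))
        if distance <= best_distance:
            best_index, best_distance = index, distance
    if best_index is None:
        clusters.append([point])  # outside every cluster: new pattern
        return len(clusters) - 1
    clusters[best_index].append(point)  # centroid shifts on next lookup
    return best_index


clusters = []
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
labels = [assign(p, clusters, radius=2.0) for p in points]
print(labels)  # [0, 0, 1]
```

After the second point is added, the centroid of the first cluster moves from (0.0, 0.0) to (0.5, 0.0), which is the stabilizing shift the text refers to.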
One disadvantage of Feature Mapping is the difficulty of determining a cluster's radius. Conventional systems use an arbitrary initial radius, which is then adjusted by trial and error. Therefore, Feature Mapping may not accurately reflect a known data pattern because the chosen radius of the clusters may be incorrect.
A second disadvantage of Feature Mapping is the ease with which two data streams containing the same data pattern are misclassified as two different data patterns (each a separate cluster) due to a single simple difference between the data streams. For example, suppose one data stream contains a data pattern in which the signal has a spike up at the signal's end, and a second data stream contains the same data pattern but with a spike down at the signal's end. Under Feature Mapping, these data patterns are classified in two different clusters and are thereby treated as two separate data patterns. In this scenario, however, the data patterns should be classified as the same data pattern. Therefore, there is a need for a computer-based system for identifying and classifying data patterns that handles minor discrepancies between data patterns without identifying and classifying such minor differences as a new data pattern.
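The spike example can be made concrete with a short sketch. The signal values, the eight-sample length, and the radius below are all hypothetical, and Euclidean distance is assumed as the distance measure.

```python
import math

# Two signals that agree on every sample except the last one.
base = [0.5] * 7
spike_up = base + [1.0]    # spike up at the signal's end
spike_down = base + [0.0]  # spike down at the signal's end

radius = 0.5  # hypothetical cluster radius around the first pattern

distance = math.dist(spike_up, spike_down)
same_cluster = distance <= radius
print(distance, same_cluster)  # 1.0 False
```

A single differing sample places the second signal outside the radius of the first, so a purely distance-based test opens a second cluster even though seven of the eight samples are identical.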