1. Field of the Invention
This invention relates to neural networks, and more specifically, to a self-organizing neural architecture that recognizes and learns patterns, and correlates the patterns over time, thereby predicting the future occurrence of patterns.
2. Related Art
Corporations and organizations are overwhelmed with the quantity of data they collect because nearly every business transaction is monitored. Transactions, or events, are monitored for various reasons, such as security, marketing, determining inventory or schedules, tracing travel routes, and trending analysis, among others. Each event has data associated with it which is monitored by various data sources including, but not limited to, news media, satellite transmissions, and/or electronic/digital transmissions. Therefore, an event is described by one or more data sources, and each event can be correlated to other events. A problem arises, however, when unrelated data is to be correlated.
The current method for analyzing a large collection of unrelated data is very inefficient and ineffective. First, a user decomposes all of the monitored data into one or more data streams, wherein each data stream represents a separate, well-defined area or type of data from a single data source. Once the decomposition is complete, the user assigns one or more data streams to an analyst for review. The analyst reviews the data streams assigned to him and identifies and locates one or more patterns, or events, in the data streams. Second, after all of the patterns are identified, the analyst attempts to correlate the patterns found within the data streams assigned to him with the patterns found in other data streams by other analysts. Therefore, under the current method, multiple persons review separate pieces of data in the hope of locating one or more patterns and correlations between those patterns across the multiple data streams.
To further complicate the process, a priority is often assigned to specific data streams. An experienced or senior analyst is then assigned to review one or more high priority data streams, and a junior analyst is assigned to data streams of low priority. This assignment, however, does not always identify the important data patterns because the important data patterns or correlations of data patterns are most often located in data streams of low priority or across multiple data streams. Therefore, under the conventional method, there is a high probability that important data patterns and correlations of data patterns will be overlooked and not identified.
One obvious disadvantage with the current method of determining data patterns and their correlations with other data patterns is that people are reviewing the data. Due to the ever-increasing quantity of data being monitored and available for analysis, there is an increasing need for data analysts, especially for qualified and experienced data analysts. The workforce, however, is diminishing, such that it is getting, and will continue to get, more difficult to find people to be data analysts. Therefore, there is a need to automate the review and analysis of data, comprised of multiple, unrelated data streams, so that one analyst can manage a large quantity of data.
A second disadvantage with the current method is that when multiple persons review multiple data streams, there is a high probability that they will miss a large number of correlated or concurrent events between the data streams. There is a loss of connectivity between the data streams. Therefore, there is a need for a way to correlate the events within one data stream to the events within another data stream whether the data streams occurred at the same time or at different time periods.
A third disadvantage with the current method is that current software technology is only available for the first step of the problem; that is, there are currently at least two methods for recognizing and classifying a data pattern in a single data stream. These methods are Fuzzy Adaptive Resonance Theory (Fuzzy ART) and Lead Clustering. Both methods are well-known and well-published in the relevant art. In summary, Fuzzy ART determines how closely two patterns match each other by calculating the closeness, or fuzziness, of the fit, e.g., two patterns are a seventy-five percent (75%) match. With Fuzzy ART, a user can set the acceptable value of fuzziness for determining a match. Thus, Fuzzy ART monitors a data stream for patterns and groups them together based on the percentage of similarity.
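By way of illustration only, the similarity-based grouping summarized above can be sketched as follows. This is a simplified sketch and not a complete Fuzzy ART implementation: it omits complement coding, the choice function, and the learning rate found in published descriptions. The names fuzzy_match, group_by_similarity, and vigilance are hypothetical, and patterns are assumed to be equal-length lists of values between 0 and 1.

```python
def fuzzy_match(pattern, prototype):
    """Return the fraction of `pattern` covered by `prototype`.

    The fuzzy AND of two patterns is taken as the element-wise minimum
    (an assumption of this sketch).
    """
    overlap = sum(min(p, w) for p, w in zip(pattern, prototype))
    total = sum(pattern)
    return overlap / total if total > 0 else 0.0


def group_by_similarity(patterns, vigilance=0.75):
    """Assign each pattern to the best-matching group, or start a new one.

    `vigilance` is the user-set acceptable match level (hypothetical name).
    """
    prototypes = []  # one stored prototype per group
    labels = []      # group index assigned to each input pattern
    for pattern in patterns:
        best_index, best_score = None, -1.0
        for index, prototype in enumerate(prototypes):
            score = fuzzy_match(pattern, prototype)
            if score >= vigilance and score > best_score:
                best_index, best_score = index, score
        if best_index is None:
            # No existing group is close enough: create a new group.
            prototypes.append(list(pattern))
            labels.append(len(prototypes) - 1)
        else:
            # Simplified update: shrink the prototype to the fuzzy AND.
            prototypes[best_index] = [
                min(p, w) for p, w in zip(pattern, prototypes[best_index])
            ]
            labels.append(best_index)
    return labels, prototypes
```

In this sketch, a pattern of four equal values compared against a prototype that shares three of them scores 0.75, which corresponds to the seventy-five percent match mentioned above.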
In contrast, Lead Clustering is based on distance measurements. When a first pattern is identified, it is given a point in N-dimensional space. A user then defines a radius around that point such that if a second pattern falls within the radius of the point corresponding to the first pattern, then the second pattern is a match for, or belongs to the same cluster as, the first pattern. If a point defining another pattern falls outside of the radius, then a new cluster or pattern is detected. As a cluster is defined by various points, this method can also stabilize the pattern associated with the cluster by moving the centroid of the circle according to the points defining the cluster. Thus, Lead Clustering monitors a data stream for patterns and groups them together based on distance.
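By way of illustration only, the distance-based grouping described above can be sketched as follows. The function name leader_cluster, the use of Euclidean distance, the choice of the nearest centroid when several are within range, and the running-mean centroid update are all assumptions of this sketch rather than details taken from any particular published method.

```python
import math


def leader_cluster(points, radius):
    """Group points by distance to cluster centroids.

    A point within `radius` of the nearest centroid joins that cluster and
    moves the centroid toward it; otherwise it starts a new cluster.
    """
    centroids = []  # current centroid of each cluster
    counts = []     # number of points assigned to each cluster
    labels = []     # cluster index assigned to each input point
    for point in points:
        nearest, nearest_dist = None, math.inf
        for index, centroid in enumerate(centroids):
            dist = math.dist(point, centroid)  # Euclidean distance (assumed)
            if dist < nearest_dist:
                nearest, nearest_dist = index, dist
        if nearest is not None and nearest_dist <= radius:
            # Running mean: shift the centroid toward the new point.
            counts[nearest] += 1
            n = counts[nearest]
            centroids[nearest] = [
                c + (x - c) / n for c, x in zip(centroids[nearest], point)
            ]
            labels.append(nearest)
        else:
            # Outside every radius: a new cluster is detected.
            centroids.append(list(point))
            counts.append(1)
            labels.append(len(centroids) - 1)
    return labels, centroids
```

For example, with a radius of 2, the points (0, 0) and (1, 0) would share a cluster whose centroid moves to (0.5, 0), while (10, 10) would start a second cluster.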
Both Fuzzy ART and Lead Clustering only work when analyzing a single data stream for known data patterns. These current methods cannot correlate identified data patterns over time. Therefore, there is a need for a system that identifies patterns in multiple data streams and correlates those data patterns over time, such that predictions can be made upon the occurrence of a known data pattern.