(1) Technical Field
The present invention relates to a recognition system and, more particularly, to a neural network with engineered delays for pattern storage and matching.
(2) Description of Related Art
Pattern storage and matching is a rapidly evolving field that allows large databases to store and match digital data. A goal of the present invention is to improve the speed of pattern matching in digital data. Rapid search is needed in large data sets, such as video and audio streams and internet traffic. For example, for intrusion detection in internet traffic, the state of the art is not fast enough to search for all known attack signatures at modern-day internet router speeds.
For exact pattern matching, previous approaches focused on finding a string in a text. If wildcards are not allowed, then the Boyer-Moore (BM) algorithm implemented on a standard serial computer is still the state of the art (see Literature Reference Nos. 1 and 2). String search algorithms find matches of query strings within a text or input stream. The naive approach is to align the whole query string with the text starting from the beginning of the text and match each character in the query string with the corresponding character in the text. Then, the query string is shifted by one character and the matching process is repeated. This approach will find all matches in the text. However, the computational complexity is O(kn), where k is the query size and n is the text size (both in number of characters).
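The naive approach described above can be sketched as follows. This is a minimal illustration only; the function name naive_search and its parameter names are assumptions chosen for this example and do not appear in the cited literature.

```python
def naive_search(text: str, query: str) -> list[int]:
    """Return every start index at which `query` occurs in `text`.

    Illustrative sketch of the naive shift-by-one search; the name and
    signature are assumptions made for this example.
    """
    n, k = len(text), len(query)
    matches = []
    if k == 0:
        return matches
    # Try every alignment of the query against the text.
    for shift in range(n - k + 1):
        j = 0
        # Compare character by character until a mismatch or a full match.
        while j < k and text[shift + j] == query[j]:
            j += 1
        if j == k:
            matches.append(shift)
    return matches
```

Each of the roughly n alignments may require up to k character comparisons, which is the source of the O(kn) bound.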
A more efficient approach is to shift the query string by k characters if a character is encountered that is absent in the query pattern, since any intermediate shifts are guaranteed to result in a mismatch with the query. This strategy is implemented in the BM algorithm (referenced above), which is still the gold standard for exact string matching without wildcards. The average computational complexity is O(n/k) if the alphabet is sufficiently large, and the worst-case computational complexity is O(n). However, the shift strategy fails if the query string contains wildcards.
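The shift strategy can be illustrated with the Boyer-Moore-Horspool simplification, which keeps only a character-based shift table and omits the additional rules of the full BM algorithm. The sketch below is therefore not the BM algorithm of the cited references; the name horspool_search and the table layout are assumptions made for this example.

```python
def horspool_search(text: str, query: str) -> list[int]:
    """Return every start index at which `query` occurs in `text`.

    Boyer-Moore-Horspool sketch (a simplification of BM, named here as an
    assumption for illustration): the shift depends only on the text
    character aligned with the last position of the query.
    """
    n, k = len(text), len(query)
    if k == 0 or k > n:
        return []
    # For each character of the query except the last, the distance from
    # its rightmost occurrence to the end of the query.
    shift = {c: k - 1 - i for i, c in enumerate(query[:-1])}
    matches = []
    pos = 0
    while pos <= n - k:
        if text[pos:pos + k] == query:
            matches.append(pos)
        # A character absent from the table yields the full shift of k.
        pos += shift.get(text[pos + k - 1], k)
    return matches
```

When the text character under the end of the query does not occur in the query, the window advances by k characters at once, which is what produces the sublinear average behavior on large alphabets. A wildcard position would match any character and so invalidates the table, consistent with the limitation noted above.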
An alternative is a finite state machine (see Literature Reference Nos. 3 and 4), which can deal with wildcards in the query string. Currently, the state of the art is the deterministic finite automaton (DFA), in particular the Aho-Corasick string matching algorithm (see Literature Reference No. 7), which is O(n). This algorithm has been the standard method for more than 30 years. Finite automata search for strings by transitioning between states; each transition is determined by the current input character. As preparation, a query string must be converted into a state machine, which can be time consuming. The Aho-Corasick algorithm extends the idea of finite automata to building a state machine that can search for several query patterns simultaneously. Theoretically, the speed is independent of pattern length and alphabet size (see Literature Reference No. 4). A disadvantage of a DFA is the additional cost of building the state-transition table, which lists the state transitions for each input character, in preparation for the search. A state-transition table must be computed for every stored pattern that is to be matched against an input stream.
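The preparation and search phases of a multi-pattern automaton can be sketched as follows. This is a compact illustration in the style of Aho-Corasick that uses goto and failure links rather than a fully expanded state-transition table, and it does not handle wildcards. The names build_automaton and search, and the list-of-dictionaries representation, are assumptions made for this example.

```python
from collections import deque


def build_automaton(patterns):
    """Preparation step: build goto links, failure links, and outputs."""
    goto = [{}]   # goto[state][char] -> next state (state 0 is the root)
    fail = [0]    # fail[state] -> state for the longest proper suffix
    out = [[]]    # out[state] -> patterns that end at this state
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    # Breadth-first pass to fill in failure links, shallow states first.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out


def search(text, patterns):
    """Scan `text` once and return (start_index, pattern) for every match."""
    goto, fail, out = build_automaton(patterns)
    s = 0
    hits = []
    for i, ch in enumerate(text):
        # Follow failure links until a transition on `ch` exists.
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits
```

For example, search("ushers", ["he", "she", "his", "hers"]) reports "she" at index 1 and both "he" and "hers" at index 2 in a single pass over the text. The cost of build_automaton is the preparation overhead discussed above.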
With respect to neural networks, the present invention employs a special case of time-delay neural networks (TDNN) (see Literature Reference No. 8). Conventional TDNNs are, however, conceptually different: instead of setting delays, a TDNN expands the weight matrix of neural connections to include connections from previous time steps. Another use of delayed input can be found in recurrent neural networks, for example, in the Elman network (see Literature Reference No. 9), which keeps a memory of previous hidden states.
In the context of recurrent networks, Izhikevich introduced the concept of polychronization (see Literature Reference No. 10). That is, time-shifted, rather than simultaneous, firing is critical for activating receiving neurons, because in real networks, connection delays are heterogeneous. Izhikevich demonstrated the phenomenon of polychronization in networks of spiking neurons that were described with several differential equations. Later, Paugam et al. demonstrated a supervised learning approach to classify temporal patterns using a polychronous network (see Literature Reference No. 11). For this classification, they learned the delays between a layer of recurrently connected neurons and an output layer. Most other work set delays a priori.
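The role of heterogeneous delays can be illustrated with a toy discrete-time example. This is not the differential-equation model of the cited work; the function names, the integer time steps, and the simple coincidence threshold are all assumptions made for this sketch.

```python
from collections import Counter


def arrival_times(spike_times, delays):
    """Time step at which each presynaptic spike reaches the receiver."""
    return [t + d for t, d in zip(spike_times, delays)]


def receiver_fires(spike_times, delays, threshold):
    """True if at least `threshold` spikes arrive in the same time step.

    Toy coincidence detector (an assumption for illustration), with one
    spike and one connection delay per presynaptic neuron.
    """
    counts = Counter(arrival_times(spike_times, delays))
    return max(counts.values(), default=0) >= threshold


# Three presynaptic neurons with connection delays of 3, 2, and 1 steps.
delays = [3, 2, 1]
time_shifted = receiver_fires([0, 1, 2], delays, threshold=3)   # all arrive at step 3
simultaneous = receiver_fires([0, 0, 0], delays, threshold=3)   # arrive at steps 3, 2, 1
```

With these delays, the time-shifted firing pattern arrives at the receiver in a single time step and activates it, whereas simultaneous firing is spread over three time steps and does not.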
All of the above neural models are computationally expensive. As a simpler alternative, Maier et al. introduced the “minimal model” (see Literature Reference No. 12), which could exhibit polychronous activity without the complications of integrating differential equations.
Thus, a continuing need exists for a neural network device with engineered delays for pattern storage and matching that is based on the neuron model from the minimal model.