Pattern matching may be defined as searching or scanning any type of data that can be stored or transmitted in digital form. A common type of pattern matching is searching for text in a file. Constructing a “machine” for pattern matching can be relatively intuitive. For example, given the pattern “abc”, we would examine every character in the file in turn, initially expecting an ‘a’. If an ‘a’ is found, we would examine the next character, expecting a ‘b’. If a ‘b’ is found, we would then expect a ‘c’. If, at any point, we do not find what we expect, we return to expecting an ‘a’.
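The scanning procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `scan_for_pattern` is hypothetical, and the sketch adds one detail the description leaves implicit, namely that on a mismatch the current character is re-tested against the first pattern character. That simple reset is sufficient for a pattern such as “abc”, whose first character does not recur within the pattern; it is not a general matcher.

```python
def scan_for_pattern(text, pattern="abc"):
    """Return the start index of every occurrence of `pattern` in `text`.

    Assumption: the first character of `pattern` does not recur inside it
    (true for "abc"), so a plain reset on mismatch loses no matches.
    """
    matches = []
    expected = 0  # index of the pattern character we expect next
    for i, ch in enumerate(text):
        if ch == pattern[expected]:
            expected += 1
            if expected == len(pattern):
                # Whole pattern seen; record where it started.
                matches.append(i - len(pattern) + 1)
                expected = 0
        else:
            # Return to expecting the first character, but re-test the
            # current character so that "aabc" is still matched.
            expected = 1 if ch == pattern[0] else 0
    return matches


print(scan_for_pattern("xxabcabxaabc"))  # [2, 9]
```

The single integer `expected` plays the role of the machine's current state: 0 means “expecting ‘a’”, 1 means “expecting ‘b’”, and so on.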
We have just begun to describe a finite state automaton, which generally comprises the following five components:
1. a finite alphabet (e.g. the ASCII characters);
2. a finite set of patterns (e.g. “abc”, . . . );
3. a finite set of states (e.g. one for each of ‘a’, ‘b’, and ‘c’); for each pattern, we may also define a final (“accepting”) state, which we enter upon having matched that pattern (e.g. “abc”);
4. one designated initial state; and
5. a move function that defines how the automaton changes state as it processes an input stream (as described above).
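As a rough illustration, the five components listed above can be written down as a small Python data structure. The names `Automaton`, `build_for_pattern`, and `run` are hypothetical, and the construction assumes a single pattern whose first character does not recur within it (as in “abc”); it is a sketch of the components, not a general multi-pattern construction.

```python
from dataclasses import dataclass


@dataclass
class Automaton:
    alphabet: frozenset   # 1. finite alphabet
    patterns: tuple       # 2. finite set of patterns
    states: frozenset     # 3. finite set of states
    accepting: dict       #    accepting state -> pattern it recognizes
    initial: int          # 4. designated initial state
    moves: dict           # 5. move function as a table: (state, symbol) -> state

    def move(self, state, symbol):
        # Any input not listed in the table sends the automaton
        # back to the initial state.
        return self.moves.get((state, symbol), self.initial)


def build_for_pattern(pattern):
    """Build an automaton for one pattern (assumes no recurring first character)."""
    n = len(pattern)
    moves = {(s, pattern[s]): s + 1 for s in range(n)}
    for s in range(n + 1):
        # Seeing the first character anywhere restarts a potential match.
        moves.setdefault((s, pattern[0]), 1)
    return Automaton(
        alphabet=frozenset(chr(c) for c in range(128)),  # ASCII
        patterns=(pattern,),
        states=frozenset(range(n + 1)),
        accepting={n: pattern},
        initial=0,
        moves=moves,
    )


def run(fsa, text):
    """Feed `text` through the automaton; report (end_index, pattern) per match."""
    state, found = fsa.initial, []
    for i, symbol in enumerate(text):
        state = fsa.move(state, symbol)
        if state in fsa.accepting:
            found.append((i, fsa.accepting[state]))
    return found


print(run(build_for_pattern("abc"), "zabcabc"))  # [(3, 'abc'), (6, 'abc')]
```

Here state 0 is the initial state, states 1 to 3 correspond to having seen ‘a’, ‘ab’, and ‘abc’, and state 3 is the accepting state for “abc”.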
Parallelism in Pattern Matching
The notion of parallelism in pattern matching concerns subpatterns, in particular subpatterns in which one or more consecutive elements of one pattern, starting with its first element, occur (in sequence) in a second pattern. There are typically two ways in which this can manifest:
Case 1. The first N elements of pattern1 are also the first N elements of pattern2 (N>=1). For example, “air” and “airplane”.
Case 2. The subpattern consisting of the first N elements of pattern1 appears in pattern2, but the occurrence does not include the first element of pattern2. For example, “eel” and “feeler”.
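The two cases can be made concrete with a short Python sketch. The helper names `common_prefix_len` and `embedded_prefix` are hypothetical and are introduced only to illustrate how each kind of overlap between two patterns might be detected.

```python
def common_prefix_len(pattern1, pattern2):
    """Case 1: number of leading elements the two patterns share."""
    n = 0
    for a, b in zip(pattern1, pattern2):
        if a != b:
            break
        n += 1
    return n


def embedded_prefix(pattern1, pattern2):
    """Case 2: longest prefix of pattern1 found in pattern2 at offset >= 1.

    Returns (N, offset), or (0, -1) if no such prefix exists.
    """
    for n in range(len(pattern1), 0, -1):
        # Start the search at index 1 so the occurrence excludes
        # the first element of pattern2.
        offset = pattern2.find(pattern1[:n], 1)
        if offset != -1:
            return n, offset
    return 0, -1


print(common_prefix_len("air", "airplane"))  # 3
print(embedded_prefix("eel", "feeler"))      # (3, 1)
```

In both examples N is 3: “air” is the leading part of “airplane”, while “eel” sits inside “feeler” starting one element in.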
Finite state automata (FSAs, or state machines) are typically represented as directed graphs (also called state transition diagrams). Such a diagram preferably has a root node, which represents the initial state, and edges (or transitions) that connect the nodes and are labeled with the input that triggers each transition.
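One simple way to hold such a directed graph in memory, offered here only as a sketch, is a list of dictionaries in which each node maps an input symbol to the node it leads to. The function name `build_graph` and the node numbering are assumptions made for illustration; only the forward (matching) edges are built, with no failure transitions.

```python
def build_graph(patterns):
    """Build a state transition graph for `patterns`.

    edges[node] maps an input symbol to the target node; node 0 is the root
    (the initial state). accepting maps a node to the pattern matched there.
    """
    edges = [{}]
    accepting = {}
    for pattern in patterns:
        node = 0
        for symbol in pattern:
            if symbol not in edges[node]:
                edges.append({})                      # create a new node
                edges[node][symbol] = len(edges) - 1  # labeled edge to it
            node = edges[node][symbol]
        accepting[node] = pattern
    return edges, accepting


edges, accepting = build_graph(["air", "airplane"])
print(len(edges))   # 9
print(accepting)    # {3: 'air', 8: 'airplane'}
```

Because “air” and “airplane” begin with the same three elements, they share the path root → a → i → r, so the graph needs nine nodes rather than twelve.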
An existing pattern matching algorithm is that developed by Aho & Corasick and later improved upon by Commentz-Walter. The Commentz-Walter method is commonly known as fgrep. Fgrep uses hashing to skip over areas in the text where no matches are possible. All commonly implemented methods of pattern matching use either the original Aho & Corasick implementation of the finite state automaton or the fgrep method of partial FSA implementation.
There is a need, however, to simplify the FSA so that it is as fast as hashing or other skipping methods in regions without matches, yet faster than Aho & Corasick where matches are found.