The present invention relates generally to the field of Input/Output data processing, and more particularly to a device for matching regular expressions in input data, where the regular expression is represented by a finite-state machine.
In various applications, such as network intrusion detection and text analytics, it is necessary to process input data streams, for example text documents. In this context, regular expressions, also called regexes, can be used to define search patterns. A regular expression may include one or more subexpressions. Back-referencing is a regex-matching feature that increases the expressive power of regular expressions by making it possible to refer back to a captured subexpression group as part of the regex definition. As a result, unlike standard regexes, which define regular grammars, regexes with back-references can describe languages that are not regular, such as an arbitrary string followed by an identical copy of itself. However, software implementations of regex matching with back-references typically involve backtracking, and software-based backtracking implementations generally exhibit low performance.
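By way of illustration only, the following minimal sketch shows a back-reference in a software regex engine, here Python's standard `re` module. The pattern, the name `DOUBLED_WORD`, and the helper `find_doubled_words` are hypothetical and chosen for this example; they are not part of the device described herein.

```python
import re
from typing import List

# Hypothetical pattern: group 1 captures a word, and the back-reference \1
# must then match exactly the same text that group 1 captured.
DOUBLED_WORD = re.compile(r"\b(\w+) \1\b")


def find_doubled_words(text: str) -> List[str]:
    """Return each word that is immediately followed by an identical copy."""
    return [match.group(1) for match in DOUBLED_WORD.finditer(text)]


if __name__ == "__main__":
    print(find_doubled_words("this is is a test of the the matcher"))
```

The text that `\1` must match is not known until the capturing group has matched, so the set of accepted strings cannot be fixed in advance as a finite set of states. This is why such a pattern lies outside what a standard regex can express.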