1. Technical Field
The present disclosure relates to text pattern matching and more specifically to regular expressions with multi-strings and intervals.
2. Introduction
A regular expression specifies a set of strings formed by characters combined with concatenation, union (|), and Kleene star (*) operators. For example, (a|(ba))* specifies the set of strings of a's and b's in which every b is followed by an a. Given a regular expression R and a string Q, the regular expression matching problem is to decide whether Q matches any of the strings specified by R. This problem is a key primitive in a wide variety of software tools and applications. Standard UNIX tools such as grep and sed provide direct support for regular expression matching in files. Perl, Ruby, and Tcl are just a few of the many languages designed to easily support regular expression matching. In large scale data processing applications, such as internet traffic analysis, XML querying, and protein searching, regular expression matching is often the main computational bottleneck. Longer, more complex regular expressions typically require more processing, but may be necessary for solving real-world problems. Thus, the field of regular expression processing would benefit from any performance improvement that helps in analyzing strings for particular patterns, especially for lengthy, complex regular expressions.
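As an illustration of the matching problem stated above, the following minimal sketch decides whether a string Q belongs to the set specified by the example expression (a|(ba))*. It uses Python's standard re module purely for demonstration; the helper name matches and the sample strings are hypothetical and are not part of the disclosure, and the sketch does not reflect the matching technique the disclosure itself describes.

```python
import re

# Example expression from the text: zero or more repetitions of "a" or "ba".
# (Illustrative only; the name PATTERN is an assumption.)
PATTERN = re.compile(r"(a|(ba))*")


def matches(q: str) -> bool:
    """Return True if the entire string q is in the set specified by R."""
    # fullmatch requires the whole string to match, which corresponds to
    # the decision problem "does Q match any string specified by R".
    return PATTERN.fullmatch(q) is not None


if __name__ == "__main__":
    # Hypothetical sample inputs: every "b" must be followed by an "a".
    for q in ["", "a", "ba", "abaa", "b", "abb", "bab"]:
        print(repr(q), matches(q))
```

The empty string is accepted because the Kleene star permits zero repetitions, while a string such as "bab" is rejected because its final b is not followed by an a.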