The suffix tree is a data structure that is very useful for pattern matching as well as many other important string operations. It plays a key role in several types of applications that require complex string analysis and processing.
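As a minimal sketch of why indexing suffixes supports pattern matching, the following builds a naive, uncompressed suffix trie over a short binary string and answers substring queries by walking from the root. This is an illustration only: the class name `SuffixTrie` and its `contains` method are hypothetical, and the quadratic-time construction shown here is a stand-in for, not an implementation of, the linear-time suffix tree construction.

```python
class SuffixTrie:
    """Naive, uncompressed suffix trie (illustrative; not a linear-time suffix tree)."""

    def __init__(self, text: str) -> None:
        # Each node is a dict mapping a symbol to its child node.
        self.root: dict = {}
        # Insert every suffix text[i:] symbol by symbol.
        # This takes O(n^2) time and space in the worst case.
        for i in range(len(text)):
            node = self.root
            for ch in text[i:]:
                node = node.setdefault(ch, {})

    def contains(self, pattern: str) -> bool:
        """Return True if pattern occurs anywhere in the indexed text."""
        node = self.root
        for ch in pattern:
            if ch not in node:
                return False
            node = node[ch]
        return True


trie = SuffixTrie("0110100")
found = trie.contains("1010")    # "1010" starts at position 2 of the text
missing = trie.contains("111")   # "111" never occurs in the text
```

Once the index is built, each query costs time proportional to the pattern length, independent of the text length. That is the property that makes suffix-based indices attractive when the text is known in advance.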
While a suffix tree can be constructed in time linear in the length of its input string, it requires a large amount of memory to build and store the tree structure. If the input string is long and this memory requirement cannot be satisfied, the performance of searching the suffix tree degrades significantly. This space complexity is a major concern for applications that process large data strings. The concern grows when the data is in binary format, since the suffix tree must then process data at the bit level, which is more complicated and consumes more space and time.
To address the space complexity of suffix trees that represent binary strings, a number of solutions have been proposed that improve space efficiency through variant representations and implementations [1]. However, the improvements achieved by these solutions are limited.
Alternatively, compression techniques have been proposed to reduce the space requirement for storing the tree structure. However, these solutions often result in some loss of functionality of the suffix tree [2].
There have also been efforts to support binary pattern matching by grouping bits into bytes (for example, [3, 4]). These works, however, are based on modifications of the Boyer-Moore algorithm that match patterns on the fly; they do not take advantage of knowing the text in advance to build search indices.
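To make the notion of on-the-fly binary pattern matching concrete, the sketch below scans a byte sequence for a bit pattern at every bit offset, including offsets that cross byte boundaries. It is a naive scan under stated assumptions, not the Boyer-Moore variants of [3, 4]: the function name `find_bit_pattern` is hypothetical, the pattern is assumed to be given as a string of `'0'` and `'1'` characters, and bits are assumed to be numbered from the most significant bit of the first byte.

```python
def find_bit_pattern(data: bytes, pattern: str) -> int:
    """Return the first bit offset at which pattern occurs in data, or -1.

    pattern is a string of '0'/'1' characters; offset 0 is the most
    significant bit of data[0].
    """
    m = len(pattern)
    n = len(data) * 8
    if m == 0:
        return 0
    if m > n:
        return -1
    text = int.from_bytes(data, "big")  # whole text as one big integer
    pat = int(pattern, 2)               # pattern as an integer
    mask = (1 << m) - 1                 # keeps the low m bits
    # Slide an m-bit window over the text, one bit at a time.
    for offset in range(n - m + 1):
        shift = n - m - offset
        if (text >> shift) & mask == pat:
            return offset
    return -1


# The bits of b"\x0b\x60" are 00001011 01100000.
hit = find_bit_pattern(b"\x0b\x60", "10110110")  # match spans both bytes
miss = find_bit_pattern(b"\x0b\x60", "1111")
```

Every query here rescans the text from the start, which is the cost an index built in advance is meant to avoid.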