1. Field of the Invention
The present invention relates to biological sequence matching. In particular, the present invention relates to systems and methods for scalable streaming processing of rigorous matching algorithms.
2. Description of Related Art
Biological sequence matching is a computational method for obtaining biological information. Sequence matching is used to determine whether a biological sequence belongs to a known family of sequences, such as DNA sequences and protein sequences. In biological sequence matching, a given sequence, such as a queried sequence, is compared with sequences usually taken from a database of known sequences, for example, GenBank®, LocusLink and/or UniGene at the National Center for Biotechnology Information at the National Institutes of Health.
There are two classes of algorithms in biological sequence matching: “rigorous matching algorithms,” which perform full combinatorial comparison, and “heuristic matching algorithms,” which reduce the number of comparisons using heuristic processes. Two examples of rigorous matching algorithms are the Hidden Markov Model (HMM) algorithm and the Smith-Waterman (SW) algorithm. Rigorous matching algorithms, unlike heuristic matching algorithms, cover all potential combinations, and thus may incur high computational cost and require large amounts of memory.
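By way of illustration only, the full combinatorial comparison performed by a rigorous matching algorithm may be sketched with a minimal Smith-Waterman local alignment score. The function name `smith_waterman_score`, the scoring values (match, mismatch, gap) and the linear gap penalty are assumptions chosen for this sketch and are not taken from the present disclosure. Every position of the queried sequence is compared with every position of the database sequence, so the running time grows with the product of the two sequence lengths.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between two sequences.

    Illustrative sketch only: the scoring values and the linear gap
    penalty are assumptions, not parameters from the disclosure.
    """
    # Only the previous row of the scoring matrix is kept, so memory is
    # proportional to len(target); time is still len(query) * len(target).
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]  # column 0 is always zero for local alignment
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap        # gap inserted in the target
            left = curr[j - 1] + gap  # gap inserted in the query
            cell = max(0, diag, up, left)  # scores never drop below zero
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

This sketch returns only the score. Recovering the alignment itself would require retaining the whole scoring matrix for traceback, which is one source of the large memory requirement noted above.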