The size of genomic sequence databases is currently growing at an exponential rate. Today, a typical genomic (DNA) database contains the sequence for billions of nucleotides. As a result, searching a genomic sequence database to find a sequence that correlates with a query sequence is often extremely computationally intensive.
Various sequence similarity searching algorithms are currently used to identify DNA and amino acid sequences within a genomic sequence database that correlate with a query sequence. Generally speaking, sequence searching algorithms may be classified as one of two types, namely global comparison methods and local comparison methods. Global comparison methods search a database sequence for the occurrence of an entire query sequence. Although such methods have a high degree of accuracy, they tend to be extremely slow. Local comparison methods, such as NCBI-Blast and FastA, are faster than global comparison methods. Rather than matching the entire query sequence, local comparison methods identify similar subsequences based on matching k-tuples, that is, short runs of k consecutive nucleotides or amino acids.
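By way of illustration only, the k-tuple matching step described above may be sketched as follows. The sketch assumes exact k-tuple matches and short in-memory strings; the function names `index_ktuples` and `find_seed_hits` and the default k = 3 are hypothetical and are not taken from NCBI-Blast or FastA, which use more elaborate word lists and scoring.

```python
from collections import defaultdict


def index_ktuples(sequence, k):
    """Map every k-tuple in `sequence` to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        index[sequence[i:i + k]].append(i)
    return index


def find_seed_hits(query, database, k=3):
    """Return (query_pos, database_pos) pairs where an identical k-tuple occurs.

    Assumption: exact matching only. Real tools also accept similar,
    non-identical k-tuples, particularly for amino acid sequences.
    """
    index = index_ktuples(database, k)
    hits = []
    for q in range(len(query) - k + 1):
        for d in index.get(query[q:q + k], []):
            hits.append((q, d))
    return hits


if __name__ == "__main__":
    # Each hit marks a candidate region for more detailed evaluation.
    print(find_seed_hits("ACGT", "TTACGTT", k=3))
```

Each reported pair is only a candidate location; a local comparison method would then evaluate the region around the hit in more detail.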
Local comparison methods typically employ dynamic programming evaluations to find an optimal solution to a given search. Not surprisingly, the most significant contributor to the search time of such local comparison algorithms is the dynamic programming evaluations themselves. Indeed, dynamic programming evaluations account for about 76% of the processing time of a typical NCBI-Blast search. Thus, although local comparison methods may be faster than global comparison methods, such methods are still quite computationally intensive.
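To illustrate why dynamic programming evaluations dominate the search time, a minimal sketch of a local alignment score computation in the style of the Smith-Waterman algorithm is given below. It is not the evaluation used by NCBI-Blast or FastA; the function name `local_alignment_score`, the linear gap penalty, and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Assumptions: linear gap penalty and illustrative scoring values.
    Only the previous row of the score matrix is kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the diagonal with a match or a mismatch.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(local_alignment_score("TTACGTT", "ACG"))
```

Every cell of the len(a) by len(b) matrix is evaluated once, so the work grows with the product of the two sequence lengths, which is consistent with the large share of processing time noted above.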
A local comparison method is therefore sought that is less computationally intensive than methods such as NCBI-Blast and FastA. An apparatus is also desired that performs a local comparison algorithm in less time than current searches employing local comparison methods require.