FFT methods can facilitate the determination of the optimal global alignment of two DNA sequences. For example, Felsenstein, Sawyer, and Kochin, in “An Efficient Method for Matching Nucleic Acid Sequences,” Nucleic Acids Research, Vol. 10, No. 1, pp. 133-139, incorporated herein by reference, describe a method of computing the fraction of matches between two nucleic acid sequences at all possible alignments. Benson, in “Fourier Methods for Biosequence Analysis,” Nucleic Acids Research, Vol. 18, No. 21, p. 6305, incorporated herein by reference, and in “Digital Signal Processing Methods for Biosequence Comparison,” Nucleic Acids Research, Vol. 18, No. 10, p. 3001, incorporated herein by reference, describes similar methods. Cheever, Overton, and Searls, in “Fast Fourier Transform-Based Correlation of DNA Sequences Using Complex Plane Encoding,” CABIOS, Vol. 7, No. 2, pp. 143-154, incorporated herein by reference, describe yet another variation on the use of FFT methods for the correlation of DNA sequences. These methods all encode a DNA sequence as four binary (0 or 1) vectors or functions, one vector or function for each of the four bases (A, C, G, or T).
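By way of illustration only, the following minimal Python sketch shows how the four-binary-vector coding described above can be combined with an FFT to count base matches between two sequences at every relative shift. It is not taken from any of the cited references. The function names (`fft`, `indicator`, `match_counts`), the shift convention, and the zero-padding to a power of two are assumptions made for this example. Dividing each count by the length of the overlap at that shift would give a fraction of matches.

```python
import cmath

BASES = "ACGT"

def fft(values, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def indicator(seq, base, size):
    """Binary vector: 1.0 where seq has `base`, zero-padded to length `size`."""
    vec = [1.0 if ch == base else 0.0 for ch in seq]
    return vec + [0.0] * (size - len(vec))

def match_counts(a, b):
    """Return {shift: matches}, where shift s aligns b[j] with a[j + s]."""
    # Pad so the circular correlation does not wrap one alignment onto another.
    size = 1
    while size < len(a) + len(b):
        size *= 2
    totals = [0.0] * size
    for base in BASES:
        fa = fft(indicator(a, base, size))
        fb = fft(indicator(b, base, size))
        # Cross-correlation theorem: multiply by the complex conjugate.
        prod = [x * y.conjugate() for x, y in zip(fa, fb)]
        corr = fft(prod, invert=True)
        for i in range(size):
            totals[i] += corr[i].real / size  # normalize the inverse transform
    # Negative shifts are stored at the end of the circular result.
    return {s: round(totals[s % size]) for s in range(-(len(b) - 1), len(a))}
```

For example, `match_counts("ACGTACGT", "ACGT")` reports four matches at shifts 0 and 4, the two positions at which the shorter sequence occurs exactly in the longer one. A direct comparison at every shift takes time proportional to the product of the sequence lengths, whereas the FFT-based computation takes time proportional to N log N for padded length N.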
Although FFT methods can facilitate the determination of the optimal global alignment of two DNA sequences, a need remains for an efficient system for detecting known blocks of functionally aligned amino acid sequences in a nucleic acid sequence, e.g., in an uncharacterized EST.