A pairwise global alignment problem was defined by S. B. Needleman and C. D. Wunsch, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,” Journal of Molecular Biology, 1970, volume 48, pages 443-453 (“Needleman and Wunsch”), which is hereby incorporated herein by reference in its entirety. Needleman and Wunsch provide a dynamic programming (DP) approach to the global alignment problem, as follows. Given two strings S1 and S2 representing respective molecular sequences, a global alignment A between S1 and S2 is obtained by first inserting chosen spaces (or dashes), either into or at the ends of strings S1 and S2, and then placing the two resulting strings one above the other so that every character or space in either string is opposite a unique character or a unique space in the other string. (Note that possible alignments include alignments in which no space is inserted in either string; alignments in which no space is inserted in one string and one space is inserted in the other string, or vice versa; and alignments in which one space is inserted in each string.)
An example is given with reference to Table 1 below. If S1=“cacdbd” and S2=“cabbdb”, one possible alignment A of S1 and S2 is given in Table 1, in which S1′ and S2′ denote the strings S1 and S2 after spaces have been inserted.
TABLE 1
S1′           c   a   c   —   d   b   d
S2′           c   a   b   b   d   b   —
Position i    1   2   3   4   5   6   7
The i-th character of S1′ or S2′ in Table 1 above is in position i of the alignment A. Of all the alignments that are possible between two strings S1 and S2, an alignment that maximizes the number of matches between characters in corresponding positions of the respective strings is referred to as an optimal alignment.
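By way of illustration only, the match count of an alignment such as that of Table 1 can be checked with a short Python sketch. The function names `count_matches` and `strip_spaces`, and the use of the ASCII character "-" to stand for an inserted space, are assumptions made for this example and do not appear in Needleman and Wunsch.

```python
GAP = "-"  # assumed symbol for an inserted space (hypothetical choice)


def strip_spaces(aligned: str) -> str:
    """Recover the original string by removing the inserted spaces."""
    return aligned.replace(GAP, "")


def count_matches(a1: str, a2: str) -> int:
    """Count positions i at which both aligned strings hold the same character.

    A position holding an inserted space in either string is never a match.
    """
    if len(a1) != len(a2):
        raise ValueError("aligned strings must have the same length")
    return sum(1 for x, y in zip(a1, a2) if x == y and x != GAP)


# The alignment A of Table 1, written with "-" for the inserted spaces.
s1_aligned = "cac-dbd"
s2_aligned = "cabbdb-"

print(strip_spaces(s1_aligned), strip_spaces(s2_aligned))
print(count_matches(s1_aligned, s2_aligned))
```

Under these assumptions, the alignment of Table 1 has matches at positions 1, 2, 5 and 6.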
The global alignment problem, as defined in Needleman and Wunsch, and paraphrased above, deals with the computation of all possible optimal alignments between two strings S1 and S2.
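A minimal Python sketch of a DP computation in the spirit of Needleman and Wunsch is given below for illustration. It assumes the simplest scoring consistent with the definition of an optimal alignment given above, namely a score of 1 for each match and 0 for each mismatch or space; practical scoring schemes typically also penalize mismatches and spaces. The function name `max_matches` is hypothetical, and the sketch returns only the optimal number of matches rather than enumerating all optimal alignments.

```python
def max_matches(s1: str, s2: str) -> int:
    """Return the largest number of matches over all global alignments.

    best[i][j] holds the optimal match count for aligning the first i
    characters of s1 with the first j characters of s2.
    """
    n, m = len(s1), len(s2)
    # Row 0 and column 0 stay at 0: aligning against an empty prefix
    # uses only spaces, which contribute no matches.
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                # s1[i-1] opposite s2[j-1]: counts 1 if they match
                best[i - 1][j - 1] + (1 if s1[i - 1] == s2[j - 1] else 0),
                # s1[i-1] opposite a space inserted into s2
                best[i - 1][j],
                # s2[j-1] opposite a space inserted into s1
                best[i][j - 1],
            )
    return best[n][m]


print(max_matches("cacdbd", "cabbdb"))
```

The table has (n+1)(m+1) entries, each filled in constant time, so the sketch runs in time proportional to the product of the two string lengths. Recovering the alignments themselves would additionally require tracing back through the table from the final entry.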
A variation of the global alignment problem is the k-difference global alignment problem. The k-difference global alignment problem deals with the computation of all optimal alignments between two strings S1 and S2, such that the number of mismatches in the reported alignment is at most k. The k-difference global alignment problem is described in section 12.2.3 of Gusfield, D., “Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology,” 1997, Cambridge University Press, which is hereby incorporated by reference in its entirety.
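For illustration, the bound k can be exploited by restricting the DP table to a diagonal band, as in the following Python sketch. The sketch assumes that a difference is either a mismatch or a character placed opposite a space, each counted once; this counting and the function name `differences_within_k` are assumptions of the example. It reports only the minimum number of differences when that number is at most k, and does not enumerate alignments.

```python
from typing import Optional


def differences_within_k(s1: str, s2: str, k: int) -> Optional[int]:
    """Return the minimum number of differences if it is at most k, else None.

    Only cells with |i - j| <= k are computed, since an alignment with at
    most k spaces cannot leave that band.
    """
    n, m = len(s1), len(s2)
    if abs(n - m) > k:
        return None  # at least |n - m| spaces are unavoidable
    inf = k + 1  # any value above k means "too many differences"
    # Row 0: the first j characters of s2 opposite j spaces.
    prev = [j if j <= k else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        if i <= k:
            cur[0] = i  # the first i characters of s1 opposite i spaces
        for j in range(max(1, i - k), min(m, i + k) + 1):
            mismatch = 0 if s1[i - 1] == s2[j - 1] else 1
            cur[j] = min(
                prev[j - 1] + mismatch,  # character opposite character
                prev[j] + 1,             # s1[i-1] opposite a space
                cur[j - 1] + 1,          # s2[j-1] opposite a space
                inf,                     # cap values outside the bound
            )
        prev = cur
    return prev[m] if prev[m] <= k else None


print(differences_within_k("cacdbd", "cabbdb", 3))
print(differences_within_k("cacdbd", "cabbdb", 2))
```

Because each row computes at most 2k+1 cells, the work is proportional to k times the string length rather than to the product of the two lengths.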
Global alignment criteria have application in various fields. One particular area concerns sequencing biological data. Existing computational techniques associated with alignment problems are not wholly adequate for such biological applications. A need consequently exists for improved computational techniques suitable for use in biological applications.