Dynamic Time Warping (DTW) is a well-known dynamic programming technique used for comparing and aligning sequences of data. Sequence alignment methods are described extensively in the literature, such as in a book by Sankoff and Kruskal entitled “Time Warps, String Edits and Macromolecules: the Theory and Practice of Sequence Comparison,” Addison-Wesley Publishing Company, 1983, which is incorporated herein by reference. The authors describe different sequence matching techniques, including DTW, and their applications in a variety of fields. Applications range from computer science and mathematics, through DNA sequence matching in molecular biology, to voice recognition and even the study of bird song.
The terms DTW and sequence alignment are often used interchangeably in the literature. Typically, DTW refers to applications involving continuous data, such as voice recognition. Methods for aligning discrete data sequences (in which the data elements are selected from a discrete alphabet), such as DNA sequences, are often referred to as Levenshtein or Smith-Waterman methods.
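By way of illustration only, the dynamic programming recurrence that underlies DTW for continuous data may be sketched as follows. The function name `dtw_distance` and the choice of absolute difference as the local cost are assumptions made for this sketch; they are not taken from the cited references, and practical implementations often add path constraints or other local cost measures.

```python
def dtw_distance(a, b):
    """Return the cumulative DTW cost between two numeric sequences.

    Assumption for this sketch: the local cost is |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]
```

In this sketch a sequence and a time-stretched copy of it, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost, which is the property that distinguishes warping from an element-by-element comparison.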
The matrix computation involved in the various sequence alignment methods is sometimes implemented using systolic arrays. For example, U.S. Pat. No. 5,757,959, whose disclosure is incorporated herein by reference, describes a method for handwriting matching using a linear systolic array processor. The processor calculates an edit distance between an electronic handwritten pattern and a stored string.
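For reference, an edit distance of the Levenshtein type between two strings may be computed in software as sketched below. This is a sequential sketch under assumed unit costs for insertion, deletion and substitution; it does not reproduce the systolic array processor or the cost model of the cited patent, and the names `edit_distance`, `pattern` and `stored` are hypothetical.

```python
def edit_distance(pattern, stored):
    """Return the Levenshtein distance between two sequences.

    Assumption for this sketch: insertion, deletion and substitution
    each cost 1.
    """
    # prev[j] = distance between the pattern prefix processed so far
    # and stored[:j]; initially the pattern prefix is empty.
    prev = list(range(len(stored) + 1))
    for i, p in enumerate(pattern, start=1):
        curr = [i]  # distance from pattern[:i] to the empty string
        for j, s in enumerate(stored, start=1):
            substitute = prev[j - 1] + (p != s)
            delete = prev[j] + 1
            insert = curr[j - 1] + 1
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]
```

Each matrix cell depends only on its left, upper and upper-left neighbors, so the cells along an anti-diagonal are mutually independent. That independence is what makes this kind of computation a natural fit for parallel hardware such as systolic arrays.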
Several methods for efficient sequence comparison are described in the patent literature. For example, U.S. Patent Application Publication 2004/0024536 A1, whose disclosure is incorporated herein by reference, describes a parallelization of the Smith-Waterman sequence alignment algorithm using SIMD (Single-Instruction, Multiple-Data) technology. U.S. Patent Application Publication 2004/0098203 A1, whose disclosure is also incorporated herein by reference, describes a method for biological sequence alignment and database search. According to the described method, an optimal un-gapped alignment score is computed for each diagonal in an alignment matrix. A heuristic method is then employed to estimate a gapped alignment score. The estimate is used to identify the most interesting 1% of the database sequences. These sequences are subsequently aligned with the query sequence using the Smith-Waterman method.
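The notion of an optimal un-gapped alignment score per diagonal may be illustrated by the following sketch. It is not the method of the cited publication: the function name `ungapped_diagonal_scores` and the simple scoring of +1 for a match and -1 for a mismatch are assumptions, whereas biological search tools typically use substitution matrices. The heuristic gapped-score estimate is not sketched here.

```python
def ungapped_diagonal_scores(query, target, match=1, mismatch=-1):
    """Map each diagonal d = j - i to its best un-gapped local score.

    Assumption for this sketch: fixed match/mismatch scores rather
    than a substitution matrix.
    """
    n, m = len(query), len(target)
    scores = {}
    for d in range(-(n - 1), m):
        i = max(0, -d)   # first row on this diagonal
        j = i + d        # matching column on this diagonal
        running = best = 0
        while i < n and j < m:
            step = match if query[i] == target[j] else mismatch
            # A local segment restarts whenever its score falls below zero.
            running = max(0, running + step)
            best = max(best, running)
            i += 1
            j += 1
        scores[d] = best
    return scores
```

Because no gaps are allowed, each diagonal is scored independently of the others, so the work per diagonal is a single linear pass over its cells.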
The methods described in the two patent application publications cited above have been implemented in a sequence database search tool called Paralign™, offered by Sencel™ Bioinformatics AS (Oslo, Norway).