Identifying genome sequence variations between individuals is valued because it has the potential to explain phenotypes, disease predisposition, response to disease treatments and mechanisms of disease, which may lead to more effective drug development. Sequencing the whole genomes of individuals is still an expensive and time-consuming procedure, although recent technologies based on sequencing-by-synthesis have made great progress. More economical, efficient and error-free methods of identifying sequence variation are needed.
The activity of mismatch-specific endonucleases has found utility in the detection of sequence variations in otherwise identical DNA strands. A heteroduplex must first be produced between one reference DNA strand and the complement of the sample strand. If the sequences are exactly complementary with no mismatched bases, no cleavage takes place. If, however, they are not exactly complementary and mismatched bases are present, cleavage takes place with high specificity at the sites of mismatch. The cleaved products are conventionally detected by separation technologies based on fragment size. Thus, not only is the presence of mismatches revealed, but the approximate location of the mismatch can also be inferred. For final identification of the exact nature of the difference between the sequences, Sanger sequencing is conventionally employed. Mismatch-specific endonuclease activity can therefore be used in an effective screening method capable of discriminating between samples which do and do not require additional sequence analysis. Such a screening process can reduce the number of samples that need to be fully sequenced, saving time and money and avoiding the processing, analysis and storage of excessive quantities of data.
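By way of illustration only, the screening logic described above can be sketched in a few lines of Python. The function names (mismatch_positions, predicted_fragments, needs_sequencing) are hypothetical and are not taken from any library or from the disclosure. The sketch assumes single-base substitutions only, sequences of equal length, complete digestion, and a cut modeled immediately after each mismatched base on one strand. A real digest would generally be partial and would also act on insertions and deletions.

```python
def mismatch_positions(reference: str, sample: str) -> list[int]:
    """Return 0-based positions at which two equal-length sequences differ.

    Assumption: substitutions only, so the two sequences align base for base.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch handles substitutions only; lengths must match")
    pairs = zip(reference.upper(), sample.upper())
    return [i for i, (r, s) in enumerate(pairs) if r != s]


def predicted_fragments(length: int, mismatches: list[int]) -> list[int]:
    """Return idealized fragment sizes for a duplex of the given length.

    Assumption: every mismatch is cut, and the cut falls directly after the
    mismatched base. A mismatch at the final base yields no additional fragment.
    """
    cuts = [p + 1 for p in sorted(mismatches) if p + 1 < length]
    bounds = [0] + cuts + [length]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def needs_sequencing(reference: str, sample: str) -> bool:
    """Flag a sample for follow-up sequencing if any mismatch is present."""
    return bool(mismatch_positions(reference, sample))


if __name__ == "__main__":
    ref = "ACGTACGTAC"
    smp = "ACGTTCGTAC"  # differs from ref at position 4
    sites = mismatch_positions(ref, smp)
    print(sites, predicted_fragments(len(ref), sites), needs_sequencing(ref, smp))
```

In this simplified model a sample that matches the reference yields a single full-length fragment and is not flagged, whereas a sample with one substitution yields two fragments whose sizes indicate the approximate position of the mismatch but not the identity of the substituted base.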
CEL I and CEL II DNA endonucleases are examples of endonucleases which are known to cut double-stranded DNA in both strands at sites of single-base substitution, insertion or deletion. These enzymes cleave DNA on the 3′-side of the mismatch site, generating single-stranded 3′-overhangs of one or more nucleotides (Oleykowski, et al. (1998) Nucleic Acids Res. 26:4597-4602; Yang, et al. (2000) Biochemistry 39:3533-3541; Sokurenko, et al. (2001) Nucleic Acids Res. 29:e111; Qiu, et al. (2004) BioTechniques 37:702-707). These enzymes are routinely used to detect and map the location of unknown mutations in PCR-amplified DNA fragments (see, e.g., Kulinski, et al. (2000) BioTechniques 29:44-48; Colbert, et al. (2001) Plant Physiol. 126:480-484; Till, et al. (2003) Genome Res. 13:524-530; Greene, et al. (2003) Genetics 164:731-740; Slade, et al. (2005) Nat. Biotechnol. 23:75-81). Detection and mapping involve PCR amplification of target DNA, annealing of the amplified DNA to form a mixture of homoduplexes and heteroduplexes, digestion with a mismatch-specific endonuclease, and fractionation of the undigested homoduplexes and heteroduplexes from the digested products on a platform that separates DNA fragments based upon size. Endonuclease mutation detection and mapping, however, do not reveal the identity of a mutation, which requires DNA sequencing. Therefore, there is a need in the art to simplify mutation detection and concurrently determine the nature of the variation. The present invention meets this need in the art.