The sequence-specific cleavage of double-helical deoxyribonucleic acid (hereafter "DNA") by naturally occurring restriction endonucleases is essential for many techniques in molecular biology, including gene isolation, DNA sequence determination, chromosome analysis, and recombinant DNA manipulations. Other applications include the use of such endonucleases as diagnostic reagents to detect aberrant DNA sequences.
The usefulness of DNA cleavage by these naturally occurring restriction enzymes is limited. The binding site sizes of naturally occurring restriction enzymes are typically in the range of four to eight base pairs, and hence their sequence specificities may be inadequate for mapping genomes (10.sup.5 -10.sup.7 base pairs) over very large distances. For unique recognition of DNA in the 10.sup.5 -10.sup.7 base pair range, sequence specificities at the 8-15 base pair level must be obtained. In addition, there are a limited number of known restriction endonucleases. Thus, they cannot be used to specifically recognize a particular piece of DNA (or RNA) unless that piece of DNA contains the specific nucleic acid sequence recognized by a particular endonuclease. With the advent of pulsed field gel electrophoresis, separation of large (up to at least one million base pair) pieces of DNA is now possible. The design and synthesis of molecules that are capable of recognizing a specific sequence in double-stranded nucleic acids not otherwise detectable by natural restriction enzymes is clearly desirable, since such molecules would be valuable tools for further research, diagnostics, and therapeutics.
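The relationship stated above between recognition-site length and genome size can be illustrated with a rough calculation. The following is a minimal sketch and not part of the cited work. It assumes a random sequence with equal base frequencies, a single fixed (non-degenerate) recognition site, and a search on one strand only; the function names are hypothetical.

```python
def expected_site_count(genome_length_bp: int, site_length_bp: int) -> float:
    """Expected occurrences of one specific site in a random sequence.

    Assumption: each base is A, C, G or T with probability 1/4, so a given
    site of length n matches at any one position with probability 4**-n.
    """
    if site_length_bp < 1 or genome_length_bp < site_length_bp:
        return 0.0
    # Number of positions at which a site of this length can start.
    positions = genome_length_bp - site_length_bp + 1
    return positions / 4 ** site_length_bp


def min_unique_site_length(genome_length_bp: int) -> int:
    """Smallest site length whose expected count falls below one."""
    n = 1
    while expected_site_count(genome_length_bp, n) >= 1.0:
        n += 1
    return n


if __name__ == "__main__":
    for genome in (10**5, 10**6, 10**7):
        print(genome, min_unique_site_length(genome))
```

Under these simplifying assumptions, the sketch gives a minimum site length of 9 base pairs for a 10.sup.5 base pair sequence and 12 base pairs for a 10.sup.7 base pair sequence, which falls within the 8-15 base pair range discussed above. A four to eight base pair site, by contrast, is expected to occur many times in sequences of this size.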
Synthetic sequence-specific binding moieties for double-helical DNA that have been studied are typically coupled analogs of natural products (Dervan, P. B., Science 232:464 (1986)), transition metal complexes (Barton, J. K., Science 233:727 (1986)), and peptide fragments derived from DNA binding proteins (Sluka, J. et al., Science, in press). Additionally, methidiumpropyl-EDTA (hereafter "MPE"), which contains the metal chelator ethylenediaminetetraacetic acid ("EDTA") attached to the DNA intercalator methidium, has been shown to cleave double-helical DNA efficiently in a reaction dependent on ferrous iron (Fe(II)) and dioxygen (O.sub.2). This cleavage is thought to occur through binding in the minor groove of the right-handed DNA helix. Addition of reducing agents such as dithiothreitol (hereafter "DTT") increases the efficiency of DNA cleavage, as reported by Hertzberg and Dervan, J. Am. Chem. Soc. 104:313-315 (1982); and Hertzberg and Dervan, Biochemistry 23:3934 (1984). MPE-Fe(II) cleaves DNA in a relatively non-sequence-specific manner, with significantly lower sequence specificity than the enzyme DNase I, and is therefore useful in experiments to identify binding locations of small molecules such as antibiotics, other drugs, and proteins on DNA (Hertzberg and Dervan, Biochemistry, supra).
The most sequence-specific molecule characterized so far by the natural product analog approach is bis(EDTA-distamycin)fumaramide, which binds in the minor groove and cleaves at sites containing nine contiguous A.T base pairs (Youngquist and Dervan, J. Am. Chem. Soc. 107:5528 (1985)). A synthetic peptide containing 32 residues from the DNA binding domain of Hin protein with EDTA at the amino-terminus binds and cleaves at the 13 bp Hin site (Bruist, et al., Science 235:777 (1987); Sluka, et al., supra). Another known DNA cleaving function involves the attachment of a DNA-cleaving moiety, such as an ethylenediaminetetraacetic acid-iron complex (hereafter "EDTA-Fe(II)"), to a DNA binding molecule. The moiety cleaves the DNA backbone by oxidation of the deoxyribose with a short-lived diffusible hydroxyl radical (Hertzberg and Dervan, Biochemistry, supra). The fact that the hydroxyl radical is a relatively non-specific cleaving species is useful when studying recognition, because the cleavage specificity is due to the binding moiety alone, not some combination of cleavage specificity superimposed on binding specificity.
Despite this progress, the current understanding of molecular recognition of DNA is still sufficiently primitive that elucidation of the chemical principles involved in creating sequence specificity at the .gtoreq.15 base pair level has developed slowly in comparison to the interest in the field for mapping large genomes.
Recognition of single-stranded nucleic acids by nucleic acid hybridization probes consisting of sequences of DNA or RNA is well known in the art. Typically, to construct a DNA hybridization probe, selected target DNA is obtained as a single strand, and copies of a portion of the strand are synthesized in the laboratory and labeled using radioactive isotopes, fluorescing molecules, photolytic dyes or enzymes that react with a substrate to produce a color change. When exposed to complementary strands of target DNA, the labeled DNA probe binds to (hybridizes with) its complementary single-stranded DNA sequence. The label on the probe is then detected and the DNA of interest is thus located. Probes may similarly be used to target RNA sequences. DNA probes are currently well known in the art for locating and selecting genes of known sequence, and in the diagnosis and chemotherapy of genetic disorders and diseases.
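The hybridization step described above depends on Watson-Crick complementarity between the probe and an antiparallel stretch of the single-stranded target. The following is a minimal sketch of that relationship, not a description of any laboratory protocol. It assumes exact matching only (no mismatches, no melting-temperature or salt effects) and DNA sequences written 5' to 3'; the function names are hypothetical.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with seq, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_binding_sites(target: str, probe: str) -> list[int]:
    """Return 0-based start positions on the target where the probe can
    base-pair along its full length (exact complementarity assumed)."""
    site = reverse_complement(probe)
    target = target.upper()
    hits = []
    start = target.find(site)
    while start != -1:
        hits.append(start)
        # Continue one base further so overlapping sites are also found.
        start = target.find(site, start + 1)
    return hits


if __name__ == "__main__":
    print(probe_binding_sites("GGATCCAAGCTTGG", "TTGGA"))
```

In this sketch the probe TTGGA pairs with the target segment TCCAA, so the label carried by the probe would report the position of that segment on the target strand.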
Oligonucleotides (polynucleotides containing between 10 and 50 bases) equipped with a DNA cleaving moiety have been described which produce sequence-specific cleavage of single-stranded DNA. Examples include oligonucleotide-EDTA-Fe hybridization probes ("DNA-EDTA"), which cleave the complementary single-stranded sequence (Dreyer and Dervan, Proc. Natl. Acad. Sci. USA 82:968 (1985); Chu and Orgel, Proc. Natl. Acad. Sci. USA 82:963 (1985)). Such probes are disclosed in U.S. Pat. No. 4,795,700.
In addition to double- and single-stranded configurations, it is also well known in the art that triplexes of nucleic acids naturally exist (Howard, et al., Biochem. Biophys. Res. Commun. 17:93 (1964)). Poly(U) and poly(A) were found to form a stable 2:1 complex in the presence of MgCl.sub.2. Subsequently, several triple-stranded structures were discovered (Michelson, et al., Prog. Nucl. Acid Res. Mol. Biol. 6:83 (1967); Felsenfeld and Miles, Annu. Rev. Biochem. 36:407 (1967)). Poly(C) forms a triple-stranded complex at pH 6.2 with guanine oligoribonucleotides. One of the pyrimidine strands is apparently in the protonated form (Howard, et al., supra). In principle, isomorphous base triplets (T-A-T and C-G-C.sup.+) can be formed between any homopyrimidine-homopurine duplex and a corresponding homopyrimidine strand (Miller and Sobell, Proc. Natl. Acad. Sci. USA 55:1201 (1966); Morgan and Wells, J. Mol. Biol. 37:63 (1968); Lee, et al., Nucl. Acids Res. 6:3073 (1979)). The DNA duplex poly(dT-dC)-poly(dG-dA) associates with poly(U-C) or poly(dT-dC) below pH 6 in the presence of MgCl.sub.2 to afford a triple-stranded complex. Several investigators have proposed an anti-parallel orientation of the two polypyrimidine strands based on an anti conformation of the bases, ibid. X-ray diffraction patterns of triple-stranded fibers (poly(A)-2poly(U) and poly(dA)-2poly(dT)) supported this hypothesis (Arnott and Bond, Nature New Biology 244:99 (1973); Arnott and Selsing, J. Mol. Biol. 85:509 (1974); and Arnott, et al., Nucl. Acids Res. 3:2459 (1976)), and suggested an A'-RNA-like conformation of the two Watson-Crick base paired strands with the third strand in the same conformation, bound parallel to the homopurine strand of the duplex by Hoogsteen hydrogen bonds (Hoogsteen, Acta Cryst. 12:822 (1959)).
The twelve-fold helix with dislocation of the axis by almost three angstroms, the C3'-endo sugar puckering and the small base tilts result in a large and deep major groove that is capable of accommodating the third strand (Saenger, Principles of Nucleic Acid Structure, edited by C. R. Cantor, Springer-Verlag New York, Inc. (1984)).
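The base-triplet rule summarized above (T-A-T and C-G-C.sup.+ triplets, with the third strand bound parallel to the homopurine strand) can be expressed as a simple sequence mapping. The following is a minimal sketch under that rule only. It ignores pH, cation, and stability effects, and the names `HOOGSTEEN_THIRD_BASE` and `third_strand_for` are hypothetical.

```python
# Third-strand base that pairs with each purine of the duplex:
# T recognizes an A.T pair (T-A-T); protonated C recognizes a G.C pair (C-G-C+).
HOOGSTEEN_THIRD_BASE = {"A": "T", "G": "C"}


def third_strand_for(purine_strand: str) -> str:
    """Return the homopyrimidine third strand for a homopurine strand.

    Both sequences are written 5' to 3'. Because the third strand lies
    parallel to the purine strand, the order of bases is NOT reversed.
    """
    seq = purine_strand.upper()
    invalid = set(seq) - set(HOOGSTEEN_THIRD_BASE)
    if invalid:
        # Only homopurine tracts are covered by the T-A-T / C-G-C+ rule.
        raise ValueError(f"not a homopurine tract: {sorted(invalid)}")
    return "".join(HOOGSTEEN_THIRD_BASE[base] for base in seq)


if __name__ == "__main__":
    print(third_strand_for("AAGGA"))
```

For the purine tract AAGGA the sketch returns TTCCT, read in the same direction as the purine strand. The Watson-Crick partner of the same tract, by contrast, runs antiparallel to it. Each C in the third strand stands for a protonated cytosine, consistent with the acidic conditions reported above for triplex formation.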
Although triple-stranded structures of polynucleotides were discovered decades ago, their biological significance has remained obscure. Such triplexes were proposed to be involved in processes such as regulation of gene expression, maintenance of folded chromosome conformations, chromosome condensation during mitosis, and induction of local conformational changes in B-DNA (Morgan, Trends Biochem. Sci. 4:N244 (1979); Hopkins, Comments Mol. Cell Biophys. 2:133 (1984); Minton, J. Path. 2:135 (1985)).
The above-described methods for sequence-specific DNA recognition and cleavage have been limited to single-stranded DNA hybridization probes, to natural or synthetic restriction endonucleases, and to molecules that recognize sequences of DNA directly, such as antibiotics, and DNA intercalators such as methidium.