The sequence-specific cleavage of double-helical deoxyribonucleic acid (hereafter “DNA”) by naturally occurring restriction endonucleases is essential for many techniques in molecular biology, including gene isolation, DNA sequence determination, chromosome analysis, and recombinant DNA manipulations. Other applications include the use of such endonucleases as diagnostic reagents to detect aberrant DNA sequences.
The usefulness of DNA cleavage by these naturally occurring restriction enzymes is limited. The binding sites of naturally occurring restriction enzymes are typically four to eight base pairs long, and hence their sequence specificities may be inadequate for mapping genomes (10⁵-10⁷ base pairs) over very large distances. For unique recognition of DNA in the 10⁵-10⁷ base pair range, sequence specificities at the 8-15 base pair level must be obtained. In addition, only a limited number of restriction endonucleases are known. Thus, they cannot be used to specifically recognize a particular piece of DNA (or RNA) unless that piece contains the specific nucleic acid sequence recognized by a particular endonuclease. With the advent of pulsed field gel electrophoresis, separation of large (up to at least one million base pair) pieces of DNA is now possible. The design and synthesis of molecules capable of recognizing a specific sequence in double-stranded nucleic acids not otherwise detectable by natural restriction enzymes is therefore clearly desirable; such molecules would be valuable tools for further research, diagnostics, and therapeutics.
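The relationship between recognition-site length and genome size discussed above can be illustrated with a short calculation. The following is a minimal sketch, not part of the original disclosure; it assumes a random sequence in which the four bases are equally likely and independent, and it ignores the second strand and palindromic sites. The function names are hypothetical.

```python
def expected_site_count(genome_bp: float, site_bp: int) -> float:
    """Expected number of occurrences of one specific site of length
    site_bp in a random sequence of genome_bp base pairs.

    Assumption: four equally likely, independent bases, so a given
    site occurs on average once every 4**site_bp base pairs.
    """
    return genome_bp / 4 ** site_bp


def min_site_length(genome_bp: float) -> int:
    """Smallest site length whose expected count is at most one."""
    n = 1
    while expected_site_count(genome_bp, n) > 1:
        n += 1
    return n


if __name__ == "__main__":
    for genome in (1e5, 1e6, 1e7):
        print(f"{genome:.0e} bp -> site of at least {min_site_length(genome)} bp")
```

Under these simplifying assumptions, a six base pair site recurs on average every 4,096 base pairs, whereas the site lengths returned for genomes of 10⁵ to 10⁷ base pairs fall within the 8-15 base pair range stated above.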
Synthetic sequence-specific binding moieties for double-helical DNA that have been studied are typically coupled analogs of natural products (Dervan, P. D., Science 232:464 (1986)), transition metal complexes (Barton, J. K., Science 233:727 (1986)), and peptide fragments derived from DNA binding proteins (Sluka, J. et al., Science, in press). Additionally, methidiumpropyl-EDTA (hereafter “MPE”), which contains the metal chelator ethylenediaminetetraacetic acid (“EDTA”) attached to the DNA intercalator methidium, has been shown to cleave double-helical DNA efficiently in a reaction dependent on ferrous iron (Fe(II)) and dioxygen (O₂). This cleavage is thought to occur through binding in the minor groove of the right-handed DNA helix. Addition of reducing agents such as dithiothreitol (hereafter “DTT”) increases the efficiency of DNA cleavage, as reported by Hertzberg and Dervan, J. Am. Chem. Soc. 104:313-315 (1982); and Hertzberg and Dervan, Biochemistry 23:3934 (1984). MPE-Fe(II) cleaves DNA in a relatively non-sequence-specific manner, with significantly lower sequence specificity than the enzyme DNase I, and is therefore useful in experiments to identify the binding locations on DNA of small molecules, such as antibiotics and other drugs, and of proteins (Hertzberg and Dervan, Biochemistry, supra).
The most sequence-specific molecule characterized so far by the natural product analog approach is bis(EDTA-distamycin)fumaramide, which binds in the minor groove and cleaves at sites containing nine contiguous A·T base pairs (Youngquist and Dervan, J. Am. Chem. Soc. 107:5528 (1985)). A synthetic peptide containing 32 residues from the DNA binding domain of Hin protein, with EDTA at the amino terminus, binds and cleaves at the 13 bp Hin site (Bruist, et al., Science 235:777 (1987); Sluka, et al., supra). Another known DNA cleaving function involves the attachment of a DNA-cleaving moiety, such as an ethylenediaminetetraacetic acid-iron complex (hereafter “EDTA-Fe(II)”), to a DNA binding molecule; the moiety cleaves the DNA backbone by oxidation of the deoxyribose with a short-lived diffusible hydroxyl radical (Hertzberg and Dervan, Biochemistry, supra). The fact that the hydroxyl radical is a relatively non-specific cleaving species is useful when studying recognition, because the cleavage specificity is then due to the binding moiety alone, not to some combination of cleavage specificity superimposed on binding specificity.
Despite this progress, the current understanding of molecular recognition of DNA is still sufficiently primitive that the elucidation of the chemical principles involved in creating sequence-recognition specificity at the ≥15 base pair level has been slow to develop relative to the interest in mapping large genomes.
Recognition of single-stranded nucleic acids by nucleic acid hybridization probes consisting of sequences of DNA or RNA is well known in the art. Typically, to construct a DNA hybridization probe, selected target DNA is obtained as a single strand, and copies of a portion of the strand are synthesized in the laboratory and labeled using radioactive isotopes, fluorescing molecules, photolytic dyes, or enzymes that react with a substrate to produce a color change. When exposed to complementary strands of target DNA, the labeled DNA probe binds to (hybridizes with) its complementary single-stranded DNA sequence. The label on the probe is then detected, and the DNA of interest is thus located. Probes may similarly be used to target RNA sequences. DNA probes are currently well known in the art for locating and selecting genes of known sequence, and for the diagnosis and chemotherapy of genetic disorders and diseases.
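The hybridization step described above can be sketched computationally as a search for the reverse complement of the probe within a single-stranded target. The example below is illustrative only; it assumes perfect Watson-Crick pairing over the full probe length and sequences written 5′ to 3′ using only the letters A, C, G and T. The function names and sequences are hypothetical.

```python
from typing import List

# Watson-Crick complements for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_probe_sites(target: str, probe: str) -> List[int]:
    """Return 0-based start positions in the single-stranded target
    (5'->3') where the probe can pair perfectly along its whole length.

    Assumption: only exact matches count; real hybridization also
    tolerates mismatches depending on stringency.
    """
    site = reverse_complement(probe)
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)  # allow overlapping sites
    return positions


if __name__ == "__main__":
    print(find_probe_sites("ACGTGTCCAATTGTCCAAG", "TTGGAC"))
```

Because the probe pairs in antiparallel fashion, the search string is the reverse complement of the probe rather than the probe itself.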
Oligonucleotides (polynucleotides containing between 10 and 50 bases) equipped with a DNA cleaving moiety have been described which produce sequence-specific cleavage of single-stranded DNA. Examples include oligonucleotide-EDTA-Fe hybridization probes (“DNA-EDTA”), which cleave the complementary single-strand sequence (Dreyer and Dervan, Proc. Natl. Acad. Sci. USA 82:968 (1985); Chu and Orgel, Proc. Natl. Acad. Sci. USA 82:963 (1985)). Such probes are disclosed in U.S. Pat. No. 4,795,700.
In addition to double- and single-stranded configurations, it is well known in the art that nucleic acid triplexes exist naturally (Howard, et al., Biochem. Biophys. Res. Commun. 17:93 (1964)). Poly(U) and poly(A) were found to form a stable 2:1 complex in the presence of MgCl₂. Subsequently, several other triple-stranded structures were discovered (Michelson, et al., Prog. Nucl. Acid Res. Mol. Biol. 6:83 (1967); Felsenfeld and Miles, Annu. Rev. Biochem. 36:407 (1967)). Poly(C) forms a triple-stranded complex at pH 6.2 with guanine oligoribonucleotides. One of the pyrimidine strands is apparently in the protonated form (Howard, et al., supra). In principle, isomorphous base triplets (T-A-T and C-G-C+) can be formed between any homopyrimidine-homopurine duplex and a corresponding homopyrimidine strand (Miller and Sobell, Proc. Natl. Acad. Sci. USA 55:1201 (1966); Morgan and Wells, J. Mol. Biol. 37:63 (1968); Lee, et al., Nucl. Acids Res. 6:3073 (1979)). The DNA duplex poly(dT-dC)-poly(dG-dA) associates with poly(U-C) or poly(dT-dC) below pH 6 in the presence of MgCl₂ to afford a triple-stranded complex. Several investigators have proposed an antiparallel orientation of the two polypyrimidine strands based on an anti conformation of the bases (ibid.). X-ray diffraction patterns of triple-stranded fibers (poly(A)-2poly(U) and poly(dA)-2poly(dT)) support this hypothesis (Arnott and Bond, Nature New Biology 244:99 (1973); Arnott and Selsing, J. Mol. Biol. 85:509 (1974); and Arnott, et al., Nucl. Acids Res. 3:2459 (1976)) and suggest an A′-RNA-like conformation of the two Watson-Crick base-paired strands, with the third strand in the same conformation, bound parallel to the homopurine strand of the duplex by Hoogsteen hydrogen bonds (Hoogsteen, Acta Cryst. 12:822 (1959)).
The twelve-fold helix with dislocation of the axis by almost three angstroms, the C3′-endo sugar puckering, and the small base tilts result in a large and deep major groove that is capable of accommodating the third strand (Saenger, Principles Of Nucleic Acid Structure, edited by C. R. Cantor, Springer-Verlag, New York, Inc. (1984)).
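The base triplets described above (T-A-T and C-G-C+), together with the parallel orientation of the third strand relative to the homopurine strand, imply a simple rule for writing down a homopyrimidine third strand. The sketch below is illustrative only; it assumes a tract consisting solely of A and G, written 5′ to 3′, and the function name is hypothetical.

```python
# Third-strand base that recognizes each purine of the duplex through
# Hoogsteen hydrogen bonds: T for A (T-A-T), protonated C for G (C-G-C+).
HOOGSTEEN_PYRIMIDINE = {"A": "T", "G": "C"}


def third_strand_parallel(purine_tract: str) -> str:
    """Return the homopyrimidine third strand (5'->3') for a homopurine
    tract (5'->3').

    The third strand binds parallel to the purine strand, so the bases
    are mapped position by position without reversing the sequence.
    """
    tract = purine_tract.upper()
    unexpected = set(tract) - set(HOOGSTEEN_PYRIMIDINE)
    if unexpected:
        raise ValueError(f"not a homopurine tract: {sorted(unexpected)}")
    return "".join(HOOGSTEEN_PYRIMIDINE[base] for base in tract)


if __name__ == "__main__":
    print(third_strand_parallel("AAGGAG"))  # TTCCTC
```

Note that the cytosines in the resulting strand must be protonated to form the C-G-C+ triplet, which is consistent with the mildly acidic conditions reported above for complex formation.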
Although triple-stranded structures of polynucleotides were discovered decades ago, their biological significance has remained obscure. Such triplexes have been proposed to be involved in processes such as regulation of gene expression, maintenance of folded chromosome conformations, chromosome condensation during mitosis, and induction of local conformational changes in B-DNA (Morgan, Trends Biochem. Sci. 4:N244 (1979); Hopkins, Comments Mol. Cell Biophys. 2:133 (1984); Minton, J. Path. 2:135 (1985)).
The above-described methods for sequence-specific DNA recognition and cleavage have been limited to single-stranded DNA hybridization probes, to natural or synthetic restriction endonucleases, and to those molecules which recognize sequences of DNA directly such as antibiotics, and DNA intercalators such as methidium.
Based upon the above-described limitations in the recognition of specific sequences in nucleic acids, it is an object herein to provide compositions and methods to detect target sequences within large double-helical nucleic acids without the need to denature such double-helical molecules.
In accordance with these and other objects, the present invention includes improved triple-helices, synthetic oligonucleotides and methods using such oligonucleotides to form triple-helices.
The invention provides improved triple-helices and oligonucleotides wherein one of the strands of the double-helical nucleic acid contains a first purine-rich target sequence and a pyrimidine-rich sequence. The target sequence comprises the first purine-rich target sequence and a second purine-rich target sequence on the other strand of the double-helical DNA which is base-paired with the pyrimidine-rich sequence. The oligonucleotide used to form this alternate strand triple-helix comprises two binding domains. The first domain comprises a pyrimidine-rich portion which binds to the first purine-rich target sequence in a parallel orientation. The second binding domain comprises a purine-rich portion which binds to the second purine-rich target sequence in an antiparallel orientation. The general rules for parallel and antiparallel orientation and uses of non-natural nucleotides are applicable for each of the binding domains used in the triple-helix forming oligonucleotide. The two binding domains in the oligonucleotide have the same 5′ to 3′ orientation such that one end of the oligonucleotide comprises a 5′ nucleotide and the other end comprises a 3′ nucleotide. When the pyrimidine-rich sequence in one strand is adjacent and 5′ to the first purine-rich sequence, a linking domain between the first and second binding domains of the oligonucleotide facilitates triple-helix formation.
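The arrangement of the two binding domains described above can be sketched as follows. This is an illustrative simplification and not the disclosed method: it assumes tracts made purely of purines or purely of pyrimidines, it assumes T-A-T and C-G-C+ triplets for the parallel pyrimidine domain and G-G-C and A-A-T triplets for the antiparallel purine domain (the latter pair is an assumption not stated in this section), and it ignores any structural adjustment at the junction. The function names and the linker placeholder are hypothetical; the composition and length of a real linking domain are not specified here.

```python
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}
PARALLEL_PYRIMIDINE = {"A": "T", "G": "C"}   # T-A-T, C-G-C+
ANTIPARALLEL_PURINE = {"A": "A", "G": "G"}   # assumed A-A-T, G-G-C


def binding_domain(segment: str) -> str:
    """Third-strand domain (5'->3') covering one segment of strand 1,
    where the segment is written 5'->3' on strand 1."""
    seg = segment.upper()
    if set(seg) <= {"A", "G"}:
        # Purine tract on strand 1: pyrimidine domain, parallel to it.
        return "".join(PARALLEL_PYRIMIDINE[b] for b in seg)
    if set(seg) <= {"C", "T"}:
        # The purine tract lies on strand 2. Binding antiparallel to
        # strand 2 means running in the same direction as strand 1,
        # so the order of positions is unchanged.
        return "".join(ANTIPARALLEL_PURINE[WATSON_CRICK[b]] for b in seg)
    raise ValueError("segment must be all-purine or all-pyrimidine")


def alternate_strand_oligo(five_prime_segment: str,
                           three_prime_segment: str,
                           linker: str = "") -> str:
    """Join the two domains in the 5'->3' order of strand 1, with an
    optional caller-supplied linker placeholder between them."""
    return (binding_domain(five_prime_segment)
            + linker
            + binding_domain(three_prime_segment))


if __name__ == "__main__":
    # Strand 1 reads 5'-AAGAGA TCTCCT-3' (purine tract, then pyrimidine tract).
    print(alternate_strand_oligo("AAGAGA", "TCTCCT"))  # TTCTCTAGAGGA
```

In this sketch both domains come out written in the same 5′ to 3′ direction, matching the statement above that the two binding domains share a single orientation. For the case in which the pyrimidine-rich sequence is 5′ to the purine-rich sequence, a placeholder such as `linker="NN"` can be passed to mark where a linking domain would sit.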
The triple-helix forming oligonucleotide and the triple-helix containing it can optionally contain a nucleotide to which at least one moiety is attached. Such a moiety can be a detection moiety so as to permit detection of alternate strand triple-helix formation, a cleaving moiety capable of cleaving the double-helical nucleic acid to localize the site of triple-helix formation or a therapeutic agent wherein triple-helix formation targets the action of the therapeutic agent.
In addition, the invention includes processes for forming the above triple-helices wherein an oligonucleotide capable of forming an alternate strand triple-helix is contacted with a large double-helical nucleic acid to form the alternate strand triple-helix.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate some of the embodiments of the invention and, together with the description, serve to explain the principles of the invention.