The present invention is directed to oligomeric compounds and their constituent monomers, especially peptide nucleic acid (PNA) oligomers and monomers. The peptide nucleic acid oligomers are useful for forming triple helix (triplex) structures with nucleic acids with increased binding specificity. In one aspect of the present invention novel PNA oligomers have increased specificity for thymidine and deoxyuridine in triplex structures.
Peptide nucleic acids are useful surrogates for oligonucleotides in binding to both DNA and RNA. See Egholm et al., Nature, 1993, 365, 566-568, and references cited therein.
PNA binds both DNA and RNA to form PNA/DNA or PNA/RNA duplexes. The resulting PNA/DNA or PNA/RNA duplexes are bound with greater affinity than corresponding DNA/DNA or DNA/RNA duplexes, as evidenced by their higher melting temperatures (Tm). This high thermal stability has been attributed to the neutrality of the PNA backbone, which does not encounter the charge repulsion present in DNA or RNA duplexes. The neutral backbone of the PNA also renders the Tms of PNA/DNA(RNA) duplexes practically independent of salt concentration. Thus the PNA/DNA duplex offers a further advantage over DNA/DNA duplex interactions, which are highly dependent on ionic strength. Homopyrimidine PNAs have been shown to bind complementary DNA or RNA, forming (PNA)2/DNA(RNA) triplexes of high thermal stability (see, e.g., Nielsen, et al., Science, 1991, 254, 1497; Egholm, et al., J. Am. Chem. Soc., 1992, 114, 1895; Egholm, et al., J. Am. Chem. Soc., 1992, 114, 9677).
In addition to increased affinity, PNA has also been shown to bind to DNA with increased specificity. When a mismatch is present in a PNA/DNA duplex, an 8 to 20° C. drop in the Tm is seen. A drop in Tm of this magnitude is not seen with the corresponding DNA/DNA duplex with a mismatch present. See Egholm, M., et al., Nature, 1993, 365, p. 566.
The binding of a PNA strand to a DNA or RNA strand can occur in one of two orientations. The orientation is said to be anti-parallel when the DNA or RNA strand in a 5′ to 3′ orientation binds to the complementary PNA strand such that the carboxyl end of the PNA is directed towards the 5′ end of the DNA or RNA and the amino end of the PNA is directed towards the 3′ end of the DNA or RNA. In the parallel orientation the carboxyl end and amino end of the PNA are in reverse orientation with respect to the 5′-3′ direction of the DNA or RNA.
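By way of illustration only, the two orientations can be sketched in Python for a DNA target. The function name, the lookup table and the convention of writing the PNA from its amino (N) end to its carboxyl (C) end are assumptions introduced for this sketch; only the standard A, C, G and T nucleobases are handled.

```python
# Watson-Crick partners for the four standard DNA nucleobases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def pna_complement(dna_5_to_3: str, orientation: str = "antiparallel") -> str:
    """Return a complementary PNA sequence written N-terminus to C-terminus.

    antiparallel: the PNA carboxyl end faces the 5' end of the DNA, so the
                  PNA read N to C pairs with the DNA read 3' to 5'.
    parallel:     the PNA amino end faces the 5' end of the DNA, so the
                  PNA read N to C pairs with the DNA read 5' to 3'.
    """
    partners = [COMPLEMENT[base] for base in dna_5_to_3.upper()]
    if orientation == "antiparallel":
        partners.reverse()
    elif orientation != "parallel":
        raise ValueError("orientation must be 'antiparallel' or 'parallel'")
    return "".join(partners)


print(pna_complement("AAGG"))              # antiparallel PNA, N to C
print(pna_complement("AAGG", "parallel"))  # parallel PNA, N to C
```

The same target therefore calls for two different PNA sequences depending on which orientation is intended.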
Because of their properties, PNAs are known to be useful in several different applications. In particular, PNAs have been used to form duplexes and triplexes with complementary RNA or DNA (see e.g., Knudsen et al., Nucleic Acids Res., 1996, 24, 494-500; and Nielsen et al., J. Am. Chem. Soc., 1996, 118, 2287-2288). Additionally, several review articles have recently been published in this area. See e.g., Hyrup et al., Bioorganic and Med. Chem., 1996, 4, 5-23; Nielsen, “Peptide nucleic acid (PNA): A lead for gene therapeutic drugs,” in Trainor (Ed.), Perspectives Drug Disc. Des., 1996, 4, 76-84.
Since PNAs have stronger binding and greater specificity than oligonucleotides, they are of great utility as probes in cloning, blotting procedures, and in applications such as fluorescence in situ hybridization (FISH). Homopyrimidine PNAs are used for strand displacement in homopurine targets. The local triplex inhibits gene transcription. Additionally, the restriction sites that overlap with or are adjacent to the D-loop will not be cleaved by restriction enzymes. The binding of PNAs to specific restriction sites within a DNA fragment can inhibit cleavage at those sites. Such inhibition is useful in cloning and subcloning procedures. Labeled PNAs are also used to directly map DNA molecules by hybridizing PNA molecules having a fluorescent or other type of detectable label to complementary sequences in duplex DNA using strand invasion.
PNAs also have been used to detect point mutations in PCR-based assays (PCR clamping), e.g. to distinguish between a common wild type allele and a mutant allele in a segment of DNA under investigation. Typically, a PNA oligomer complementary to the wild type sequence is synthesized and included in the PCR reaction mixture with two DNA primers, one of which is complementary to the mutant sequence. The wild type PNA oligomer and the DNA primer compete for hybridization to the target. Hybridization of the DNA primer and subsequent amplification will only occur if the target is a mutant allele. With this method, the presence and exact identity of a mutant can be determined.
Considerable research is being directed to the application of oligonucleotides and oligonucleotide analogs that bind complementary DNA and RNA strands for use as diagnostics, research reagents and potential therapeutics. For many uses, the oligonucleotides and oligonucleotide analogs must be transported across cell membranes or taken up by cells to express activity.
PCT/EP/01219 describes novel peptide nucleic acid (PNA) compounds which bind complementary DNA and RNA more tightly than the corresponding DNA. It is desirable to append to these compounds groups which modulate or otherwise influence their activity or their membrane or cellular transport. One method for increasing such transport is by the attachment of a pendant lipophilic group.
The synthesis of peptide nucleic acids via preformed monomers has been described in International patent applications WO 92/20702 and WO 92/20703, the contents of each of which are incorporated herein by reference in their entirety. Recent advances have also been reported on the synthesis, structure, biological properties, and uses of PNAs. See for example WO 93/12129 and U.S. Pat. No. 5,539,083 to Cook et al., Egholm et al., Nature, 1993, 365, 566-568, Nielsen et al., Science, 1991, 254, 1497-1500; and Egholm et al., J. Am. Chem. Soc., 1992, 114, 1895-1897. Peptide nucleic acids also have been demonstrated to effect strand displacement of double stranded DNA (see Patel, D. J., Nature, 1993, 365, 490-492). The contents of each of the foregoing patents and publications are incorporated herein by reference in their entirety.
Triple helix formation by oligonucleotides has been an area of intense investigation since sequence-specific cleavage of double-stranded deoxyribonucleic acid (DNA) was demonstrated by Moser et al., Science, 1987, 238, 645-650. Triplex-forming oligonucleotides are believed to be of potential use in gene therapy, diagnostic probing, and other biomedical applications. See e.g., Uhlmann et al., Chemical Reviews, 1990, 90, 543-584.
Pyrimidine oligonucleotides have been shown to form triple helix structures through binding to homopurine targets in double-stranded DNA. In these structures the new pyrimidine strand is oriented parallel to the purine Watson-Crick strand in the major groove of the DNA, and binds through sequence-specific Hoogsteen hydrogen bonds. The sequence-specificity is derived from thymine recognizing adenine (T:A-T) and protonated cytosine recognizing guanine (C+:G-C). See Best et al., J. Am. Chem. Soc., 1995, 117, 1187-1193. In a less well-studied triplex motif, purine-rich oligonucleotides bind to purine targets of double-stranded DNA. The orientation of the third strand in this motif is anti-parallel to the purine Watson-Crick strand, and the specificity is derived from guanine recognizing guanine (G:G-C) and thymine or adenine recognizing adenine (A:A-T or T:A-T). See Greenberg et al., J. Am. Chem. Soc., 1995, 117, 5016-5022.
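The recognition rules of the pyrimidine motif lend themselves to a short worked example. The following Python sketch, offered only as an illustration, derives a parallel third strand for a homopurine target using the T:A-T and C+:G-C rules stated above. The function and table names are hypothetical, and protonation of cytosine is not modeled.

```python
# Hoogsteen recognition in the pyrimidine motif:
# thymine reads adenine (T:A-T), protonated cytosine reads guanine (C+:G-C).
PYRIMIDINE_MOTIF = {"A": "T", "G": "C"}


def hoogsteen_third_strand(purine_strand_5_to_3: str) -> str:
    """Return the third strand, written 5' to 3', for a homopurine target.

    The third strand runs parallel to the purine Watson-Crick strand,
    so the target is read in order and not reversed.
    """
    target = purine_strand_5_to_3.upper()
    pyrimidines = sorted(set(target) - set(PYRIMIDINE_MOTIF))
    if pyrimidines:
        # The motif as described is limited to homopurine targets.
        raise ValueError(f"target contains non-purine bases: {pyrimidines}")
    return "".join(PYRIMIDINE_MOTIF[base] for base in target)


print(hoogsteen_third_strand("AAGAG"))
```

A target containing a pyrimidine is rejected, which reflects the homopurine limitation of this motif.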
Homopyrimidine PNAs form highly stable PNA:DNA-PNA complexes with complementary oligonucleotides. The formation of triple helix structures involving two PNA strands and one nucleotide strand has been previously reported in U.S. patent application Ser. No. 08/088,661, filed Jul. 2, 1993, entitled Double-Stranded Peptide Nucleic Acids, the contents of which are incorporated herein by reference in their entirety. The formation of triplexes in which the Hoogsteen strand is parallel to the DNA purine target strand is preferred to formation of anti-parallel complexes. This allows for the use of bis-PNAs to obtain triple helix structures with increased pH-independent thermal stability using pseudoisocytosine instead of cytosine in the Hoogsteen strand. See Egholm et al., J. Am. Chem. Soc., 1992, 114, 1895-1897; see also published PCT application WO 96/02558, the entire contents of each of which are incorporated herein by reference.
Peptide nucleic acids have been shown to have higher binding affinities (as determined by their melting temperatures) for both DNA and RNA than that of DNA or RNA to either DNA or RNA. This increase in binding affinity makes these peptide nucleic acid oligomers especially useful as molecular probes and diagnostic agents for nucleic acid species.
It has been previously shown that a carbazole-like 2′-deoxycytidine analog incorporated into oligonucleotides will pair specifically with guanine in complementary RNA in a duplex motif (U.S. Pat. No. 5,502,177, issued Mar. 26, 1996, entitled Pyrimidine Derivatives for Labeled Binding Partners; Matteucci, M. D., von Krosigk, U., Tetrahedron Letters, 1996, 37, 5057-5060; Kuei-Ying, L., et al., J. Am. Chem. Soc., 1995, 117, 3873-3874).
The current limitations in the formation of triplex structures (such as the limitation to homopurine targets) are among the major difficulties for sequence-specific recognition of defined sites of DNA by peptide nucleic acids. See Nielsen, J. Am. Chem. Soc., 1996, 118, 2287-2288. Accordingly, there is a need for new PNA oligomers containing nucleobase-binding moieties that can bind Watson-Crick base pairs, preferentially within the pyrimidine triple helix motif.
Provided in accordance with the present invention are oligomeric compounds, particularly peptide nucleic acids, comprising a moiety having the Formula I: 
wherein:
L is an adenosine-thymidine nucleobase pair recognition moiety;
A is a single bond, a methylene group or a group of formula: 
where:
X is O, S, Se, NR3, CH2 or C(CH3)2;
Y is a single bond, O, S or NR4;
each p, q, r and s is, independently, zero or an integer from 1 to 5;
each R1 and R2 is, independently, selected from the group consisting of hydrogen, (C1-C4)alkyl which may be hydroxy- or alkoxy- or alkylthio-substituted, hydroxy, alkoxy, alkylthio, amino and halogen; and
each R3 and R4 is, independently, selected from the group consisting of hydrogen, (C1-C4)alkyl, hydroxy- or alkoxy- or alkylthio-substituted (C1-C4)alkyl, hydroxy, alkoxy, alkylthio and amino;
B is N or R3-N+, where R3 is as defined above;
E is CR6R7, CHR6CHR7 or CR6R7CH2, where R6 is hydrogen and R7 is selected from the group consisting of the side chains of naturally occurring alpha amino acids, or R6 and R7 are independently selected from the group consisting of hydrogen, (C2-C6)alkyl, aryl, aralkyl, heteroaryl, hydroxy, (C1-C6)alkoxy, (C1-C6)alkylthio, NR3R4 and SR5, where R3 and R4 are as defined above, and R5 is hydrogen, (C1-C6)alkyl, hydroxy-, alkoxy-, or alkylthio-substituted (C1-C6)alkyl, or R6 and R7 taken together complete an alicyclic or heterocyclic system;
D is CR6R7, CH2CR6R7 or CHR6CHR7, where R6 and R7 are as defined above; and
G is -NR3CO-, -NR3CS-, -NR3SO- or -NR3SO2-, in either orientation, where R3 is as defined above.
In preferred embodiments, the monomeric unit has the Formula II: 
wherein:
R8 is H, COCH3 or an amino protecting group;
R9 is hydrogen or a side chain of a naturally occurring amino acid;
R10 is O, NH, O-alkylene or a lysine residue;
W is -(CH2)m- where m is from 0 to about 6, or 
where b is an integer from 0 to 4;
k is from 0 to about 5;
n is 0 or 1;
L has the formula 
Q is CH or N;
R17 is H or C1-C8 alkyl;
each R11 and R12 is, independently, H, C1-C8 alkyl, or halogen;
or R11 and R12 together with the carbon atoms to which they are attached form a phenyl group;
T has the formula: 
j and z are each, independently, from 0 to about 5 with the sum of j and z being from 1 to 7;
M is C(=O), S(O)2, phenyl or P(O)2;
V is NH, S, or CH2; and
a, h and g are each independently 0 or 1.
Also provided in accordance with the present invention are monomeric compounds having the Formula III: 
wherein:
L, A, B, D and E have the meaning described above, and each F is, independently, NHR3 or NPgR3, where R3 is as defined above, and Pg is an amino protecting group.
In preferred embodiments, the monomeric compounds of the invention have the Formula IV: 
wherein:
R8, R9, T, L, W, k and n have the meaning described above, and R13 and R14 are each independently H or a protecting group.
In some preferred embodiments of the compounds of the invention, g and h are each 0. In more preferred embodiments g and h are each 0, and a is 0. In further preferred embodiments a is 0, g is 0, X is NH and h is 1.
In some preferred embodiments, L has one of the formulas: 
In some preferred embodiments R4 and R5 are each H. In further preferred embodiments R4 and R5 together with the atoms to which they are attached form a phenyl ring.
In some preferred embodiments Q is N; and in other preferred embodiments Q is CH.
Preferably T is lower alkyl or alkylamino. In especially preferred embodiments T is -CH2-CH2-NH-, -CH2-, -CH2-CH2-, -O-CH2-CH2-, -O-CH2-CH2-CH2-, or -(CH2)m-.
In other preferred embodiments W has the formula: 
where b is preferably an integer between 0 and 4, with 2 and 3 being particularly preferred. In further preferred embodiments at least one of Cα or Cβ is in the S configuration.
In some preferred embodiments, the compounds of the invention are peptide nucleic acids. In other preferred embodiments the compounds of the invention comprise a plurality of peptide nucleic acid oligomers, preferably 2 oligomers, that are linked by linking groups, wherein at least one of the peptide nucleic acid oligomers comprises a moiety having Formula II. In particularly preferred embodiments two peptide nucleic acid oligomers are linked by a linking moiety, which is preferably one or more 8-amino-3,6-dioxaoctanoic acid groups and more preferably three 8-amino-3,6-dioxaoctanoic acid groups.
Some particularly preferred embodiments of the compounds of the invention have the formula: 
R15 is OH, a protected hydroxyl group, or a protecting group; and
R16 is H or an amino protecting group.
L has the formula: 
Q is CH or N;
R17 is H or C1-C8 alkyl;
each R11 and R12 is, independently, H, C1-C8 alkyl, or halogen;
or R11 and R12 together with the carbon atoms to which they are attached form a phenyl group;
A is a single bond, a methylene group or a group of formula: 
where:
X is O, S, Se, NR3, CH2 or C(CH3)2;
Y is a single bond, O, S or NR4;
each p, q, r and s is, independently, zero or an integer from 1 to 5;
each R1 and R2 is, independently, selected from the group consisting of hydrogen, (C1-C4)alkyl which may be hydroxy- or alkoxy- or alkylthio-substituted, hydroxy, alkoxy, alkylthio, amino and halogen; and
each R3 and R4 is, independently, selected from the group consisting of hydrogen, (C1-C4)alkyl, hydroxy- or alkoxy- or alkylthio-substituted (C1-C4)alkyl, hydroxy, alkoxy, alkylthio and amino.
Also provided in accordance with the present invention are compositions, preferably triplex compounds, comprising a single stranded DNA coding for a sequence suspected of being implicated in a disease state and containing one or more thymine residues; a first peptide nucleic acid oligomer that comprises a region that is complementary to a region of the single stranded nucleic acid; and a second peptide nucleic acid oligomer comprising a sequence that is complementary to a region of the single stranded nucleic acid, the second peptide nucleic acid oligomer having at one or more positions complementary to the thymine residues of the single stranded nucleic acid a residue having a non purine nucleobase, preferably a residue of Formula II.
The present invention also provides methods for forming a triplex compound comprising the steps of:
(a) selecting a single stranded nucleic acid containing one or more thymine residues;
(b) providing a first oligomer that comprises a region that is complementary to a region of the single stranded nucleic acid;
(c) contacting the single stranded nucleic acid and the first oligomer with a second oligomer, where the second oligomer is a peptide nucleic acid oligomer comprising a sequence that is complementary to a region of the single stranded nucleic acid and has at one or more positions complementary to the thymine residues of the single stranded nucleic acid a residue of Formula II, for a time and under conditions effective to form the triple helix compound. Preferably, the first oligomer is PNA or DNA.
In some preferred embodiments of the methods of the invention the first oligomer is oriented antiparallel to the single stranded nucleic acid, and the second oligomer is oriented parallel to the single stranded nucleic acid in the triplex compound. In particularly preferred embodiments the triplex compound has the formula PNA-DNA-PNA.
Preferably, the single stranded nucleic acid is DNA or RNA.
In further preferred embodiments the first oligomer is a peptide nucleic acid, and the first oligomer is linked to the second oligomer by a linking moiety.
In some preferred embodiments of the methods of the invention the first and second oligomers are each from 4 to about 20 nucleobases in length.
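The relationship between the target and the two oligomers in such a method can be sketched in Python under stated assumptions. The symbol "X" is a hypothetical placeholder for a residue of Formula II, the function and table names are invented for this sketch, both oligomers are written from the amino end to the carboxyl end, and the antiparallel first oligomer and parallel second oligomer follow the preferred orientations described above. Target cytosine is rejected because no Hoogsteen partner for it is described here.

```python
# Watson-Crick partners used for the first (antiparallel) oligomer.
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Partners used for the second (parallel) oligomer. "X" is a hypothetical
# placeholder symbol for a residue of Formula II opposite thymine.
SECOND_STRAND = {"A": "T", "G": "C", "T": "X"}


def design_triplex_oligomers(target_5_to_3: str) -> tuple[str, str]:
    """Return (first_oligomer, second_oligomer), each written N to C."""
    target = target_5_to_3.upper()
    unsupported = sorted(set(target) - set(SECOND_STRAND))
    if unsupported:
        raise ValueError(f"no second-strand partner for: {unsupported}")
    # Antiparallel: read the target 3' to 5' while pairing.
    first = "".join(WATSON_CRICK[base] for base in reversed(target))
    # Parallel: read the target 5' to 3' while pairing.
    second = "".join(SECOND_STRAND[base] for base in target)
    return first, second


first, second = design_triplex_oligomers("AAGTAG")
print(first, second)
```

In a linked (bis-PNA) embodiment the two returned sequences would be joined by a linking moiety; the sketch leaves that step out.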
The present invention also provides methods for the detection of a chemical or microbiological entity which contains a known nucleobase sequence comprising:
selecting a nucleobase sequence from the chemical or microbiological entity which contains one or more thymine residues;
providing a PNA oligomer that contains a region that is complementary to the selected nucleobase sequence;
contacting the selected nucleobase sequence of the chemical or microbiological entity and the complementary PNA oligomer with a further peptide nucleic acid oligomer which contains a sequence that is complementary to the selected nucleobase sequence, where the further peptide nucleic acid oligomer has at one or more positions complementary to the thymine residues of the selected nucleobase sequence a residue of Formula II, to form a triple helix compound; and
detecting the triple helix compound.
Methods are also provided for the sequence-specific recognition of a double-stranded polynucleotide, comprising contacting the polynucleotide with a compound having a residue of Formula II.
Methods are also provided for the sequence-specific recognition of a double-stranded polynucleotide, comprising contacting the polynucleotide with an oligomeric compound that binds to the polynucleotide to form a triplex structure, wherein the oligomeric compound comprises a monomeric unit having Formula I, more preferably Formula II.
In one aspect, the present invention provides novel oligomeric compounds, especially peptide nucleic acids, that are useful as research reagents, and as specific probes for complementary nucleic acid. The present invention also provides monomeric synthons useful in the preparation of the oligomeric compounds.
In preferred embodiments the compounds of the invention contain a moiety of Formula I: 
wherein:
L is an adenosine-thymidine nucleobase pair recognition moiety;
A is a single bond, a methylene group or a group of formula: 
where:
X is O, S, Se, NR3, CH2 or C(CH3)2;
Y is a single bond, O, S or NR4;
each p, q, r and s is, independently, zero or an integer from 1 to 5;
each R1 and R2 is, independently, selected from the group consisting of hydrogen, (C1-C4)alkyl which may be hydroxy- or alkoxy- or alkylthio-substituted, hydroxy, alkoxy, alkylthio, amino and halogen; and
each R3 and R4 is, independently, selected from the group consisting of hydrogen, (C1-C4)alkyl, hydroxy- or alkoxy- or alkylthio-substituted (C1-C4)alkyl, hydroxy, alkoxy, alkylthio and amino;
B is N or R3-N+, where R3 is as defined above;
E is CR6R7, CHR6CHR7 or CR6R7CH2, where R6 is hydrogen and R7 is selected from the group consisting of the side chains of naturally occurring alpha amino acids, or R6 and R7 are independently selected from the group consisting of hydrogen, (C2-C6)alkyl, aryl, aralkyl, heteroaryl, hydroxy, (C1-C6)alkoxy, (C1-C6)alkylthio, NR3R4 and SR5, where R3 and R4 are as defined above, and R5 is hydrogen, (C1-C6)alkyl, hydroxy-, alkoxy-, or alkylthio-substituted (C1-C6)alkyl, or R6 and R7 taken together complete an alicyclic or heterocyclic system;
D is CR6R7, CH2CR6R7 or CHR6CHR7, where R6 and R7 are as defined above; and
G is -NR3CO-, -NR3CS-, -NR3SO- or -NR3SO2-, in either orientation, where R3 is as defined above.
In more preferred embodiments the compounds of the present invention contain a moiety of Formula II: 
wherein:
R8 is H, COCH3 or an amino protecting group;
R9 is hydrogen or a side chain of a naturally occurring amino acid;
R10 is O, NH, O-alkylene or a lysine residue;
W is -(CH2)m- where m is from 0 to about 6, or 
where b is an integer from 0 to 4;
k is from 0 to about 5;
n is 0 or 1;
L has the formula 
Q is CH or N;
R17 is H or C1-C8 alkyl;
each R11 and R12 is, independently, H, C1-C8 alkyl, or halogen;
or R11 and R12 together with the carbon atoms to which they are attached form a phenyl group;
T has the formula: 
j and z are each, independently, from 0 to about 5 with the sum of j and z being from 1 to 7;
M is C(=O), S(O)2, phenyl or P(O)2;
V is NH, S, or CH2; and
a, h and g are each independently 0 or 1.
Preferred embodiments of the compounds of the invention include oligomeric compounds that contain one or more moieties of Formula II. There can be as few as one moiety of Formula II in the oligomer, or the majority of monomeric units in the oligomer can be moieties of Formula II.
Further preferred embodiments of the compounds of the invention include two PNA oligomers that are linked together by one or more linking moieties, wherein one or both of the PNA oligomers contain at least one moiety of Formula II (“bis-PNA oligomers”). The present invention also includes higher order linked PNA oligomers, wherein a plurality of PNA oligomers are linked by linking moieties, wherein one or more of the linked PNA oligomers contain at least one moiety of Formula II.
As used herein, the term “peptide nucleic acid” (PNA) refers to compounds that in some respects are analogous to oligonucleotides, but which differ in structure. In peptide nucleic acids, the deoxyribose backbone of oligonucleotides has been replaced with a backbone having peptide linkages. Each subunit has attached a naturally occurring or non-naturally occurring base. One such backbone is constructed of repeating units of N-(2-aminoethyl)glycine linked through amide bonds.
The present invention also provides PNA monomers, which are useful, for example, in the preparation of the PNA oligomers of the invention. In some preferred embodiments the PNA monomers of the present invention have an achiral backbone. One preferred example of an achiral PNA backbone is the 2-aminoethylglycine backbone. See, for example, International patent applications WO 92/20702 and WO 92/20703, the contents of each of which are incorporated herein by reference.
In other preferred embodiments, the invention provides PNA monomers containing a chiral backbone. In some preferred embodiments, chirality is introduced into the PNA backbone through the incorporation of an aliphatic cyclic structure. In one particularly preferred embodiment, the aliphatic cyclic structure includes the α and β carbons of the 2-aminoethyl portion of an aminoethylglycine backbone, and has the formula: 
where b is an integer from 0 to 4; α denotes the carbon that is adjacent to the glycyl amino group; and β denotes the carbon that is adjacent to the α carbon. The aliphatic cyclic structure may be a 4, 5, 6 or 7 membered ring. In preferred embodiments the aliphatic cyclic structure is a 5 or 6 membered ring, with 6 being especially preferred.
The use of optically active reagents permits the synthesis of pure SS, RR, SR, and RS isomers. The SS isomer is preferred in some embodiments of the present invention.
Typically, monomers having a chiral backbone are prepared using (1,2)-diaminocyclohexane, which is available as the cis or the trans isomer. The cis-(1,2)-diaminocyclohexane is a meso compound. Use of such a meso compound requires resolution of a racemic mixture. The trans-(1,2)-diaminocyclohexane is commercially available in enantiomerically pure form, making it well suited for monomers of predetermined chirality about both the Cα and the Cβ of the 2-aminoethyl portion of the backbone.
The diamine is typically protected at one of the amino groups with di-t-butylpyrocarbonate (Boc2O), followed by N-alkylation with methyl bromoacetate to give the chiral backbone. Coupling of a ligand (suitably protected where necessary) with the chiral backbone using DCC/DhbtOH followed by basic hydrolysis will give the desired monomer containing the chiral backbone. In this manner the SS and RR monomers may be synthesized. The RS and the SR isomers can be synthesized using the cis-(1,2)-diaminocyclohexane, and resolving the resulting racemic mixture. Resolution can be achieved, for example, by liquid chromatography.
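The correspondence between the four backbone configurations and the diamine starting material can be summarized in a brief Python sketch. It is offered only as a bookkeeping aid; the function names and the returned labels are assumptions made for this illustration.

```python
from itertools import product


def backbone_configurations() -> list[str]:
    """List the four configurations about the C-alpha and C-beta carbons."""
    return ["".join(pair) for pair in product("SR", repeat=2)]


def diamine_route(configuration: str) -> tuple[str, bool]:
    """Return (diamine isomer, resolution needed) for one configuration.

    SS and RR are reached from enantiomerically pure trans diamine.
    RS and SR are reached from the cis (meso) diamine, and the resulting
    racemic mixture must be resolved, for example by liquid chromatography.
    """
    if configuration not in backbone_configurations():
        raise ValueError(f"unknown configuration: {configuration!r}")
    if configuration[0] == configuration[1]:
        return ("trans", False)
    return ("cis", True)


for config in backbone_configurations():
    print(config, diamine_route(config))
```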
The resulting monomer has increased conformational restriction and is expected to have increased lipophilicity. PNA monomers containing chiral backbones are disclosed in copending U.S. Pat. No. 5,972,296, the contents of which are hereby incorporated by reference in their entirety.
PNA oligomers comprising at least one chiral monomer of the invention are prepared in accordance with methods known to those skilled in the art. Established methods for the stepwise or fragmentwise solid-phase assembly of amino acids into peptides normally employ a beaded matrix of slightly cross-linked styrene-divinylbenzene copolymer, the cross-linked copolymer having been formed by the pearl polymerization of styrene monomer to which has been added a mixture of divinylbenzenes. A level of 1-2% cross-linking is usually employed. Such a matrix also can be used in solid-phase PNA synthesis in accordance with the present invention.
In some preferred embodiments, the PNA oligomers of the invention contain one or more chiral monomeric subunits. The PNA oligomers of the invention can contain one chiral subunit, a plurality of chiral subunits, or can be composed primarily or entirely of chiral subunits.
Preferably, the PNA oligomer is prepared to be complementary to a target molecule, i.e. at least a portion of the PNA oligomer has the ability to hybridize due to Watson-Crick base pair attraction to the target molecule, or due to Hoogsteen hydrogen bonds in triplex structures.
In preferred embodiments the aminoalkyl nitrogen of the PNA backbone can bear a substituent, which is denoted R8 in Formulas II and IV. Preferably, R8 is hydrogen, COCH3, or an amino protecting group.
Functional groups present on the compounds of the invention may contain protecting groups. Protecting groups are known per se as chemical functional groups that can be selectively appended to and removed from functionalities, such as amino groups and carboxyl groups. These groups are present in a chemical compound to render such functionality inert to chemical reaction conditions to which the compound is exposed. Any of a variety of protecting groups may be employed with the present invention. One preferred protecting group for amino groups is the Boc group. Other preferred protecting groups according to the invention may be found in Greene, T. W. and Wuts, P. G. M., “Protective Groups in Organic Synthesis” 2d. Ed., Wiley and Sons, 1991.
Substituent R9 is hydrogen, or the side chain of a naturally occurring amino acid. As used herein, the term “amino acid” denotes a molecule containing both an amino group and a carboxyl group, and has the general formula CH(COOH)(NH2)-(side chain). A naturally occurring amino acid is an amino acid that is found in nature; i.e., one that is produced by living organisms. One representative amino acid side chain is the lysyl side chain, -(CH2)4-NH2. Other representative naturally occurring amino acids can be found, for example, in Lehninger, Biochemistry, Second Edition, Worth Publishers, Inc, 1975, pages 73-77.
In preferred embodiments R10 is O, NH, O-alkylene, or a lysine residue. As used herein, the term “alkyl” includes straight-chain, branched and cyclic hydrocarbon groups such as, for example, ethyl, isopropyl and cyclopropyl groups. Preferred alkyl groups have 1 to about carbon atoms. The term “alkylene” denotes divalent alkyl groups; i.e., methylene (-CH2-), ethylene (-CH2CH2-), propylene (-CH2CH2CH2-), etc.
The term “alkoxy” has its accustomed meaning as an -O-alkyl group. An “alkylthio” group denotes a group of formula -S-alkyl. Halogens include fluorine, chlorine, bromine, and iodine.
The term “aryl” is intended to denote monocyclic and polycyclic aromatic groups including, for example, phenyl, naphthyl, xylyl, pyrrole, and furyl groups. Although aryl groups (e.g., imidazole groups) can include as few as 3 carbon atoms, preferred aryl groups have 6 to about 14 carbon atoms, more preferably 6 to about 10 carbon atoms. Aralkyl and alkaryl groups according to the invention each include alkyl and aryl portions. Aralkyl groups are attached through their alkyl portions, and alkaryl groups are attached through their aryl portions. Benzyl groups provide one example of an aralkyl group, and p-toluyl provides an example of an alkaryl group. As used herein, the term “heterocyclic” denotes a ring system that includes at least one hetero atom, such as nitrogen, sulfur or oxygen. The term “heteroaryl” specifically denotes aryl heterocyclic groups.
In the context of this invention, the term “polynucleotide” refers to an oligomer or polymer of ribonucleic acid or deoxyribonucleic acid.
In the PNA monomers of the present invention, an adenosine-thymidine nucleobase pair recognition moiety is connected to the PNA backbone by a tether, denoted as substituent A in Formula I. In some preferred embodiments, the tether terminates in a carbonyl group, which is preferably attached to a nitrogen atom of the PNA backbone. In more preferred embodiments the tether terminates in a carbonyl group which is attached to the glycyl nitrogen of a 2-aminoethylglycine backbone. In further preferred embodiments the tether terminates in a carbonyl group which is attached to the glycyl nitrogen of a chiral derivative of a 2-aminoethylglycine backbone, wherein the α and β carbons of the 2-aminoethyl portion of the aminoethylglycine backbone participate in an alicyclic ring, as described above.
In some preferred embodiments the portion of the tether attached to the PNA-bound carbonyl group has the formula: 
where j and z are each, independently, from 0 to about 5 with the sum of j and z being from 1 to 7; G is C(=O), S(O)2, phenyl or P(O)2; X is NH, S, or CH2; and a, h and g are each independently 0 or 1. In some preferred embodiments, the tether is alkyl or alkylamino, preferably having fewer than about six carbons, with two carbons being especially preferred. Particularly preferred tethers include -CH2-CH2-NH-, -CH2-, -CH2-CH2-, -O-CH2-CH2-, and -O-CH2-CH2-CH2- groups. It is desirable to select the tether such that the ligand has the proper placement and orientation to maximize the interaction between the ligand and the AT pair (especially thymine) residing on complementary positions in the triplex structures. Preferably, the tether is linear and contains from 3 to 6 atoms in the linear chain, more preferably 4 or 5 atoms, with 4 being especially preferred.
In some preferred embodiments, at least one PNA monomer having a chiral center in the ethyl portion thereof is incorporated into the PNA oligomer at the site where a mismatch (i.e. variability of the target molecule) is expected or known to occur.
The PNA oligomers of the invention can have a variety of substituents attached thereto. For example, in some preferred embodiments the oligomers of the invention have a conjugate group attached, to afford easier detection or transport of the PNA. The conjugate group can be a reporter enzyme, a reporter molecule, a steroid, a carbohydrate, a terpene, a peptide, a protein, an aromatic lipophilic molecule, a non-aromatic lipophilic molecule, a phospholipid, an intercalator, a cell receptor binding molecule, a crosslinking agent, a water soluble vitamin, a lipid soluble vitamin, an RNA cleaving complex, a metal chelator, a porphyrin, an alkylator, or a polymeric compound such as a polymeric amine, a polymeric glycol or a polyether. PNAs of the present invention can include one or more conjugates attached directly or through an optional linking moiety. When so derivatized, the PNA is useful, for example, as a diagnostic or therapeutic agent, to render other properties to a complementary nucleic acid or triplex in a test structure or to transfer a therapeutic or diagnostic agent across cellular membranes.
The conjugate group can be attached to the PNA oligomers of the invention anywhere on the PNA backbone, either on the monomeric unit of Formula II, or elsewhere on the PNA. The conjugate group can be attached to a monomer, and incorporated into the PNA oligomer. Alternatively, the conjugate group can be attached to the PNA oligomer after assembly from constituent monomers. Methods for the attachment of conjugate groups can be found in copending U.S. application Ser. No. 08/319,411, filed Oct. 6, 1994, the contents of which are incorporated by reference in their entirety.
In some particularly preferred embodiments, the oligomeric compounds of the invention bear a reporter molecule such as a chromophore or a fluorophore, for example fluorescein or rhodamine. For example, PNA oligomers of the invention, including those having at least one chiral monomer, are easily derivatized to include a fluorescein or rhodamine using an aminohexanoic linker group. These derivatized PNA oligomers are well suited for use as probes for a section of DNA of interest. Those skilled in the art will appreciate that the present invention is amenable to a variety of other types of labeling reagents and linkers.
The adenosine-thymidine nucleobase pair recognition moieties of the compounds of the present invention, represented by substituent L in Formulas I-IV, are surrogates for nucleobases that are ordinarily found in triple helix strands at positions complementary to thymidine (i.e., complementary to adenosine-thymidine base pairs). The adenosine-thymidine nucleobase pair recognition moieties of the invention can also be used as surrogates for thymine in antisense applications, by both duplex and triplex motifs.
Triplex structures which incorporate oligomers of the invention, which have a monomeric unit containing an adenosine-thymidine nucleobase pair recognition moiety at a position complementary to a Watson-Crick adenosine-thymidine base pair (that is, which have a monomeric unit of Formula I or II at a position complementary to the Watson-Crick adenosine-thymidine base pair), display increased binding (i.e., have a higher melting temperature) relative to otherwise identical triplex structures having a natural nucleobase at the site complementary to the Watson-Crick adenosine-thymidine base pair. Accordingly, the compounds of the invention are able to “recognize” thymine residues in triple helix structures by this increased binding. Thus, the term “adenosine-thymidine nucleobase pair recognition moiety” as defined herein is a non-natural heterocyclic moiety that, when substituted for a natural nucleobase able to bind to Watson-Crick base pairs in triple helix structures, forms a triplex structure having increased binding relative to identical triplexes not having the adenosine-thymidine nucleobase pair recognition moiety.
In some preferred embodiments of the invention the adenosine-thymidine nucleobase pair recognition moiety is a C-pyrimidine (e.g. a pyrimidine in which the linkage connecting the pyrimidine to the backbone either with or without a tether is made through a carbon and not a nitrogen atom) or an iso-pyrimidine.
The adenosine-thymidine nucleobase pair recognition moieties and tethers of the compounds of the invention, represented by -T-L in Formulas I-IV, are selected to afford the maximum affinity for complementary thymine residues in the DNA portion of triplex structures, for example PNA:DNA-PNA triplex structures. Although not wishing to be bound by a specific theory, it is believed that in order to recognize thymine of a T-A base-pair in the major groove of a Watson and Crick double helix structure, it is preferred that the ligand be connected to the PNA backbone with a tether that allows sufficient freedom to circumvent the 5-methyl group of thymine. In addition, the selected ligand preferably has a hydrogen donor that can bind to the 4-oxo group of thymine. A further useful feature is the presence of a second functionality, located on or attached to the ligand, that can act as a hydrogen acceptor to form a hydrogen bond to the N-6 hydrogen atoms of adenine. It is believed that the compounds of the present invention that recognize thymine in a Watson and Crick double helix structure possess these attributes.
Adenosine-thymidine nucleobase pair recognition moieties can be selected by methods known to those of skill in the art. For example, adenosine-thymidine nucleobase pair recognition moieties can be selected by appropriate computer modeling programs, such as the Insight II and Discover programs, available from Biosym, San Diego, Calif.
In some preferred embodiments adenosine-thymidine nucleobase pair recognition moieties have the formula: 
In other preferred embodiments adenosine-thymidine nucleobase pair recognition moieties have the formula: 
wherein
Q is CH or N;
R17 is H or C1-C8 alkyl;
each R11 and R12 is, independently, H, C1-C8 alkyl, or halogen;
or R11 and R12 together with the carbon atoms to which they are attached form a phenyl group.
In another preferred embodiment, the tether connecting the adenosine-thymidine nucleobase pair recognition moiety, A in Formula I or T in Formula II, is -CH2-CH2-NH-, -CH2-, -CH2-CH2-, -O-CH2-CH2-, or -O-CH2-CH2-CH2-.
In particularly preferred embodiments, monomers of the present invention have the formula: 
L has the formula: 
Q is CH or N;
R17 is H or C1-C8 alkyl;
each R11 and R12 is, independently, H, C1-C8 alkyl, or halogen;
or R11 and R12 together with the carbon atoms to which they are attached form a phenyl group;
A is a single bond, a methylene group or a group of formula: 
where:
X is O, S, Se, NR3, CH2 or C(CH3)2;
Y is a single bond, O, S or NR4;
each p, q, r and s is, independently, zero or an integer from 1 to 5;
each R1 and R2 is, independently, selected from the group consisting of hydrogen, (C1-C4)alkyl which may be hydroxy- or alkoxy- or alkylthio-substituted, hydroxy, alkoxy, alkylthio, amino and halogen;
each R3 and R4 is, independently, selected from the group consisting of hydrogen, (C1-C4)alkyl, hydroxy- or alkoxy- or alkylthio-substituted (C1-C4)alkyl, hydroxy, alkoxy, alkylthio and amino;
R15 is OH, a protected hydroxyl group, or a protecting group; and
R16 is H or a protecting group.
The PNA oligomers and linked PNA oligomers of the present invention are useful for forming PNA2/DNA triple helix structures. Preferred embodiments of the compounds of the invention include triple helix (i.e., triplex) PNA.DNA-PNA structures in which at least one of the PNA strands contains at least one monomer moiety in accordance with Formula I, preferably Formula II. In more preferred embodiments the two PNA oligomers in the PNA.DNA-PNA are constituent members of a bis-PNA; i.e., the two PNA oligomers are linked together by one or more linking moieties (“linking groups”).
Linking moieties that link PNA oligomers in compounds of the invention are selected such that the PNA oligomers have sufficient freedom to permit formation of the triplex structure. A variety of groups can be used as linking moieties, for example “egl groups” (ethylene glycol) and “Aha groups” (6-amino hexanoic acid) linked together by amino acid groups. A further linking segment includes the above Aha groups interspaced with α-amino acids, particularly glycine or lysine. One especially preferred linking moiety is one or more 8-amino-3,6-dioxaoctanoic acid residues which, upon coupling, effectively give “multiple ethylene glycol units” (“egl”).
A wide range of other compounds are also useful for the linking segment and thus are included within the scope of the present invention. Generally the linking segment is a compound having a primary amino group and a carboxy group separated with a space spanning group, wherein the space spanning group consists of one or more functional groups. Some representative space spanning groups are C1 to C20 alkyl, C2 to C20 alkenyl, C2 to C20 alkynyl, C1 to C20 alkanoyl having at least one O or S atom, C7 to C34 aralkyl, C6-C14 aryl and amino acids. Preferred alkanoyl groups can have from 1 to 10 hetero atoms such as O or S. Preferred alkanoyl groups include methyl, ethyl and propyl alkanoxy, particularly polyethoxy, i.e., ethylene glycol. Amino acids including D, L, and DL isomers of α-amino acids as well as longer chain amino acids may also be linked together to form a linking segment. A particularly preferred amino acid is 6-amino hexanoic acid. Aralkyl groups used as space spanning groups may have the amino or the carboxy group located on the aromatic ring or spaced with one or more CH2 groups wherein the total number of CH2 groups is less than or equal to twenty. The position of substitution in an aralkyl linked PNA may be varied; however, ortho and meta are presently preferred because substitution at these positions, especially ortho, induces the bis PNA to be bent, thus facilitating location of the two joined peptide nucleic acid strands in spatial locations parallel to one another. Another group of bis PNAs that include induced bends are those that incorporate cis-alkenyl linkers or a proline linker.
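The carbon-count ranges listed above for representative space spanning groups lend themselves to a simple lookup. The Python sketch below is offered only as an illustration: the names `SPACER_CARBON_RANGES` and `spacer_in_range` are hypothetical, amino acid spacers are not modeled, and the `heteroatoms` argument is assumed to count O or S atoms in an alkanoyl group.

```python
# Carbon-count ranges for representative space spanning groups, as listed in the text.
SPACER_CARBON_RANGES = {
    "alkyl": (1, 20),
    "alkenyl": (2, 20),
    "alkynyl": (2, 20),
    "alkanoyl": (1, 20),  # the text also requires at least one O or S atom
    "aralkyl": (7, 34),
    "aryl": (6, 14),
}


def spacer_in_range(kind: str, carbons: int, heteroatoms: int = 0) -> bool:
    """Return True if a spacer of the given kind falls within the listed range.

    Assumption: `heteroatoms` is the number of O or S atoms (alkanoyl only).
    """
    low, high = SPACER_CARBON_RANGES[kind]
    # An alkanoyl spacer must contain at least one O or S atom.
    if kind == "alkanoyl" and heteroatoms < 1:
        return False
    return low <= carbons <= high
```

A table of this kind only screens carbon counts; it says nothing about compatibility with PNA chemistry or about linker flexibility.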
In selecting a linking segment, one consideration is compatibility with PNA chemistry, and the ability to link a functional group on one end of a PNA to a functional group on one end of a second PNA. Also, the linking segment can be selected so as to be flexible, such that the two linked PNAs are able to interact with ssDNA, ssRNA or dsDNA in similar fashion to the way that two independent PNA single strands would so interact. Some preferred linking segments that have been shown to be effective have lengths of 23 and 24 atoms.
The term “complementary” as used herein has its accustomed meaning as the ability to form either Watson-Crick or Hoogsteen bonds within a nucleic acid (RNA or DNA) duplex, a PNA-nucleic acid duplex, or a triplex structure including nucleic acid, PNA, or mixtures thereof.
In the PNA2/DNA compounds of the invention, PNA oligomers are typically prepared as Watson-Crick anti-parallel strands, and additional PNA oligomers are prepared as parallel (e.g. Hoogsteen) strands. In one preferred embodiment, the PNA monomers of the invention are used in the Hoogsteen strand in positions that are complementary to thymine or uracil in the target nucleic acid to increase the binding of the Hoogsteen strand, and hence increase the melting temperature (Tm) of the resulting triple helix that is formed.
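The strand arrangement described above can be sketched in Python. This is a simplified illustration under stated assumptions, not a design tool: the function name `plan_strands` is hypothetical, PNA residues are written with the one-letter codes of their nucleobases, and the anti-parallel Watson-Crick strand is written from amino end to carboxyl end, with the amino end facing the 3′ end of the target as defined earlier in this document. The sketch reports only where a Formula I or II monomer would sit in the parallel (Hoogsteen) strand; it does not assign the remaining Hoogsteen residues.

```python
# Watson-Crick partners for DNA or RNA target bases.
WC_PARTNER = {"A": "T", "T": "A", "U": "A", "G": "C", "C": "G"}


def plan_strands(target_5to3: str):
    """Return (watson_crick_pna, hoogsteen_sites) for a target written 5'->3'.

    watson_crick_pna: anti-parallel complement, written amino end -> carboxyl
        end (the amino end faces the 3' end of the target).
    hoogsteen_sites: 0-based target positions holding T or U, i.e. positions
        where the parallel strand would carry a Formula I or II monomer.
    """
    target = target_5to3.upper()
    # Read the target from its 3' end so the PNA is written amino -> carboxyl.
    watson_crick = "".join(WC_PARTNER[base] for base in reversed(target))
    # The parallel strand aligns position-by-position with the 5'->3' target.
    sites = [i for i, base in enumerate(target) if base in ("T", "U")]
    return watson_crick, sites
```

For a target written `AAGTAAG`, the sketch returns the Watson-Crick strand `CTTACTT` and the single Hoogsteen site at position 3, opposite the lone thymine.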
As used herein, the term “binding affinity” refers to the ability of a duplex to bind to a target molecule via hydrogen bonds, van der Waals interactions, hydrophobic interactions, or otherwise. Target molecules include single stranded DNA or RNA, as well as duplexes between DNA, RNA, and their analogs such as PNA.
As used herein, the term “nucleobase” has its accustomed meaning as a heterocyclic base that is capable of participating in Watson-Crick or Hoogsteen bonds in nucleic acid duplex or triplex structures. These include the natural nucleobases adenine, guanine, cytosine, thymine and uracil, as well as unnatural nucleobases (i.e., nucleobase analogs) that are known to mimic the function of the natural nucleobases in DNA or RNA analogs. Representative nucleobase analogs can be found in, for example, Antisense Research and Application, Ed. S. T. Crooke and B. Lebleu, Chapter 15, CRC Press, 1993, and U.S. Pat. No. 3,687,808 to Merigan, et al., the contents of which are hereby incorporated by reference in their entirety.
In some preferred embodiments, the PNA oligomers of the present invention form triple helix structures with nucleic acid targets wherein the PNA oligomer has an increased binding affinity relative to previously reported PNA oligomers. The PNA oligomers and linked PNA oligomers having PNA monomer moieties of Formula I-IV in positions complementary to thymine or uracil in a nucleic acid target show increased binding specificity relative to the same triplex structure formed with linked PNA oligomers having adenine in positions that are complementary to thymine or uracil (see Example 5, infra).
Thus, the PNA oligomers and linked PNA oligomers of the invention find use in applications where it is desired to detect or identify oligonucleotide sequences containing thymidine residues. Accordingly, the PNA oligomers of the present invention are useful as research reagents and as diagnostic tools. In one preferred embodiment, the oligomers and linked oligomers of the invention are useful for the detection of nucleic acid sequences suspected of being implicated in a disease state, which contain one or more thymine residues. Accordingly, the present invention includes triplex structures containing nucleic acid sequences suspected of being implicated in a disease state, and at least one PNA oligomer of the invention.
In some preferred embodiments, compositions of the invention include single stranded DNA coding for a sequence suspected of being implicated in a disease state and containing one or more thymine residues; a first peptide nucleic acid that comprises a region that is complementary to a region of the single stranded nucleic acid; and a second peptide nucleic acid comprising a sequence that is complementary to a region of the single stranded nucleic acid, the second peptide nucleic acid having at one or more positions complementary to the thymine residues of the single stranded nucleic acid a residue of Formula I, preferably Formula II.
The present invention also provides methods for forming a triple helix compound comprising (a) selecting a single stranded nucleic acid containing one or more thymine residues; (b) providing a first oligomer that comprises a region that is complementary to a region of the single stranded nucleic acid; and (c) contacting the single stranded nucleic acid and the first oligomer with a second oligomer, wherein the second oligomer comprises a sequence that is complementary to a region of the single stranded nucleic acid, and has at one or more positions complementary to the thymine residues of the single stranded nucleic acid a residue of Formula I or II, for a time and under conditions effective to form the triple helix compound.
In some preferred embodiments, the single stranded nucleic acid will be selected for its biological activity, such as its pathogenic properties. The first oligomer will then be synthesized to include a region that is complementary to a region of the single stranded nucleic acid, preferably a region with more than one thymine residue. The contacting of the single stranded nucleic acid and the first oligomer with the second oligomer may be accomplished by a variety of means, known in the art, that promote triplex formation. See, for example, U.S. patent application Ser. No. 08/088,661, for in vitro determination of a nucleic acid in a sample which may be made by pipetting one or more solutions containing said first and second oligomer as well as optionally all reagents necessary for effective triplex formation to the sample.
In preferred embodiments, methods are provided for the detection of a chemical or microbiological entity which contains a known nucleobase sequence comprising a) selecting a nucleobase sequence from the chemical or microbiological entity which contains one or more thymine residues; b) providing a PNA oligomer that contains a region that is complementary to the selected nucleobase sequence; c) contacting the selected nucleobase sequence of the chemical or microbiological entity and the complementary PNA oligomer with a further PNA oligomer which contains a sequence that is complementary to the selected nucleobase sequence, wherein the further PNA oligomer has at one or more positions complementary to the thymine residues of the selected nucleobase sequence a residue of Formula I or II to form a triple helix compound; and d) detecting the triple helix compound.
Detection of the triple helix compound can be accomplished by detection of a reporter molecule, such as a chromophore or fluorophore, that is bound covalently or non-covalently to the compound of the invention as described above. Useful conjugates are described in WO 95/14708, WO 95/16202 and WO 92/20703. Alternatively, detection may be by any of several means, including Tm studies, mass measurements and crosslinking studies.
The further PNA oligomer can alternatively have attached a moiety enabling immobilization on a solid support, such as polystyrene. One example of such a moiety is biotin, which can be bound to polystyrene via a coating of streptavidin. Formation of the triple helical structure can then be detected by labels attached either to the other PNA oligomer or to the nucleobase sequence of the chemical entity.
Chemical entities are understood to include chemically synthesized oligomers and chemically or enzymatically (for example via PCR) amplified compounds containing nucleobase sequences. Microbiological entities in connection with this invention are understood to include cells, for example from animals, vertebrates, bacteria or plants, or viruses, like HIV or HBV, plasmids or genomes.
In these analytical methods, the formation of the triplex structure having the further PNA oligomer incorporated is taken as an indication of the presence of the entity in the sample analyzed, for example the sample containing the chemical or microbiological entity.
The present invention also provides methods for sequence-specific recognition of a double-stranded polynucleotide, comprising contacting said polynucleotide with a compound of Formula I or II.
The present invention further provides methods for sequence-specific recognition of a double-stranded polynucleotide, comprising contacting the polynucleotide with an oligomeric compound that binds to the polynucleotide to form a triplex structure, the oligomeric compound comprising a monomeric unit having the Formula I or II.