This invention is directed to compounds that are not polynucleotides yet which bind to complementary DNA and RNA strands more strongly than corresponding polynucleotides. In particular, the invention concerns novel peptide nucleic acid compounds and novel linked peptide nucleic acid compounds wherein naturally-occurring nucleobases or other nucleobase-binding moieties are covalently bound to a polyamide backbone which is covalently linked via a linking moiety to a second similarly substituted polyamide backbone.
Oligonucleotides and their analogs have been developed and used in molecular biology in certain procedures as probes, primers, linkers, adapters, and gene fragments. Modifications to oligonucleotides used in these procedures include labeling with non-isotopic labels, e.g. fluorescein, biotin, digoxigenin, alkaline phosphatase, or other reporter molecules. Other modifications have been made to the ribose phosphate backbone to increase the nuclease stability of the resulting analog. These modifications include use of methyl phosphonates, phosphorothioates, phosphorodithioate linkages, and 2′-O-methyl ribose sugar units. Further modifications include those made to modulate uptake and cellular distribution. Phosphorothioate oligonucleotides are presently being used as antisense agents in human clinical trials for various disease states, including use as antiviral agents. With the success of these oligonucleotides for both diagnostic and therapeutic uses, there exists an ongoing demand for improved oligonucleotide analogs.
Oligonucleotides can interact with native DNA and RNA in several ways. One of these is duplex formation between an oligonucleotide and a single stranded nucleic acid. Another is triplex formation between an oligonucleotide and double stranded DNA; however, to form a triplex structure with a double stranded DNA, the cytosine bases of the oligonucleotide must be protonated, which renders such triplexing pH dependent. P.O.P. Ts'o and associates have used pseudo isocytosine as a permanently protonated analogue of cytosine in DNA triplexing (see Ono, et al., J. Am. Chem. Soc., 1991, 113, 4032-4033; Ono, et al., J. Org. Chem., 1992, 57, 3225-3230). Trapane and Ts'o have also suggested the use of pseudo isocytosine for triplex formation with single-stranded nucleic acid targets (see Trapane, et al., J. Biomol. Strul. Struct., 1991, 8, 229; Trapane, et al., Biophys. J., 1992, 61, 2437; and Trapane, et al., Abstracts Conference on Nucleic Acids Medical Applications, Cancun, Mexico, January 1993). 8-Oxoadenine was also suggested in patent application WO 93/05180 as a replacement for protonated cytosine in triplex formation.
Peptide nucleic acids are compounds that in certain respects are similar to oligonucleotide analogs; in other very important respects, however, their structure is very different. In peptide nucleic acids, the deoxyribose phosphate backbone of oligonucleotides has been replaced with a backbone more akin to a peptide than a sugar phosphodiester. Each subunit has a naturally occurring or non-naturally occurring base attached to this backbone. One such backbone is constructed of repeating units of N-(2-aminoethyl)glycine linked through amide bonds. Because of the radical deviation from the deoxyribose backbone, these compounds were named peptide nucleic acids (PNAs).
PNA binds both DNA and RNA to form PNA/DNA or PNA/RNA duplexes. The resulting PNA/DNA or PNA/RNA duplexes are bound with greater affinity than corresponding DNA/DNA or DNA/RNA duplexes, as determined by Tm values. This high thermal stability might be attributed to the lack of charge repulsion due to the neutral backbone in PNA. The neutral backbone of the PNA also results in the Tm values of PNA/DNA(RNA) duplexes being practically independent of the salt concentration. Thus the PNA/DNA duplex interaction offers a further advantage over DNA/DNA duplex interactions, which are highly dependent on ionic strength. Homopyrimidine PNAs have been shown to bind complementary DNA or RNA, forming (PNA)2/DNA(RNA) triplexes of high thermal stability (see, e.g., Egholm, et al., Science, 1991, 254, 1497; Egholm, et al., J. Am. Chem. Soc., 1992, 114, 1895; Egholm, et al., J. Am. Chem. Soc., 1992, 114, 9677).
In addition to increased affinity, PNA has also been shown to bind to DNA with increased specificity. When a PNA/DNA duplex containing a mismatch is melted, an 8 to 20 °C drop in the Tm is seen relative to the DNA/DNA duplex. A drop in Tm of this magnitude is not seen with the corresponding DNA/DNA duplex with a mismatch present.
The binding of a PNA strand to a DNA or RNA strand can occur in one of two orientations. The orientation is said to be anti-parallel when the DNA or RNA strand in a 5′ to 3′ orientation binds to the complementary PNA strand such that the carboxyl end of the PNA is directed towards the 5′ end of the DNA or RNA and the amino end of the PNA is directed towards the 3′ end of the DNA or RNA. In the parallel orientation the carboxyl end and amino end of the PNA are just the reverse with respect to the 5′-3′ direction of the DNA or RNA.
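The two orientations can be illustrated with a minimal Python sketch. It assumes simple Watson/Crick complementarity and writes the PNA sequence from its amino end to its carboxyl end; the function name and the orientation keywords are hypothetical and are used here only for illustration.

```python
# Watson/Crick partners for DNA (A, C, G, T) and RNA (U) target bases.
WC_PARTNER = {"A": "T", "T": "A", "U": "A", "G": "C", "C": "G"}


def pna_sequence_for_target(target_5_to_3: str, orientation: str) -> str:
    """Return the PNA sequence, written amino end to carboxyl end.

    anti-parallel: the PNA amino end faces the 3' end of the target, so the
                   PNA reads as the reverse complement of the target.
    parallel:      the PNA amino end faces the 5' end of the target, so the
                   PNA reads as the plain complement of the target.
    """
    complement = [WC_PARTNER[base] for base in target_5_to_3.upper()]
    if orientation == "antiparallel":
        complement.reverse()
    elif orientation != "parallel":
        raise ValueError("orientation must be 'parallel' or 'antiparallel'")
    return "".join(complement)


if __name__ == "__main__":
    target = "AAGAG"  # written 5' to 3'
    print(pna_sequence_for_target(target, "antiparallel"))
    print(pna_sequence_for_target(target, "parallel"))
```

The only difference between the two cases is whether the complement is reversed, which mirrors the statement that the carboxyl and amino ends simply swap places relative to the 5′-3′ direction of the target.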
PNAs bind to both single stranded DNA and double stranded DNA. As noted above, in binding to double stranded DNA it has been observed that two strands of PNA can bind to the DNA. While PNA/DNA duplexes are stable in the antiparallel configuration, it was previously believed that the parallel orientation is preferred for (PNA)2/DNA triplexes.
The binding of two single stranded pyrimidine PNAs to a double stranded DNA has been shown to take place via strand displacement, rather than conventional triple helix formation as observed with triplexing oligonucleotides. When PNAs strand-invade double stranded DNA, one strand of the DNA is displaced and forms a loop on the side of the (PNA)2/DNA complex area. The other strand of the DNA is locked up in the (PNA)2/DNA triplex structure. The loop area (alternately referenced as a P loop), being single stranded, is susceptible to cleavage by enzymes that can cleave single stranded DNA.
A further advantage of PNAs compared to oligonucleotides is that their polyamide backbone (having appropriate nucleobases or other side chain groups attached thereto) is not recognized by either nucleases or proteases and is not cleaved. As a result, PNAs, unlike DNA and peptides, are resistant to degradation by enzymes.
Because of their properties, PNAs are known to be useful in a number of different areas. Since PNAs have stronger binding and greater specificity than oligonucleotides, they are used as probes in cloning, blotting procedures, and in applications such as fluorescence in situ hybridization (FISH). Homopyrimidine PNAs are used for strand displacement in homopurine targets. The restriction sites that overlap with or are adjacent to the P-loop will not be cleaved by restriction enzymes. Also, the local triplex inhibits gene transcription. Thus, by binding PNAs to specific restriction sites within a DNA fragment, cleavage at those sites can be inhibited. Advantage can be taken of this in cloning and subcloning procedures. Labeled PNAs are also used to directly map DNA molecules. In effecting this, PNA molecules having a fluorescent label are hybridized to complementary sequences in duplex DNA using strand invasion.
PNAs have further been used to detect point mutations in PCR-based assays (PCR clamping), e.g. to distinguish between a common wild type allele and a mutant allele in a segment of DNA under investigation. A PNA oligomer complementary to the wild type sequence is synthesized. The PCR reaction mixture contains this PNA and two DNA primers, one of which is complementary to the mutant sequence. The wild type PNA oligomer and the DNA primer compete for hybridization to the target. Hybridization of the DNA primer and subsequent amplification will only occur if the target is a mutant allele. With this method, one can determine the presence and exact identity of a mutant.
It is an object of this invention to provide compounds that bind ssDNA, dsDNA and ssRNA nucleic acids to form complexes with improved thermal stability, specificity, and other properties relative to corresponding DNA.
It is a further object of this invention to provide compounds that bind nucleic acids via strand invasion using two sequences of PNA which may be linked together to form a bis PNA wherein one strand binds anti-parallel relative to the target utilizing Watson/Crick type hydrogen bonds and the second strand binds parallel relative to the target utilizing Hoogsteen type hydrogen bonds.
It is a further object of this invention to provide PNAs and bis PNAs wherein C-pyrimidine heterocyclic bases or iso pyrimidine heterocyclic bases are substituted in place of at least one pyrimidine heterocyclic base.
It is a further object of this invention to provide compounds that bind nucleic acids via strand invasion using two sequences of PNA which may be linked together wherein the cytosines of the parallel strand relative to the target have been replaced with pseudo isocytosines to form a bis PNA wherein one strand binds anti-parallel relative to the target forming Watson/Crick type hydrogen bonds and the second strand binds parallel relative to the target forming Hoogsteen type hydrogen bonds.
It is a further object of this invention to provide bis PNA structures wherein the cytosine nucleobases are replaced with pseudo isocytosines in the Hoogsteen strand.
It is a further object of this invention to provide therapeutic, diagnostic, and prophylactic methods that employ such compounds.
The present invention is directed to modified peptide nucleic acids, especially PNAs that are linked via a linking segment. Such PNAs have been given the shorthand name “bis peptide nucleic acids” or “bis PNAs.” The present invention is also directed to modified peptide nucleic acids that incorporate certain non-natural nucleobases for Hoogsteen type base pairing. These modified peptide nucleic acids are particularly useful for diagnostic uses, including the identification of certain sites in double stranded DNA, restriction enzyme sites, transcription inhibition, clamping to detect point mutations and for use in Hoogsteen strands in a triplexing motif.
In accordance with this invention there are provided compounds that include a peptide nucleic acid that has at least one peptide nucleic acid monomeric unit having a pyrimidine heterocyclic base that is a C-pyrimidine heterocyclic base or an iso-pyrimidine heterocyclic base. In certain preferred embodiments of this invention the pyrimidine heterocyclic base is a C-pyrimidine heterocyclic base. In other preferred embodiments of this invention the pyrimidine heterocyclic base is pseudo-isocytosine. In a further embodiment of the invention the C-pyrimidine heterocyclic base is pseudo-uracil, 5-bromouracil, iso-cytosine or other iso-pyrimidine heterocyclic base.
Compounds of the invention, including compounds having C-pyrimidines and iso-pyrimidine heterocyclic bases, include compounds of formula I: 
wherein:
n is at least 2,
each of L1-Ln is independently selected from the group consisting of hydrogen, hydroxy, (C1-C4) alkanoyl, naturally occurring nucleobases, non-naturally occurring nucleobases, aromatic moieties, DNA intercalators, nucleobase-binding groups, heterocyclic moieties, and reporter ligands;
each of C1-Cn is (CR6R7)y where R6 is hydrogen and R7 is selected from the group consisting of the side chains of naturally occurring alpha amino acids, or R6 and R7 are independently selected from the group consisting of hydrogen, (C2-C6)alkyl, aryl, aralkyl, heteroaryl, hydroxy, (C1-C6)alkoxy, (C1-C6) alkylthio, NR3R4 and SR5, where R3 and R4 are each independently selected from the group consisting of hydrogen, (C1-C4)alkyl, hydroxy- or alkoxy- or alkylthio-substituted (C1-C4) alkyl, hydroxy, alkoxy, alkylthio and amino, and R5 is hydrogen, (C1-C6)alkyl, hydroxy-, alkoxy-, or alkylthio-substituted (C1-C6)alkyl, or R6 and R7 taken together complete an alicyclic or heterocyclic system;
each of D1-Dn is (CR6R7)z where R6 and R7 are as defined above;
each of y and z is zero or an integer from 1 to 10, the sum y+z being greater than 1 but not more than 10;
each of G1-Gn-1 is -NR3CO-, -NR3CS-, -NR3SO- or -NR3SO2-, in either orientation, where R3 is as defined above;
each of A1-An and B1-Bn are selected such that:
(a) A is a group of formula (IIa), (IIb), (IIc) or (IId), and B is N or R3N+; or
(b) A is a group of formula (IId) and B is CH; 
where:
X is O, S, Se, NR3, CH2 or C(CH3)2;
Y is a single bond, O, S or NR4;
each of p and q is zero or an integer from 1 to 5, the sum p+q being not more than 10;
each of r and s is zero or an integer from 1 to 5, the sum r+s being not more than 10;
each R1 and R2 is independently selected from the group consisting of hydrogen, (C1-C4)alkyl which may be hydroxy- or alkoxy- or alkylthio-substituted, hydroxy, alkoxy, alkylthio, amino and halogen; and
each R3 and R4 are as defined above;
Q is -CO2H, -CONR′R″, -SO3H or -SO2NR′R″ or an activated derivative of -CO2H or -SO3H; and
I is -NHR′″R′″R″″ or -NR′″C(O)R″″, where R′, R″, R′″ and R″″ are independently selected from the group consisting of hydrogen, alkyl, amino protecting groups, reporter ligands, intercalators, chelators, peptides, proteins, carbohydrates, lipids, steroids, nucleosides, nucleotides, nucleotide diphosphates, nucleotide triphosphates, oligonucleotides, oligonucleosides and soluble and non-soluble polymers.
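The numeric limits stated for formula I can be checked mechanically. The following Python sketch is only an illustration of those limits for a single monomeric unit; the function name and the wording of the messages are hypothetical, and the sketch says nothing about the chemical identity of the groups themselves.

```python
def check_formula_i_indices(n, y, z, p, q, r, s):
    """Return a list of violated index limits (empty when all limits hold).

    Limits taken from the definitions above:
      n is at least 2;
      y and z are each 0 to 10, with 1 < y + z <= 10;
      p, q, r and s are each 0 to 5, with p + q <= 10 and r + s <= 10.
    """
    problems = []
    if n < 2:
        problems.append("n must be at least 2")
    if not (0 <= y <= 10 and 0 <= z <= 10):
        problems.append("y and z must each be 0 to 10")
    if not (1 < y + z <= 10):
        problems.append("y + z must be greater than 1 and not more than 10")
    for label, first, second in (("p, q", p, q), ("r, s", r, s)):
        if not (0 <= first <= 5 and 0 <= second <= 5):
            problems.append(f"{label} must each be 0 to 5")
        if first + second > 10:
            problems.append(f"the sum of {label} must not exceed 10")
    return problems


if __name__ == "__main__":
    # y = z = 1 and p = q = 1 satisfy every stated limit.
    print(check_formula_i_indices(n=10, y=1, z=1, p=1, q=1, r=0, s=0))
```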
Peptide nucleic acid compounds of the invention further include compounds of structure III, IV or V: 
wherein:
each L is independently selected from the group consisting of hydrogen, phenyl, heterocyclic moieties, naturally occurring nucleobases, and non-naturally occurring nucleobases;
each R7′ is independently selected from the group consisting of hydrogen and the side chains of naturally occurring alpha amino acids;
n is an integer greater than 1,
each k, l, and m is, independently, zero or an integer from 1 to 5;
each p is zero or 1;
Rh is OH, NH2 or -NHLysNH2; and
Ri is H or COCH3.
Further in accordance with this invention there are provided compounds having a first and a second peptide nucleic acid segment that are joined together via at least one linking segment that is not a peptide nucleic acid or an oligonucleotide.
In preferred embodiments of the invention, the linking segment includes a linear structure having a carboxylic acid functional group on one end thereof and a primary amino functional group on the other end thereof. Preferred linking segments include at least one unit of the structure:
-[HN-Z-COO]n-
wherein n is 1 to 3; and Z is C1-C20 alkyl, C2-C20 alkenyl, C2-C20 alkynyl, C1-C20 alkanoyl having at least one O or S hetero atom, C1-C17 aryl, or C7-C34 aralkyl.
In a more preferred embodiment, the linking segment includes at least one aminoalkylcarboxylic acid of the formula:
-NH-(CH2)e-C(=O)-
where e is 1 to 15.
In certain preferred embodiments e is from 4 to 8. In a more preferred embodiment e is 5 or 6.
In other preferred embodiments, the linking segment includes structures of the immediately preceding formula and at least one further α-amino acid such that they are of formula:
-(AA)h-[NH-(CH2)e-C(=O)-(AA)f]g-
where:
AA is an α-amino acid;
e is 4 to 8;
f and h are 0 or 1; and
g is 1 to 4.
In further preferred embodiments, the linking segment includes at least one unit of a glycol amino acid. The glycol amino acid is formed of glycol sub-units linked together in a linear array and having an amino group on one terminus and a carboxyl group on the other terminus. Preferred glycol amino acid linking segments are compounds of the formula:
-[NH-(CH2-CH2-O-)j-CH2-C(=O)-]i
wherein j is 1 to 6; and i is 1 to 6. In one particularly preferred embodiment, j is 2 and i is 3.
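As a worked illustration of the linker formulas, the following Python sketch counts the atoms implied by the aminoalkylcarboxylic acid unit, -NH-(CH2)e-C(=O)-, and by the glycol amino acid linker, -[NH-(CH2-CH2-O-)j-CH2-C(=O)-]i, and also counts how many atoms lie along the glycol linker chain. The function names are hypothetical, and the counts follow only from reading the formulas literally.

```python
from collections import Counter


def aminoalkyl_unit_atoms(e: int) -> Counter:
    """Atom counts for one -NH-(CH2)e-C(=O)- unit, with e from 1 to 15."""
    if not 1 <= e <= 15:
        raise ValueError("e must be from 1 to 15")
    # NH contributes N and one H; each CH2 adds C and two H; C(=O) adds C and O.
    return Counter({"C": e + 1, "H": 2 * e + 1, "N": 1, "O": 1})


def glycol_linker_atoms(j: int, i: int) -> Counter:
    """Atom counts for -[NH-(CH2-CH2-O-)j-CH2-C(=O)-]i, with j and i from 1 to 6."""
    if not (1 <= j <= 6 and 1 <= i <= 6):
        raise ValueError("j and i must each be from 1 to 6")
    unit = Counter({"C": 2 * j + 2, "H": 4 * j + 3, "N": 1, "O": j + 1})
    total = Counter()
    for _ in range(i):
        total += unit
    return total


def glycol_linker_chain_length(j: int, i: int) -> int:
    """Number of atoms along the chain: N, then j (C, C, O) groups, CH2 and C(=O), per unit."""
    return i * (3 * j + 3)


if __name__ == "__main__":
    print(dict(aminoalkyl_unit_atoms(5)))      # e = 5
    print(dict(glycol_linker_atoms(2, 3)))     # j = 2, i = 3
    print(glycol_linker_chain_length(2, 3))
```

For the particularly preferred case of j = 2 and i = 3, each repeat contributes nine chain atoms, so the three repeats together span 27 atoms between the two peptide nucleic acid segments.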
In a further embodiment of the invention, both of the ends of two respective peptide nucleic acid segments are joined together via two of the linking segments to form a cyclic structure.
In a further embodiment of the invention, the linking segment connects a terminal amine function on one of first and second peptide nucleic acid segments to a carboxyl function on the other of first and second peptide nucleic acid segments.
In certain preferred embodiments of the invention, the nucleobase sequence of the first peptide nucleic acid segment, in a direction from its amine terminus to its carboxyl terminus, is the same as the nucleobase sequence of the second peptide nucleic acid segment, in a direction from its carboxyl terminus to its amine terminus.
In other embodiments of the invention, at least a portion of the nucleobases of the first and second peptide nucleic acid segments are pyrimidine nucleobases. In a further embodiment of the invention, at least one of the pyrimidine nucleobases of one of the first or the second peptide nucleic acid segments comprises a C-pyrimidine heterocyclic base or an iso-pyrimidine heterocyclic base. In a further embodiment of the invention, a portion of the nucleobases that are pyrimidine nucleobases are located in contiguous homopyrimidine sequences.
Compounds of the invention also include multiple stranded structures having a nucleic acid strand, at least a portion of which forms a target nucleotide sequence, and a further strand, formed from first and second peptide nucleic acid segments that, in turn, are joined together via a linker. The sequence of the nucleobases of the first peptide nucleic acid segment is selected to be complementary to the target nucleotide sequence in the 5′ to 3′ direction of the target nucleotide sequence and the sequence of the nucleobases of the second peptide nucleic acid segment is selected to be complementary to the target nucleotide sequence in the 3′ to 5′ direction of the target nucleotide sequence.
In certain embodiments of the invention the nucleic acid strand is a single stranded DNA or RNA and in further embodiments of the invention the nucleic acid strand is a double stranded DNA.
In still a further embodiment of the invention one of the first or second peptide nucleic acid segments binds to the target nucleotide sequence utilizing Watson/Crick type hydrogen bonding and the other of the first or second peptide nucleic acid segments binds to the target nucleotide sequence utilizing Hoogsteen type hydrogen bonding. In a preferred embodiment, the one of the first or second peptide nucleic acid segments that binds to the target nucleotide sequence utilizing said Hoogsteen hydrogen bonding includes C-pyrimidine heterocyclic nucleobases or iso-pyrimidine heterocyclic nucleobases in at least one of the positions that are complementary to nucleobases in the target nucleotide sequence. In certain preferred embodiments the C-pyrimidine heterocyclic nucleobase or iso-pyrimidine heterocyclic nucleobase is selected from pseudo-isocytosine, iso-cytosine, pseudo-uracil or 5-bromouracil.
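The relationship between the two segments can be sketched in Python. The sketch assumes a homopurine target written 5′ to 3′, an anti-parallel Watson/Crick segment, and a parallel Hoogsteen segment in which every cytosine is replaced by pseudo-isocytosine. The one-letter label J for pseudo-isocytosine and all function names are assumptions made for illustration only.

```python
# Partner placed opposite each target purine in each segment.
# "J" is a hypothetical one-letter label for pseudo-isocytosine.
WATSON_CRICK_PARTNER = {"A": "T", "G": "C"}
HOOGSTEEN_PARTNER = {"A": "T", "G": "J"}
CYTOSINE_EQUIVALENTS = {"J": "C"}


def design_bis_pna_segments(target_5_to_3: str) -> dict:
    """Return both segment sequences, each written amino end to carboxyl end."""
    target = target_5_to_3.upper()
    if not target or set(target) - set("AG"):
        raise ValueError("this sketch only handles homopurine (A/G) targets")
    # Anti-parallel segment: amino end faces the 3' end of the target.
    watson_crick = "".join(WATSON_CRICK_PARTNER[b] for b in reversed(target))
    # Parallel segment: amino end faces the 5' end of the target.
    hoogsteen = "".join(HOOGSTEEN_PARTNER[b] for b in target)
    return {"watson_crick": watson_crick, "hoogsteen": hoogsteen}


def same_sequence_in_opposite_directions(segment_a: str, segment_b: str) -> bool:
    """True if segment_a read amino to carboxyl matches segment_b read carboxyl to amino.

    Pseudo-isocytosine (J) is treated as standing in for cytosine.
    """
    def normalize(sequence):
        return "".join(CYTOSINE_EQUIVALENTS.get(b, b) for b in sequence)

    return normalize(segment_a) == normalize(segment_b)[::-1]


if __name__ == "__main__":
    segments = design_bis_pna_segments("AAGAG")
    print(segments)
    print(same_sequence_in_opposite_directions(
        segments["watson_crick"], segments["hoogsteen"]))
```

Under these assumptions the two segments carry the same base sequence read in opposite directions, apart from the cytosine replacement in the Hoogsteen segment, which is consistent with the embodiment in which the first segment read from amine to carboxyl terminus matches the second read from carboxyl to amine terminus.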
Compounds of the invention also include a compound having a first segment of joined peptide nucleic acid units having a first sequence of nucleobases and a second segment of joined peptide nucleic acid units having a second sequence of nucleobases and a linker group linking the first and the second segments of peptide nucleic acid units. The first segment of peptide nucleic acid units extends from an amino end to a carboxyl end and the second segment of peptide nucleic acid units extends from an amino end to a carboxyl end, with the linker group linking the carboxyl end of the first segment of peptide nucleic acid units to the amino end of the second segment of peptide nucleic acid units.