This invention is directed to generally linear compounds or “strands” wherein naturally-occurring nucleobases or other nucleobase-binding moieties preferably are covalently bound to a polyamide backbone. In particular, the invention concerns compounds wherein two such strands coordinate through hydrogen bonds to form a DNA-like double strand.
The transcription and processing of genomic duplex DNA is controlled by generally proteinaceous transcription factors that recognize and bind to specific DNA sequences. One strategy for the control of gene expression is to add to a cell double-stranded DNA or double-stranded DNA-like structures that will bind to the desired factor in preference to or in competition with genomic DNA, thereby inhibiting processing of the DNA into a protein. This modulates the protein's action within the cell and can lead to beneficial effects on cellular function. Naturally occurring or unmodified oligonucleotides are impractical for such use because they have short in vivo half-lives and penetrate cell membranes poorly.
These problems have resulted in an extensive search for improvements and alternatives. In order to improve half-life as well as membrane penetration, a large number of variations in polynucleotide backbones have been explored. These variations include the use of methylphosphonates, phosphorothioates, phosphorodithioates, phosphoramidates, phosphate esters, bridged phosphoroamidates, bridged phosphorothioates, bridged methylenephosphonates, dephospho internucleotide analogs with siloxane bridges, carbonate bridges, carboxymethyl ester bridges, acetamide bridges, carbamate bridges, thioether, sulfoxy, sulfono bridges, various “plastic” DNAs, α-anomeric bridges, and borane derivatives. The great majority of these backbone modifications lead to decreased stability for hybrids formed between the modified oligonucleotide and its complementary native oligonucleotide, as assayed by measuring melting temperature (Tm) values.
Consequently, there remains a need in the art for stable compounds that can form double-stranded, helical structures mimicking double-stranded DNA.
It is one object of the present invention to provide compounds that mimic the double-helical structure of DNA.
It is a further object of the invention to provide compounds wherein linear, polymeric strands coordinate through hydrogen bonds to form double helices.
It is another object to provide compounds wherein naturally-occurring nucleobases or other nucleobase-binding moieties are covalently bound to a non-sugar-phosphate backbone.
It is yet another object to provide therapeutic, diagnostic, and prophylactic methods that employ such compounds.
The present invention provides a novel class of compounds, known as peptide nucleic acids (PNAs), that can coordinate with one another or with single-stranded DNA to form double-stranded (i.e., duplex) structures. The compounds include homopolymeric PNA strands and heteropolymeric PNA strands (e.g., DNA/PNA strands), which coordinate through hydrogen bonding to form helical structures. Duplex structures can be formed, for example, between two complementary PNA or PNA/DNA strands or between two complementary regions within a single such strand.
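As a rough illustration of the complementarity that underlies duplex formation, the following Python sketch tests whether two strands, written as base sequences, can pair along their full length, and whether a single strand contains two regions that could pair with each other. The function names, the restriction to A-T and G-C pairs, and the antiparallel alignment are assumptions made for illustration only; they are not taken from this description and say nothing about the backbone chemistry.

```python
# Illustrative sketch only. Assumes Watson-Crick pairing (A-T, G-C) and
# antiparallel alignment; names and conventions are hypothetical.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the sequence that would pair with `seq` when aligned antiparallel."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def can_form_duplex(strand_a: str, strand_b: str) -> bool:
    """True if the two strands are fully complementary when aligned antiparallel."""
    return strand_b.upper() == reverse_complement(strand_a)


def has_self_complementary_regions(strand: str, length: int) -> bool:
    """True if the strand contains two non-overlapping regions of `length`
    bases that are complementary to each other."""
    s = strand.upper()
    for i in range(len(s) - length + 1):
        target = reverse_complement(s[i:i + length])
        # Search only downstream of the first region so the two do not overlap.
        if s.find(target, i + length) != -1:
            return True
    return False
```

For example, `can_form_duplex("GATTACA", "TGTAATC")` compares the second strand with the reverse complement of the first, while `has_self_complementary_regions("GGGAAATTTTCCC", 3)` looks for a pair of three-base regions within one strand.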
In certain embodiments, each strand of the double-stranded compounds of the invention includes a sequence of ligands covalently bound by linking moieties, at least one of said linking moieties comprising an amide, thioamide, sulfinamide or sulfonamide linkage. The ligands on one strand hydrogen bond with ligands on the other strand and, together, assume a double helical structure. The compounds of the invention preferably comprise ligands linked to a polyamide backbone. Representative ligands include the four main naturally occurring DNA bases (i.e., thymine, cytosine, adenine or guanine), other naturally occurring nucleobases (e.g., inosine, uracil, 5-methylcytosine or thiouracil) or artificial bases (e.g., bromothymine, azaadenines or azaguanines, 5-propynylthymine, etc.) attached to a peptide backbone through a suitable linker. These ligands are linked to the polyamide backbone through aza nitrogen atoms or through amido and/or ureido tethers.
In certain preferred embodiments, the peptide nucleic acids of the invention have the general formula (I): 
wherein:
n is at least 2,
each of L1-Ln is independently selected from the group consisting of hydrogen, hydroxy, (C1-C4)alkanoyl, naturally occurring nucleobases, non-naturally occurring nucleobases, aromatic moieties, DNA intercalators, nucleobase-binding groups, heterocyclic moieties, and reporter ligands, at least one of L1-Ln being a naturally occurring nucleobase, a non-naturally occurring nucleobase, a DNA intercalator, or a nucleobase-binding group;
each of C1-Cn is (CR6R7)y where R6 is hydrogen and R7 is selected from the group consisting of the side chains of naturally occurring alpha amino acids, or R6 and R7 are independently selected from the group consisting of hydrogen, (C2-C6)alkyl, aryl, aralkyl, heteroaryl, hydroxy, (C1-C6)alkoxy, (C1-C6)alkylthio, NR3R4 and SR5, where R3 and R4 are as defined above, and R5 is hydrogen, (C1-C6)alkyl, hydroxy-, alkoxy-, or alkylthio-substituted (C1-C6)alkyl, or R6 and R7 taken together complete an alicyclic or heterocyclic system;
each of D1-Dn is (CR6R7) where R6 and R7 are as defined above;
each of y and z is zero or an integer from 1 to 10, the sum y+z being greater than 2 but not more than 10;
each of G1-Gn−1 is -NR3CO-, -NR3CS-, -NR3SO- or -NR3SO2-, in either orientation, where R3 is as defined above;
each pair of A1-An and B1-Bn are selected such that:
(a) A is a group of formula (IIa), (IIb) or (IIc) and B is N or R3N+; or
(b) A is a group of formula (IId) and B is CH; 
where:
X is O, S, Se, NR3, CH2 or C(CH3)2;
Y is a single bond, O, S or NR4;
each of p and q is zero or an integer from 1 to 5, the sum p+q being not more than 10;
each of r and s is zero or an integer from 1 to 5, the sum r+s being not more than 10;
each R1 and R2 is independently selected from the group consisting of hydrogen, (C1-C4)alkyl which may be hydroxy- or alkoxy- or alkylthio-substituted, hydroxy, alkoxy, alkylthio, amino and halogen;
Q is -CO2H, -CONR′R″, -SO3H or -SO2NR′R″ or an activated derivative of -CO2H or -SO3H; and
I is -NHR′″R″″ or -NR′″C(O)R″″, where R′, R″, R′″ and R″″ are independently selected from the group consisting of hydrogen, alkyl, amino protecting groups, reporter ligands, intercalators, chelators, peptides, proteins, carbohydrates, lipids, steroids, nucleosides, nucleotides, nucleotide diphosphates, nucleotide triphosphates, oligonucleotides, oligonucleosides and soluble and non-soluble polymers.
In certain embodiments, at least one A is a group of formula (IIc) and B is N or R3N+. In other embodiments, A is a group of formula (IIa) or (IIb), B is N or R3N+, and at least one of y or z is not 1 or 2.
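The integer limits recited for formula (I) can be collected into a simple consistency check. The Python sketch below is illustrative only: the class `UnitIndices` and the function `check_indices` are hypothetical names, and the check covers only the numeric ranges stated above (n, y, z, p, q, r and s), not the chemical identity of any group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitIndices:
    """Hypothetical container for the integer indices of one backbone unit."""
    y: int
    z: int
    p: int
    q: int
    r: int
    s: int


def check_indices(n: int, unit: UnitIndices) -> list:
    """Return a list of violated numeric limits (empty if all are satisfied)."""
    problems = []
    if n < 2:
        problems.append("n must be at least 2")
    # Individual ranges: y and z are 0-10; p, q, r and s are 0-5.
    limits = (("y", 10), ("z", 10), ("p", 5), ("q", 5), ("r", 5), ("s", 5))
    for name, upper in limits:
        value = getattr(unit, name)
        if not 0 <= value <= upper:
            problems.append(f"{name} must be between 0 and {upper}")
    # Sum constraints as recited for formula (I).
    if not 2 < unit.y + unit.z <= 10:
        problems.append("y + z must be greater than 2 and at most 10")
    if unit.p + unit.q > 10:
        problems.append("p + q must not exceed 10")
    if unit.r + unit.s > 10:
        problems.append("r + s must not exceed 10")
    return problems
```

Calling `check_indices(2, UnitIndices(y=2, z=1, p=1, q=0, r=0, s=1))` returns an empty list, whereas a unit with y = z = 1 is reported because the sum y+z does not exceed 2.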
Preferred peptide nucleic acids have general formula (IIIa)-(IIIc): 
wherein:
each L is independently selected from the group consisting of hydrogen, phenyl, heterocyclic moieties, naturally occurring nucleobases, and non-naturally occurring nucleobases;
each R7′ is independently selected from the group consisting of hydrogen and the side chains of naturally occurring alpha amino acids;
n is an integer from 1 to 60;
each of k, l, and m is independently zero or an integer from 1 to 5;
p is zero or 1;
Rh is OH, NH2 or -NHLysNH2; and
Ri is H or COCH3.
Particularly preferred are compounds having formula (IIIa)-(IIIc) wherein each L is independently selected from the group consisting of the nucleobases thymine (T), adenine (A), cytosine (C), guanine (G) and uracil (U), k and m are zero or 1, and n is an integer from 1 to 30, in particular from 4 to 20.
The peptide nucleic acids of the invention are synthesized by adaptation of standard peptide synthesis procedures, either in solution or on a solid phase. The synthons used are monomer amino acids or their activated derivatives, protected by standard protecting groups. The PNAs also can be synthesized by using the corresponding diacids and diamines.
Thus, the novel monomer synthons according to the invention are selected from the group consisting of amino acids, diacids and diamines having general formulae: 
wherein L, A, B, C and D are as defined above, except that any amino groups therein may be protected by amino protecting groups; E is COOH, CSOH, SOOH, SO2OH or an activated derivative thereof; and F is NHR3 or NPgR3, where R3 is as defined above and Pg is an amino protecting group.
Preferred monomer synthons according to the invention have formula (VIIIa)-(VIIIc): 
or amino-protected and/or acid terminal activated derivatives thereof, wherein L is selected from the group consisting of hydrogen, phenyl, heterocyclic moieties, naturally occurring nucleobases, and non-naturally occurring nucleobases; and R7′ is selected from the group consisting of hydrogen and the side chains of naturally occurring alpha amino acids.
These compounds are able to recognize one another to produce double helices. Such recognition can span sequences 5-60 base pairs long. Sequences between 10 and 20 bases are of interest since this is the range within which unique DNA sequences of prokaryotes and eukaryotes are found. Sequences of 17-18 bases are of particular interest since this is the length of unique sequences in the human genome.
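The link between sequence length and uniqueness can be illustrated with a back-of-the-envelope estimate. The Python sketch below assumes a random sequence with four equally frequent bases, counts both strands of the genome as possible match sites, and takes a haploid human genome of roughly 3 × 10^9 base pairs; all three are simplifying assumptions that are not stated in this description, and the function names are hypothetical.

```python
def expected_occurrences(n: int, genome_size: int, strands: int = 2) -> float:
    """Expected number of chance matches of one given n-base sequence in a
    random genome of `genome_size` base pairs with equally frequent bases."""
    return strands * genome_size / 4 ** n


def shortest_unique_length(genome_size: int, strands: int = 2) -> int:
    """Smallest n for which fewer than one chance match is expected."""
    n = 1
    while expected_occurrences(n, genome_size, strands) >= 1:
        n += 1
    return n


# Assumed approximate haploid genome size; not taken from the text.
HUMAN_GENOME_BP = 3_000_000_000

if __name__ == "__main__":
    print(shortest_unique_length(HUMAN_GENOME_BP))
```

Under these assumptions the estimate comes out at 17 bases, in line with the 17-18 base figure given above; real genomes contain repeats and biased base composition, so the number is only indicative.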
Thus, in one aspect, the present invention provides methods for modulating the activity of a transcription factor in a cell, comprising the steps of forming a PNA-containing double strand that binds the transcription factor and introducing the double strand into the cell.
Further, the invention provides methods for modulating the activity of a protein in a cell, comprising the steps of forming a PNA-containing double strand that binds to or suppresses expression of the protein and introducing the double strand into the cell.
The PNA duplex structures of the invention mimic dsDNA and can be used in diagnostics, therapeutics and as research reagents and kits. They can be used in pharmaceutical compositions by including a suitable pharmaceutically acceptable diluent or carrier.