This invention is directed to compounds that are not polynucleotides yet which bind to complementary DNA and RNA strands more strongly than the corresponding DNA. In particular, the invention concerns compounds wherein naturally-occurring nucleobases or other nucleobase-binding moieties are covalently bound to a polyamide backbone.
Oligodeoxyribonucleotides as long as 100 base pairs (bp) are routinely synthesized by solid phase methods using commercially available, fully automatic synthesis machines. The chemical synthesis of oligoribonucleotides, however, is far less routine. Oligoribonucleotides also are much less stable than oligodeoxyribonucleotides, a fact which has contributed to the more prevalent use of oligodeoxyribonucleotides in medical and biological research directed to, for example, gene therapy or the regulation of transcription or translation.
The function of a gene starts by transcription of its information to a messenger RNA (mRNA) which, by interaction with the ribosomal complex, directs the synthesis of a protein coded for by its sequence. The synthetic process is known as translation. Translation requires the presence of various co-factors and building blocks, the amino acids, and their transfer RNAs (tRNA), all of which are present in normal cells. Transcription initiation requires specific recognition of a promoter DNA sequence by the RNA-synthesizing enzyme, RNA polymerase. In many cases in prokaryotic cells, and probably in all cases in eukaryotic cells, this recognition is preceded by sequence-specific binding of a protein transcription factor to the promoter. Other proteins which bind to the promoter, but whose binding prohibits action of RNA polymerase, are known as repressors. Thus, gene activation typically is regulated positively by transcription factors and negatively by repressors.
Most conventional drugs function by interaction with and modulation of one or more targeted endogenous proteins, e.g., enzymes. Such drugs, however, typically are not specific for targeted proteins but interact with other proteins as well. Thus, a relatively large dose of drug must be used to effectively modulate a targeted protein. Typical daily doses of drugs are from 10⁻⁵ to 10⁻¹ millimoles per kilogram of body weight, or 10⁻³ to 10 millimoles for a 100 kilogram person. If this modulation instead could be effected by interaction with and inactivation of mRNA, a dramatic reduction in the necessary amount of drug could likely be achieved, along with a corresponding reduction in side effects. Further reductions could be effected if such interaction could be rendered site-specific. Given that a functioning gene continually produces mRNA, it would thus be even more advantageous if gene transcription could be arrested in its entirety.
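The whole-body dose range quoted above follows from the per-kilogram range by multiplication with the body weight. A minimal sketch of that arithmetic is given here for illustration only; the function name total_daily_dose_mmol and the variable names are hypothetical and are not part of the original disclosure.

```python
def total_daily_dose_mmol(dose_per_kg_mmol: float, body_weight_kg: float) -> float:
    """Scale a per-kilogram daily dose (mmol/kg) to a whole-body daily dose (mmol)."""
    return dose_per_kg_mmol * body_weight_kg


# Per-kilogram range stated in the text: 10^-5 to 10^-1 mmol per kg of body weight.
low_per_kg = 1e-5
high_per_kg = 1e-1

# Body weight used in the text's example.
body_weight_kg = 100.0

low_total = total_daily_dose_mmol(low_per_kg, body_weight_kg)
high_total = total_daily_dose_mmol(high_per_kg, body_weight_kg)

print(f"Whole-body daily dose: {low_total:.0e} to {high_total:.0f} mmol")
```

The two results correspond to the 10⁻³ and 10 millimole bounds stated for a 100 kilogram person.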
Oligodeoxynucleotides offer such opportunities. For example, synthetic oligodeoxynucleotides could be used as antisense probes to block and eventually lead to the breakdown of mRNA. Thus, synthetic DNA could suppress translation in vivo. It also may be possible to modulate the genome of an animal by, for example, triple helix formation using oligonucleotides or other DNA recognizing agents. However, there are a number of drawbacks associated with triple helix formation. For example, it can only be used for homopurine sequences and it requires unphysiologically high ionic strength and low pH.
Furthermore, unmodified oligonucleotides are impractical both in the antisense approach and in the triple helix approach because they have short in vivo half-lives, they are difficult to prepare in more than milligram quantities and, thus, are prohibitively costly, and they are poor cell membrane penetrators.
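The antisense approach discussed above rests on Watson/Crick complementarity: the probe is the reverse complement of the targeted mRNA segment. The following sketch illustrates only that sequence relationship. The function name antisense_dna and the example target sequence are hypothetical, and the sketch says nothing about stability, uptake or binding affinity.

```python
# Watson/Crick partner (as a DNA base) for each RNA base of the target.
DNA_PARTNER_OF_RNA_BASE = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(mrna_5_to_3: str) -> str:
    """Return the antisense oligodeoxynucleotide, written 5'->3', for an mRNA segment.

    The strands pair antiparallel, so the target is read backwards and each
    base is replaced by its Watson/Crick partner.
    """
    return "".join(DNA_PARTNER_OF_RNA_BASE[base] for base in reversed(mrna_5_to_3.upper()))


# Hypothetical 9-base mRNA target, written 5'->3'.
target_mrna = "AUGGCCAUU"
probe = antisense_dna(target_mrna)
print(probe)  # AATGGCCAT
```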
These problems have resulted in an extensive search for improvements and alternatives. For example, the problems arising in connection with double-stranded DNA (dsDNA) recognition through triple helix formation have been diminished by a clever "switch back" chemical linking whereby a sequence of polypurine on one strand is recognized, and by "switching back", a homopurine sequence on the other strand can be recognized. See, e.g., McCurdy, Moulds, and Froehler, Nucleosides, in press. Also, good helix formation has been obtained by using artificial bases, thereby improving binding conditions with regard to ionic strength and pH.
In order to improve half life as well as membrane penetration, a large number of variations in polynucleotide backbones has been undertaken, although so far not with desired results. These variations include the use of methylphosphonates, monothiophosphates, dithiophosphates, phosphoramidates, phosphate esters, bridged phosphoroamidates, bridged phosphorothioates, bridged methylenephosphonates, dephospho internucleotide analogs with siloxane bridges, carbonate bridges, carboxymethyl ester bridges, acetamide bridges, carbamate bridges, thioether, sulfoxy, sulfono bridges, various "plastic" DNAs, α-anomeric bridges, and borane derivatives.
International patent application WO 86/05518 broadly claims a polymeric composition effective to bind to a single-stranded polynucleotide containing a target sequence of bases. The composition is said to comprise non-homopolymeric, substantially stereoregular polymer molecules of the form:
R1    R2    R3           Rn
|     |     |            |
B  ~  B  ~  B  ~  . . .  B,
where:
(a) R1-Rn are recognition moieties selected from purine, purine-like, pyrimidine, and pyrimidine-like heterocycles effective to bind by Watson/Crick pairing to corresponding, in-sequence bases in the target sequence;
(b) n is such that the total number of Watson/Crick hydrogen bonds formed between a polymer molecule and target sequence is at least about 15;
(c) B~B are backbone moieties joined predominantly by chemically stable, substantially uncharged, predominantly achiral linkages;
(d) the backbone moiety length ranges from 5 to 7 atoms if the backbone moieties have a cyclic structure, and ranges from 4 to 6 atoms if the backbone moieties have an acyclic structure; and
(e) the backbone moieties support the recognition moieties at positions which allow Watson/Crick base pairing between the recognition moieties and the corresponding, in-sequence bases of the target sequence.
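Criterion (b) above can be illustrated by counting hydrogen bonds along a target sequence. The sketch below assumes the conventional Watson/Crick counts of two hydrogen bonds per A·T (or A·U) pair and three per G·C pair; those counts, the function names and the example sequences are illustrative assumptions and are not taken from WO 86/05518.

```python
# Assumed Watson/Crick hydrogen bonds contributed by each target base when paired.
HYDROGEN_BONDS_PER_BASE = {"A": 2, "T": 2, "U": 2, "G": 3, "C": 3}


def watson_crick_hydrogen_bonds(target: str) -> int:
    """Total hydrogen bonds formed if every base of the target is paired."""
    return sum(HYDROGEN_BONDS_PER_BASE[base] for base in target.upper())


def meets_bond_threshold(target: str, minimum_bonds: int = 15) -> bool:
    """Check the 'at least about 15' hydrogen bond criterion for a target."""
    return watson_crick_hydrogen_bonds(target) >= minimum_bonds


# Hypothetical targets: six A/T bases give 12 bonds, five G/C bases give 15.
for sequence in ("ATATAT", "GCGCG"):
    print(sequence, watson_crick_hydrogen_bonds(sequence), meets_bond_threshold(sequence))
```

Under these assumed counts, a polymer needs between five recognition moieties (all G or C) and eight (all A or T) to reach 15 hydrogen bonds.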
According to WO 86/05518, the recognition moieties are various natural nucleobases and nucleobase-analogs and the backbone moieties are either cyclic backbone moieties comprising furan or morpholine rings or acyclic backbone moieties of the following forms: 
where E is -CO- or -SO2-. The specification of the application provides general descriptions for the synthesis of subunits, for backbone coupling reactions, and for polymer assembly strategies. However, the specification provides no example wherein a claimed compound or structure is actually prepared. Although WO 86/05518 indicates that the claimed polymer compositions can bind target sequences and, as a result, have possible diagnostic and therapeutic applications, the application contains no data relating to the binding affinity of a claimed polymer.
International patent application WO 86/05519 claims diagnostic reagents and systems that comprise polymers described in WO 86/05518, but attached to a solid support. WO 86/05519 also provides no examples concerning actual preparation of a claimed diagnostic reagent, much less data showing the diagnostic efficiency of such a reagent.
International patent application WO 89/12060 claims various building blocks for synthesizing oligonucleotide analogs, as well as oligonucleotide analogs formed by joining such building blocks in a defined sequence. The building blocks may be either "rigid" (containing a ring) or "flexible" (lacking a ring). In both cases the building blocks contain a hydroxy group and a mercapto group, through which the building blocks are said to join to form oligonucleotide analogs. The linking moiety in the oligonucleotide analogs is selected from the group consisting of sulfide (-S-), sulfoxide (-SO-), and sulfone (-SO2-). WO 89/12060 provides a general description concerning synthesis of the building blocks and coupling reactions for the synthesis of oligonucleotide analogs, along with experimental examples describing the preparation of building blocks. However, the application provides no examples directed to the preparation of a claimed oligonucleotide analog and no data confirming the specific binding of an oligonucleotide analog to a target oligonucleotide.
Furthermore, oligonucleotides or their derivatives have been linked to intercalators in order to improve binding, to polylysine or other basic groups in order to improve binding both to double-stranded and single-stranded DNA, and to peptides in order to improve membrane penetration. However, such linking has not resulted in satisfactory binding for either double-stranded or single-stranded DNA. Other problems which resulted from, for example, methylphosphonates and monothiophosphates were the occurrence of chirality, insufficient synthetic yield or difficulties in performing solid phase assisted syntheses.
In most cases only a few of these modifications could be used. Even then, only short sequences (often only dimers) or monomers could be generated. Furthermore, the oligomers actually produced have rarely been shown to bind to DNA or RNA or have not been examined biologically.
The great majority of these backbone modifications led to decreased stability for hybrids formed between the modified oligonucleotide and its complementary native oligonucleotide, as assayed by measuring Tm values. Consequently, it is generally understood in the art that backbone modifications destabilize such hybrids, i.e., result in lower Tm values, and should be kept to a minimum.
It is one object of the present invention to provide compounds that bind ssDNA and RNA strands to form stable hybrids therewith.
It is a further object of the invention to provide compounds that bind ssDNA and RNA strands more strongly than the corresponding DNA.
It is another object to provide compounds wherein naturally-occurring nucleobases or other nucleobase-binding moieties are covalently bound to a peptide backbone.
It is yet another object to provide compounds other than RNA that can bind one strand of a double-stranded polynucleotide, thereby displacing the other strand.
It is still another object to provide therapeutic and prophylactic methods that employ such compounds.
The present invention provides a novel class of compounds, known as peptide nucleic acids (PNAs), that bind complementary ssDNA and RNA strands more strongly than a corresponding DNA. The compounds of the invention generally comprise ligands linked to a peptide backbone via an aza nitrogen. Representative ligands include either the four main naturally occurring DNA bases (i.e., thymine, cytosine, adenine or guanine) or other naturally occurring nucleobases (e.g., inosine, uracil, 5-methylcytosine or thiouracil) or artificial bases (e.g., bromothymine, azaadenines or azaguanines, etc.) attached to a peptide backbone through a suitable linker.
In certain preferred embodiments, the peptide nucleic acids of the invention have the general formula (I): 
wherein:
n is at least 2,
each of L1-Ln is independently selected from the group consisting of hydrogen, hydroxy, (C1-C4) alkanoyl, naturally occurring nucleobases, non-naturally occurring nucleobases, aromatic moieties, DNA intercalators, nucleobase-binding groups, heterocyclic moieties, and reporter ligands, at least one of L1-Ln being a naturally occurring nucleobase, a non-naturally occurring nucleobase, a DNA intercalator, or a nucleobase-binding group;
each of A1-An is a single bond, a methylene group or a group of formula (IIa) or (IIb): 
where:
X is O, S, Se, NR3, CH2 or C(CH3)2;
Y is a single bond, O, S or NR4;
each of p and q is zero or an integer from 1 to 5, the sum p+q being not more than 10;
each of r and s is zero or an integer from 1 to 5, the sum r+s being not more than 10;
each R1 and R2 is independently selected from the group consisting of hydrogen, (C1-C4)alkyl which may be hydroxy- or alkoxy- or alkylthio-substituted, hydroxy, alkoxy, alkylthio, amino and halogen; and
each R3 and R4 is independently selected from the group consisting of hydrogen, (C1-C4)alkyl, hydroxy- or alkoxy- or alkylthio-substituted (C1-C4)alkyl, hydroxy, alkoxy, alkylthio and amino;
each of B1-Bn is N or R3N+, where R3 is as defined above;
each of T1-Tn is CR6R7, CHR6CHR7 or CR6R7CH2, where R6 is hydrogen and R7 is selected from the group consisting of the side chains of naturally occurring alpha amino acids, or R6 and R7 are independently selected from the group consisting of hydrogen, (C2-C6)alkyl, aryl, aralkyl, heteroaryl, hydroxy, (C1-C6)alkoxy, (C1-C6)alkylthio, NR3R4 and SR5, where R3 and R4 are as defined above, and R5 is hydrogen, (C1-C6)alkyl, hydroxy-, alkoxy-, or alkylthio-substituted (C1-C6)alkyl, or R6 and R7 taken together complete an alicyclic or heterocyclic system;
each of D1-Dn is CR6R7, CH2CR6R7 or CHR6CHR7, where R6 and R7 are as defined above;
each of G1-Gn−1 is -NR3CO-, -NR3CS-, -NR3SO- or -NR3SO2-, in either orientation, where R3 is as defined above;
Q is -CO2H, -CONR′R″, -SO3H or -SO2NR′R″ or an activated derivative of -CO2H or -SO3H; and
I is -NHR′″R″″ or -NR′″C(O)R″″, where R′, R″, R′″ and R″″ are independently selected from the group consisting of hydrogen, alkyl, amino protecting groups, reporter ligands, intercalators, chelators, peptides, proteins, carbohydrates, lipids, steroids, oligonucleotides and soluble and non-soluble polymers.
The peptide nucleic acids of the invention differ from those disclosed in WO 86/05518 in that their recognition moieties are attached to an aza nitrogen atom in the backbone, rather than to an amide nitrogen atom, a hydrazine moiety or a carbon atom in the backbone.
Preferred peptide nucleic acids have general formula (III): 
wherein:
each L is independently selected from the group consisting of hydrogen, phenyl, heterocyclic moieties, naturally occurring nucleobases, and non-naturally occurring nucleobases;
each R7′ is independently selected from the group consisting of hydrogen and the side chains of naturally occurring alpha amino acids;
n is an integer from 1 to 60;
each of k, l and m is independently zero or an integer from 1 to 5;
Rh is OH, NH2 or -NHLysNH2; and
Ri is H or COCH3.
Particularly preferred are compounds having formula (III) wherein each L is independently selected from the group consisting of the nucleobases thymine (T), adenine (A), cytosine (C), guanine (G) and uracil (U), k and m are zero or 1, and n is an integer from 1 to 30, in particular from 4 to 20. An example of such a compound is provided in FIG. 1, which shows the structural similarity between such compounds and single-stranded DNA.
The peptide nucleic acids of the invention are synthesized by adaptation of standard peptide synthesis procedures, either in solution or on a solid phase. The synthons used are specially designed monomer amino acids or their activated derivatives, protected by standard protecting groups. The oligonucleotide analogs also can be synthesized by using the corresponding diacids and diamines.
Thus, the novel monomer synthons according to the invention are selected from the group consisting of amino acids, diacids and diamines having general formulae: 
wherein L, A, B, T and D are as defined above, except that any amino groups therein may be protected by amino protecting groups; E is COOH, CSOH, SOOH, SO2OH or an activated derivative thereof; and G is NHR3 or NPgR3, where R3 is as defined above and Pg is an amino protecting group.
Preferred monomer synthons according to the invention are amino acids having formula (VII): 
or amino-protected and/or acid terminal activated derivatives thereof, wherein L is selected from the group consisting of hydrogen, phenyl, heterocyclic moieties, naturally occurring nucleobases, non-naturally occurring nucleobases, and protected derivatives thereof; and R7′ is independently selected from the group consisting of hydrogen and the side chains of naturally occurring alpha amino acids. Especially preferred are such synthons having formula (VII) wherein R7′ is hydrogen and L is selected from the group consisting of the nucleobases thymine (T), adenine (A), cytosine (C), guanine (G) and uracil (U) and protected derivatives thereof.
Unexpectedly, these compounds also are able to recognize duplex DNA by displacing one strand, thereby presumably generating a double helix with the other one. Such recognition can take place with dsDNA sequences 5-60 base pairs long. Sequences between 10 and 20 bases are of interest since this is the range within which unique DNA sequences of prokaryotes and eukaryotes are found. Reagents which recognize 17-18 bases are of particular interest since this is the length of unique sequences in the human genome. The compounds of the invention also should be able to form triple helices with dsDNA.
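The sequence lengths mentioned above can be rationalized with a simple statistical estimate: a sequence of n bases is expected to be unique once the number of possible n-mers, 4^n, exceeds the number of positions at which it could occur. The sketch below is an illustration only. It assumes a random genome with equally frequent bases, counts both strands, and uses approximate genome sizes of 3 x 10^9 base pairs (human, haploid) and 4.6 x 10^6 base pairs (Escherichia coli); the function name and these figures are assumptions that do not appear in the original text.

```python
def minimum_unique_length(genome_size_bp: int, strands: int = 2) -> int:
    """Smallest n for which a given n-mer is expected to occur less than once.

    Model assumption: bases are random and equally frequent, so a particular
    n-mer is expected genome_size_bp * strands / 4**n times.
    """
    positions = genome_size_bp * strands
    n = 1
    while positions >= 4 ** n:
        n += 1
    return n


# Approximate genome sizes (assumed values, in base pairs).
HUMAN_GENOME_BP = 3_000_000_000
E_COLI_GENOME_BP = 4_600_000

print("human:", minimum_unique_length(HUMAN_GENOME_BP))
print("E. coli:", minimum_unique_length(E_COLI_GENOME_BP))
```

Under these assumptions the estimate gives 17 bases for the human genome and 12 bases for E. coli, in line with the 17-18 base and 10-20 base ranges stated above. Real genomes contain repeats and biased base composition, so the model gives only a lower bound.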
Whereas the improved binding of the compounds of the invention should render them efficient as antisense agents, it is expected that an extended range of related reagents may cause strand displacement, now that this surprising and unexpected new behavior of dsDNA has been discovered.
Thus, in one aspect, the present invention provides methods for inhibiting the expression of particular genes in the cells of an organism, comprising administering to said organism a reagent as defined above which binds specifically to sequences of said genes.
Further, the invention provides methods for inhibiting transcription and/or replication of particular genes or for inducing degradation of particular regions of double stranded DNA in cells of an organism by administering to said organism a reagent as defined above.
Still further, the invention provides methods for killing cells or virus by contacting said cells or virus with a reagent as defined above which binds specifically to sequences of the genome of said cells or virus.