Reagents that selectively bind to DNA or RNA are of significant interest in molecular biology and medicinal chemistry, as they may be developed into gene-targeted drugs for diagnostic and therapeutic applications and may be used as tools for sequence-specific modification of DNA. To date, research directed at identifying such reagents has focused primarily upon development of various oligonucleotides and their close analogs having modified backbones, such as phosphorothioate or methyl phosphonate backbones instead of the natural phosphodiester backbone. These reagents, however, have been found to have serious shortcomings, especially with respect to stability against biological degradation, solubility, cellular uptake properties and ease of synthesis. For these reasons, alternative concepts for oligonucleotide mimics have been attracting interest.
Peptide nucleic acids (PNAs) are a recently developed class of oligonucleotide mimics wherein the entire deoxyribose phosphate backbone has been replaced by a chemically different, structurally homomorphous backbone composed of (2-aminoethyl)glycine units. Despite this dramatic change in chemical makeup, PNAs recognize complementary DNA and RNA by Watson-Crick base pairing. Furthermore, PNAs have been shown to have numerous advantages over DNA and RNA oligomers. For example, PNAs lack 3′ to 5′ polarity and thus can bind in either a parallel or an antiparallel orientation to DNA or RNA (Egholm, M. et al., Nature 365:566, 1993). It has been demonstrated that PNAs can bind double-stranded DNA by invading the DNA duplex and displacing one strand to form a stable D-loop structure (Peffer et al., Proc. Natl. Acad. Sci. USA 90:10648, 1993). A further advantage of PNAs is that they are less susceptible to enzymatic degradation (Demidov et al. Biochem. Pharmacol. 48:1310, 1994) and bind RNA with higher affinity than analogous DNA oligomers (Norton et al. Nature Biotechnology 14:615, 1996). Quite advantageously, selective hybridization of PNA to DNA is less tolerant of base pair mismatches than DNA-DNA hybridization. For example, a single base mismatch within a 16 bp PNA-DNA duplex can reduce the Tm by up to 15° C., compared to 10° C. in the case of a 16 bp DNA-DNA duplex (Egholm, M. et al. Nature 365:566, 1993). Finally, in at least one example, a PNA molecule has been shown capable of mimicking a transcription factor and acting as a promoter, thus demonstrating the potential use of PNAs as gene-specific activating agents (Mollegaard et al. Proc. Natl. Acad. Sci. USA 91:3892, 1994).
The success of an oligonucleotide analog as an antisense drug requires that the oligonucleotide be taken up by cells in large enough quantities to reach its target at a concentration sufficient to cause the desired effect. Until recently, PNAs have shown low phospholipid membrane permeability (Wittung et al. FEBS Letters 365:27, 1995) and have been reported to be taken up by cells very poorly (Hanvey et al. Science 258:1481, 1992; Nielsen et al. Bioconjugate Chem. 5:3, 1994; Bonham et al. Nucleic Acids Res. 23:1197, 1995), initially suggesting that their potential use as anti-gene and anti-sense agents would be quite limited.
Strategies to improve the cellular uptake of PNAs by conjugating the PNA sequence to a carrier molecule have met with some limited success (Basu et al. Bioconjugate Chem. 8:481, 1997). Conjugation of PNA molecules to receptor ligand molecules has increased cellular uptake of the PNA; however, the ability of these receptor ligand-conjugated PNA oligomers to influence biological activity once inside the target cells remains unproven. Further, using such a conjugation strategy permits the PNA oligomers to enter only those cells expressing the particular targeted receptor. Thus, an appropriate ligand molecule would have to be designed for each cell type of interest.
However, recently it has been discovered that unconjugated (also known as “naked”) PNA oligomers administered extracellularly can both cross cell membranes (Gray, G. D. Biochem. Pharmacol. 53:1465, 1997) and elicit a sequence-specific biological response in living cells (Richelson, E. FEBS Letters 421:280, 1998). Thus, PNAs possess the following characteristics suggesting they are well suited as therapeutic and diagnostic candidates: cell permeability in vivo; higher specificity and stronger binding to their complementary DNA or RNA than oligonucleotides or their analogs; resistance to enzymes such as nucleases and proteases, and thus a long biological half-life; chemical stability over a wide pH range; no action as a primer; and an ability to act as a gene promoter.
Improvements in genomic research have increased the rate of generation of information on the identity, structure and function of a number of human genes, thereby producing a diverse group of novel molecular targets for therapeutic and diagnostic applications. However, gene sequencing and characterization is still a slow and often arduous process, as evidenced by the fact that, to date, only a fraction of the entire human genome has been sequenced. The same advantageous binding and chemical stability properties that make PNAs useful as therapeutics and diagnostics also suggest such compounds will be useful in determining the sequence, structure and/or function of DNA and RNA.
In addition to completely characterizing a gene, the tasks of unraveling the details of the interactions of the gene with its DNA binding proteins and determining the mechanisms whereby such proteins mediate gene expression, replication and transduction of the gene require a great deal of time and effort. Further, understanding the genetic malfunctions of dysfunctional genes that cause the many complex genetic disorders found in man still requires extensive research. Thus, here too, PNAs can be useful.
While PNAs appear to be particularly well-suited for use as diagnostics, therapeutics and/or research tools, identification of appropriate PNAs for a specific purpose can be difficult, time-consuming and expensive. For example, identifying which region of a gene should be targeted in order to provide a desired effect, such as blocking transcription thereof, or which region, if any, may be activated to promote transcription thereof, generally requires sequencing most, if not all, of the gene and then testing various PNA fragments complementary thereto.
Recently, combinatorial libraries of random-sequence oligonucleotides, polypeptides and/or synthetic oligomers have been employed to facilitate the isolation and identification of compounds capable of producing a desired biological effect or useful as diagnostics. Compounds so identified may mimic or block natural ligands, may interfere with the natural interactions of the target molecule or may simply be useful as tools for designing and developing other molecules with more desirable properties.
Combinatorial libraries useful in this general application may be formed by a variety of solution-phase or solid-phase methods in which mixtures of different subunits are added in a stepwise manner to growing oligomers, until a desired oligomer size is reached. Alternatively, the library may be formed by solid-phase synthetic methods in which beads containing the different-sequence oligomers that form the library are alternately mixed and separated, with one of a selected number of subunits being added to each group of separated beads at each step. An advantage of this method is that each bead contains only one oligomer species, allowing the beads themselves to be used for oligomer screening (Furka, et al., Int. J. Pept. Protein Res. 37:487-493 (1991); Sebestyen, et al., Bioorg. Med. Chem. Lett. 3:413-418 (1993)).
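The mix-and-separate bead method described above can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: the function name `split_and_mix`, the four-letter subunit set, the bead count and the round-robin division of the pooled beads are all hypothetical choices made for illustration and are not taken from the cited references. It shows why each bead ends up carrying a single oligomer species.

```python
import random


def split_and_mix(subunits, steps, n_beads, seed=0):
    """Simulate a mix-and-separate (split-and-pool) bead synthesis.

    Each bead is modeled as one string: the single oligomer species it carries.
    All names and parameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    beads = [""] * n_beads  # every bead starts with no subunits attached
    k = len(subunits)
    for _ in range(steps):
        rng.shuffle(beads)  # "mix": pool all beads together
        # "separate": divide the pool into one group per subunit
        groups = [beads[i::k] for i in range(k)]
        # couple exactly one subunit to every bead in its group
        beads = [
            oligomer + unit
            for unit, group in zip(subunits, groups)
            for oligomer in group
        ]
    return beads


if __name__ == "__main__":
    library = split_and_mix("ACGT", steps=3, n_beads=4096)
    print(len(library), "beads,", len(set(library)), "distinct sequences")
```

Because every bead receives exactly one subunit per step, a bead never carries a mixture of sequences, although many beads may carry the same sequence and, with few beads, some sequences may be absent from the library.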
Still another approach that has been proposed involves the synthesis of a combinatorial library on spatially segregated arrays (see, Fodor, et al., Science, 251:767-773, 1991). This approach has generally been limited in the number of different library sequences that can be generated.
Because the chance of finding useful ligands increases with the size of the combinatorial library, it is desirable to generate libraries composed of large numbers of different-sequence oligomers. For example, in the case of oligonucleotides or oligonucleotide mimics, such as PNAs, a library having a 4-base variability and 8 oligomer residue positions (octamer) must contain 4^8 (65,536) different sequences to be a complete (universal) library. In the case of a 10 oligomer residue position (decamer) PNA or oligonucleotide universal library, 4^10 (1,048,576) different sequences must be synthesized.
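The library sizes quoted above follow from raising the number of possible bases to the power of the number of residue positions. The short sketch below reproduces that arithmetic; the function name `universal_library_size` is a hypothetical label used only for this illustration.

```python
def universal_library_size(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences in a complete (universal) library.

    Each of the `length` positions can independently hold any of
    `alphabet_size` bases, so the count is alphabet_size ** length.
    """
    return alphabet_size ** length


if __name__ == "__main__":
    for n in (8, 10):
        print(f"{n}-mer: {universal_library_size(n):,} sequences")
```

Each added residue position multiplies the library size by four, which is why moving from an octamer to a decamer raises the number of species sixteen-fold.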
Because each different-sequence species in a large number library may be present in small amounts, one of the challenges in the combinatorial library screening procedure is isolating and determining the sequence(s) of species that have the desired binding or other selected properties. Thus, not only must the library be universal but the method(s) selected for screening that library must be tailored to distinguish active from non-active species, considering the small amount of each species that is available.
Current methodologies for the synthesis of peptide nucleic acids involve the stepwise addition of suitably protected PNA monomers via one of two standard synthesis protocols. This work has been described in detail in a number of recent papers, including: Dueholm, et al., J. Org. Chem., 59:5767 (1994); Thomson, et al., Tetrahedron, 51:6179 (1995); Will, et al. Tetrahedron, 51:12069 (1995); Breipohl, et al., Bioorg. Med. Chem. Lett. 6:665 (1996); Koch, et al., J. Peptide Res., 49:80 (1997); Jordan, S., Bioorg. Med. Chem. Lett., 7:681 (1997); Breipohl, et al., Tetrahedron, 53:14671 (1997). These protocols are based upon the particular protecting group strategy that is used, i.e. either the Fmoc or the Boc protecting group. The Boc protecting group strategy requires the use of harsher chemicals and is described in detail in the aforementioned literature. When PNAs are synthesized using the Fmoc protecting group, typically a solid-phase synthesis resin, for example, a paramethylbenzhydrylamine resin, is used, and the first Fmoc-protected monomer is reacted with the resin using an activating agent, such as HATU, in the presence of a base, such as diisopropylethylamine, and lutidine. After coupling, the reaction mixture is filtered or drained, and the resin is washed. Then unreacted amino groups are generally capped with acetic anhydride to prevent further reaction at those sites. After the resin is washed again, the resin-bound monomer is deprotected for the next step of the synthesis by removing the Fmoc protecting group using piperidine or the like. Thereafter, this cycle is simply repeated until an oligomer chain of the desired length is obtained, and then cleavage from the resin is effected.
In summary, what is needed are improved, more efficient methods for synthesizing PNAs which can be used to construct such libraries, as well as a system of improved techniques and tools for the rapid identification of agents, particularly PNAs, which would be useful in characterizing genes and discovering potential therapeutics and/or diagnostics. In addition to improved PNA synthesis, it would be desirable to be able to efficiently construct libraries which would have the advantages of specificity associated with a large species library, such as an octameric or decameric library, without having to synthesize tens of thousands or millions of different-sequence species. It would also be desirable to have rapid screening methods that may quickly identify “best candidates” from the library for further testing and/or development, and it would be desirable to have a means for determining optimal PNAs, both with respect to sequence base identity and length, for use as therapeutics, diagnostics and/or research tools.
The present invention addresses the foregoing and other needs by providing a novel and efficient method for synthesizing activated, protected cyclic intermediates that provide PNA monomers. In this novel synthesis, a solution of a haloacetic acid equivalent is added to a solution of ethylene diamine and the mixture is heated to form a cyclic piperazinone. A further reaction is then carried out to add a desired nucleotide base to create cyclic piperazinone intermediates via an amide bond formed with the secondary amino moiety in the piperazinone ring. The remaining amido group in these reaction products is protected with a suitable protecting group to create a piperazinone intermediate that is readily hydrolyzable to a peptide nucleic acid monomer and that is readily reactive to form desired derivatives or dimers or other oligomers. The conditions used may also add protection to an unprotected moiety on the base. This novel method provides a rapid, cost-effective and simple approach to the preparation of PNAs using chemical materials that are inexpensive and/or easily prepared.
The invention also provides an overall system involving three key integrated components which permits the rapid identification and/or design of PNAs capable of site-specific recognition of target nucleotide sequences and therefore useful as therapeutics, diagnostics and/or gene characterization tools. One component of this system is a universal PNA library that may be easily and efficiently synthesized and that most preferably has the screening ability of a large library, such as an octameric library, yet does not require synthesis of a large number of individual species. Another component of the overall system is a high throughput screening system, termed the Universal PNA Identification (UPID™) System, that includes a number of assays designed to provide information on the binding activities of PNAs of different sequences relative to the target nucleotide sequence. A third component is a software system especially designed to provide rapid analysis of the data collected from the UPID™ System and to identify the sequence base identities and lengths of optimal PNA oligomers therefrom.
In one aspect, such a universal library would incorporate universal nucleotide bases into each of the species in order to increase its screening capability without the need for an unmanageable number of individual species. Such a universal library could then be subjected to such an improved high throughput screening process in order to identify novel regulators acting by specific modulation of a selected gene, such as one implicated in a human disease. Optimization of these novel regulators would be guided by such a software system, so as to be capable of predicting the most appropriate therapeutic and/or diagnostic candidates, both in sequence length and sequence base identity. As a result, the structural/functional characterization of newly discovered genes should be enabled, as well as the identification of genomic mutations, such as single nucleotide polymorphisms, in either genomic DNA or PCR-amplified DNA, which would permit genetic diagnosis of disease states as well as the rapid screening of at-risk populations.