Cancers arise due to mutations or dysregulation of genes involved in DNA replication and repair, cell cycle control, anchorage independent growth, angiogenesis, apoptosis, tissue invasion, and metastasis (Hanahan, D. et al., Cell 100(1): 57-70 (2000)). These processes are controlled by networks of genes in the p53, cell cycle, apoptosis, Wnt signaling, RPTK signaling, and TGF-beta signaling pathways. Such genes and their protein products are the targets of many current and developing therapies.
Signaling pathways are used by cells to generate biological responses to external or internal stimuli. A few thousand gene products control both ontogeny/development of higher organisms and sophisticated behavior by their many different cell types. These gene products work in different combinations to achieve their goals, and do so through protein-protein interactions. Such proteins have evolved an architecture of modular protein domains that recognize and/or modify certain motifs. For example, different tyrosine kinases (such as Abl) will add phosphate groups to specific tyrosines embedded in particular peptide sequences, while other enzymes (such as PTEN) act as phosphatases to remove certain signals. Proteins and other macromolecules may also be modified through methylation, acetylation, sumoylation, and ubiquitination, and these signals in turn are recognized by specific domains that activate the next step in the pathway. Such pathways usually are initiated through signals to receptors on the cell surface, which move to intracellular protein interactions and often lead to signaling through transcription factor interactions that regulate gene transcription. For example, in the Wnt pathway, Wnt interacts with the Frizzled receptor, signaling through Disheveled, which inhibits the Axin-APC-GSK3 complex, which binds to beta-catenin to inhibit the combination of beta-catenin with TCF4, translocation of this complex into the nucleus, and activation of Myc, Cyclin D, and other oncogenic protein transcription (Polakis, P. et al., Genes Dev 14(15):1837-1851 (2000); Nelson, W. J. et al., Science 303(5663): 1483-1487 (2004)). Signaling may also proceed from the nucleus to secreted factors such as chemokines and cytokines (Charo, I. F. et al., N Engl J Med 354(6):610-621 (2006)). Protein-protein and protein-nucleic acid recognition often work through protein interaction domains, such as the SH2, SH3, and PDZ domains.
Currently, there are over 75 such motifs reported in the literature (Hunter et al., Cell 100:113-127 (2000); Pawson et al., Genes & Development 14:1027-1047 (2000)). These protein-interaction domains offer a rich opportunity for developing targeted therapies.
Other macromolecular interactions that can serve as potential targets include protein-nucleic acid interactions, protein-carbohydrate interactions, and protein-lipid interactions. Protein-nucleic acid interactions of interest are the interactions between ribosomal proteins and nucleic acids involved in protein synthesis, especially protein synthesis in bacterial pathogens (Franceschi F et al., Biochem Pharmacol, 71 (7): 1016-1025 (2006)). Interactions between transcription factors and nucleic acid sequences, such as those in promoter regions, may also be targets for therapies (Gniazdowski M, et al., Curr Med. Chem., 10(11):909-24 (2003)).
Lectins and other carbohydrate binding proteins are involved in many cellular processes, including trafficking and clearing of glycoproteins, cell adhesion, glycosylation, immune response, apoptosis, and tumorigenesis. Sugars generally bind to proteins weakly in shallow grooves close to the surface of the protein, with binding affinities in the mM to μM range. The sugar binding sites on proteins that are essential for microorganism pathogenesis may serve as targets for therapy (Ziolkowska N et al., Structure 14:1127-1135 (2006)).
Protein-lipid interactions are most common in membrane proteins where the protein function is directly shaped by interactions with membrane lipids. These interactions are key components in sensory and signaling pathways (Phillips R et al., Nature 459:379-385 (2009)) and may serve as therapeutic targets.
Cancer therapies may be divided into two classical groups: (i) small molecule drugs such as Gleevec that bind into a compact pocket, and (ii) antibody therapeutics such as Herceptin, which binds and inhibits the HER-2/neu member of the epidermal growth factor receptor (EGFR) family. Antibody and protein therapeutics work by binding over an extended area of the target protein. Antibodies fight cancers by inducing apoptosis, interfering with ligand-receptor interactions, or preventing expression of proteins required for tumor growth (Mehren et al., Ann Rev. Med. 54:343-69 (2003)). Additional successful cancer antibody therapeutics include Rituximab, an anti-CD20 antibody; Erbitux (cetuximab), targeted to EGFR; and Avastin (bevacizumab), which interferes with vascular endothelial growth factor (VEGF) binding to its receptor (Mehren et al., Ann Rev. Med. 54:343-69 (2003)). Except for the skin rash associated with EGFR antibodies (which ironically correlates with efficacy), antibody therapies are generally well tolerated and do not have the side-effects associated with traditional chemotherapy.
Antibodies achieve their extraordinary specificity through the diversity generated in their complementarity-determining regions (“CDRs”). An IgG antibody binding surface consists of three CDRs from the variable heavy chain paired with three CDRs from the variable light chain domain. Each CDR consists of a loop of around a dozen amino acid residues, whose structure binds to the target surface with nanomolar affinity (Laune et al., J. Biol. Chem. 272:30937-30944 (1997); Monnet et al., J. Biol. Chem. 274:3789-3796 (1999)). Thus, antibodies achieve their specificity by combining multiple weak interactions across a generally flat surface of approximately 1200-3000 Å². Monoclonal antibodies may be readily generated to most proteins, and artificial antibodies may be screened for using in vitro phage or bacterial systems (Mehren et al., Ann Rev. Med. 54:343-69 (2003)). Mouse monoclonal antibodies may be “humanized” to reduce development of undesired human antimouse antibodies. Limitations of using monoclonal antibodies include production of anti-idiotypic antibodies, disordered tumor vasculature, increased hydrostatic pressure within the tumor, and heterogeneity of surface antigen within tumors. Due to these barriers, it takes 2 days for an IgG antibody to travel 1 mm and 7-8 months to travel 1 cm into a tumor (Mehren et al., Ann Rev. Med. 54:343-69 (2003)). Smaller variations of the IgG motifs have been engineered, including scFv and Affibodies (Eliasson, M. et al., J Immunol 142(2):575-581 (1989); Gunneriusson, E. et al., J Bacteriol 178(5): 1341-1346 (1996); Nord, K. et al., Nat Biotechnol 15(8):772-777 (1997)), and these have improved tumor penetration, cutting penetration time by about half.
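The two penetration times quoted above are roughly what one would expect if transport into the tumor were purely diffusive, in which case travel time grows with the square of the distance. The following is a minimal sketch under that assumption only; the cited reference reports the times but the quadratic model, the function name, and the 30.4-day month are illustrative choices made here.

```python
def scaled_time_days(t_ref_days, x_ref_mm, x_mm):
    """Scale a reference travel time to a new distance.

    Assumes purely diffusive transport, so time ~ distance squared.
    This model is an assumption for illustration, not taken from the source.
    """
    return t_ref_days * (x_mm / x_ref_mm) ** 2


# Reference point from the text: 2 days to travel 1 mm.
days_for_1_cm = scaled_time_days(2.0, 1.0, 10.0)
months_for_1_cm = days_for_1_cm / 30.4  # average month length, assumed

print(days_for_1_cm)              # 200.0
print(round(months_for_1_cm, 1))  # about 6.6 months
```

Under this simple model, a tenfold increase in distance gives a hundredfold increase in time, which lands in the same range as the 7-8 months quoted for 1 cm.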
Antibodies can achieve tighter binding and higher specificity than any artificially synthesized therapy. Nevertheless, antibody therapies are limited to interfering with protein-protein interactions or protein receptor activity that are on the surface of tumors or circulating targets, cannot be ingested orally, and are not able to use their extraordinary specificity to inhibit intracellular protein signaling.
On the other end of the spectrum are small molecule drugs. These have the advantages of being orally active, being small enough (usually with a molecular weight below 750) to diffuse across cell membranes, and binding tightly into compact binding pockets used by all enzymes to bind their substrates (or interfering with macromolecular machinery used in cellular processes) (Landry, Y., et al., Fundam Clin Pharmacol 22(1):1-18 (2008); Duarte, C. D., et al., Mini Rev Med Chem 7(11):1108-1119 (2007); Amyes, T. L., et al., ACS Chem Biol 2(11):711-714 (2007)). Recently, the field of combinatorial chemistry has greatly improved the ability of chemists to identify lead molecules that bind and inhibit specific protein targets (Dolle, et al., J. Combinatorial Chem. 6(5):597-635 (2005)).
Thus, current drug design and drug therapy approaches do not address the urgent need to find drugs which interfere with intracellular protein-protein interactions or protein signaling. Antibodies have the required specificity to distinguish among closely related protein surfaces, yet are too large to be taken orally or enter cells. Orally active pharmaceuticals are too small (i.e., have a molecular weight less than 750) to disrupt protein-protein surface interactions (generally flat, and over 1200-3000 Å²).
Attempts to identify small molecule drugs that bind over an extended area have mostly been limited to traditional targets containing at least one compact binding site. One approach is based on: (i) preparing a set of potential binding elements where each molecule has a common chemical linkage group; (ii) identifying all binding elements that inhibit, even weakly, the target enzyme; (iii) preparing a combinatorial library of all the winning binding elements connected by a common chemical linkage group and a series of flexible linkers; and (iv) screening the combinatorial library to identify the tightest binding compounds. This approach was used to identify a small molecule inhibitor of the c-Src tyrosine kinase (Maly et al., Proc. Nat'l Acad. Sci. USA 97: 2419-2424 (2000)) as well as the tyrosylprotein sulfotransferase (Kehoe et al., Bioorganic & Medicinal Chem. Lett. 12:329-332 (2002)). One flaw in this approach is that the initial screen finds mostly molecules that bind within the initial pocket, but the final product needs to have both binding elements bind with high affinity. Thus, the success of the above approach was the result of a fortuitous alternative binding of one of the elements identified in the initial screen. A second disadvantage is the need to screen each of the potential combinatorial library elements individually.
To overcome the limitation of testing various combinations of ligands and connectors individually, Lehn and coworkers developed dynamic combinatorial chemistry (“DCC”) as a new means for drug discovery (Lehn et al., Science 291:2331-2332 (2001); Ramstrom et al., Nat. Rev. Drug Discovery 1:26-36 (2002)). In this approach, potential ligand molecules form reversible adducts to different bifunctional connector molecules, and these interconnections are in continuous exchange with each other. When the enzyme target is added, the best bound library constituent is selected from all the possible combinations, allowing for identification of the active species. Using 16 hydrazides, 2 monoaldehydes, and 3 dialdehydes, 440 different combinations were formed and selected against the bifunctional B. subtilis HPr kinase/phosphatase (Bunyapaiboonsri et al., J. Med. Chem. 46:5803-5811 (2003)). Improvement in synthesis and spatial identification of specific library members is achieved by using resin-bound DCC approaches (McNaughton et al., Organic Letters 8:1803-1806 (2006)).
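The figure of 440 combinations can be reproduced by simple counting. The sketch below shows one way to arrive at it, assuming that each monoaldehyde condenses with a single hydrazide and that each dialdehyde is symmetric and condenses with an unordered pair of hydrazides, the same hydrazide being allowed twice. These counting rules and the function name are assumptions made for illustration; the cited reference may enumerate the library differently.

```python
from math import comb


def count_dcc_constituents(n_hydrazides, n_monoaldehydes, n_dialdehydes):
    """Count library constituents under the assumed pairing rules.

    Returns (mono_adducts, di_adducts, total).
    """
    # One monoaldehyde + one hydrazide.
    mono_adducts = n_monoaldehydes * n_hydrazides
    # One symmetric dialdehyde + an unordered pair of hydrazides,
    # repeats allowed: C(n + 1, 2) pairs per dialdehyde.
    pairs_with_repeats = comb(n_hydrazides + 1, 2)
    di_adducts = n_dialdehydes * pairs_with_repeats
    return mono_adducts, di_adducts, mono_adducts + di_adducts


print(count_dcc_constituents(16, 2, 3))  # (32, 408, 440)
```

Under these assumptions nearly all of the diversity comes from the dialdehydes, since the number of bis-adducts grows roughly with the square of the number of hydrazides.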
The use of DNA to encode self-assembling chemical (ESAC) libraries has extended the potential for dynamic combinatorial chemistry drug discovery (Melkko et al., Nature Biotech, 22:568-574 (2004)). The DNA strands are partially complementary to allow for reversible binding to each other under standard incubation conditions and also contain bar codes to identify the ligand element. After using DCC to select for the tightest binding combinations, and identification of ligands based on their DNA code, the ligands are resynthesized with a variety of spacers to identify the tightest binding tethered combinations. This approach was used to find binding molecules with nanomolar affinities to serum albumin, carbonic anhydrase, streptavidin, and trypsin, respectively (Melkko et al., Nature Biotech, 22:568-574 (2004); Dumelin et al., Bioconjugate Chem. 17:366-370 (2006); Melkko et al., Angew. Chem. 46:4671-4674 (2007)). One disadvantage of this approach is the wide footprint of about 15.4 Angstroms introduced by using double-stranded DNA as the dynamic combinatorial chemistry element, separating the ligands by a considerable distance, and requiring a higher MW tether to reestablish tight binding affinities.
In an inversion of the standard small-molecule drug binding within a compact binding pocket in the target enzyme, the macrocycle vancomycin binds to its L-Lys-D-Ala-D-Ala tripeptide target by forming a dimer that surrounds the tripeptide. By using the actual target to accelerate combinatorial synthesis of vancomycin and vancomycin analogue dimers, tethered dimers were isolated with tighter affinities and in vitro activity against some vancomycin resistant bacterial strains (Nicolaou et al., Angew. Chem. 39:3823-3828 (2000)). It is unlikely that these derivatives would be orally active due to their high molecular weight and potential for disulfide dimers to be reduced to monomers within the bloodstream.
Many receptors (for example, the erythropoietin receptor) are activated by ligand-induced homodimerization, which leads to internal cellular signals. By using bi- or multi-functional connectors to link ligand molecules to form dimer, trimer, and tetramer libraries, a number of small molecule agonists could be isolated that assisted in erythropoietin receptor homodimerization (Goldberg et al., J. Am. Chem. Soc. 124:544-555 (2002)). These molecules demonstrate the ability of multi-ligand drugs to influence protein-protein interactions, in a manner that mimics the natural activity of cytokines and chemokines.
Sharpless and coworkers have identified reactions that occur readily when the constituent chemical linkage groups are brought in close proximity with each other, termed “click chemistry” (Kolb et al., Drug Discovery Today 8:1128-1137 (2003)). By adding various ligands connected to these reactive groups (such as an azide on one set of ligands and acetylene on the other ligands) and combining these library compounds in solution in the presence of enzyme targets, highly potent inhibitors form, for example for acetylcholinesterase or the HIV protease (Kolb et al., Drug Discovery Today, 8:1128-1137 (2003); Brik et al., ChemBioChem 4:1246-1248 (2003); Whiting et al., Angew. Chem. Int. Ed. 45:1435-1439 (2006); Lewis et al., Angew Chem 41:1053-1057 (2002); Bourne et al., Proc. Nat'l Acad. Sci. USA 101:1449-1454 (2004)). In short, the target enzyme acts as a catalyst for the proximal ligation of its own inhibitor. The advantage of this approach is the enrichment of the best binding compound in a single step.
An elegant approach to finding low molecular weight ligands that bind weakly to targeted sites on proteins was developed by Wells and coworkers (Erlanson et al., Proc. Nat'l Acad. Sci. USA 97:9367-9372 (2000); Thanos et al., J. Am. Chem. Soc. 125:15280-15281 (2003); Erlanson et al., Nature Biotechnology 21:308-314 (2003); Buck et al., Proc. Nat'l Acad. Sci. USA 102:2719-2724 (2005)). A native or engineered cysteine in a protein is allowed to react reversibly with a small library of disulfide-containing molecules. The process of dynamic combinatorial chemistry takes place as the most stable molecules are enriched on the surface of the protein target. These are then readily identified by mass spectrometry, and serve as lead compounds for further modification.
Dynamic combinatorial or “click” chemistry increases yields of appropriate binding ligand combinations, but still requires enzymatic assays. The disadvantages of these approaches are that they are limited to enzymes with one or more deep binding pockets, where knowledge of at least one potential ligand is often needed. Further, the starting blocks are not readily available and require independent synthesis for each pharmacophore or ligand to be tested. The chemical linkage groups used for click chemistry are not suitable for use in vivo as they would react readily and irreversibly with cellular components. The reactions need to take place with sufficient efficiency and at a large enough scale such that the enzyme selected inhibitor is synthesized in sufficient amounts to allow for purification and identification of the correct product. This last constraint limits the number of ligands that may be screened in a single assay, and limits the throughput of these approaches.
Several groups have recognized that macrocycles provide an opportunity for recognition of extended binding motifs within targets. Several of these are orally active, despite having molecular weights near or beyond the traditional 750 cutoff. These include cyclosporin (molecular weight 1202.64), rapamycin (molecular weight 914.2), tacrolimus (molecular weight 822.03), erythromycin (molecular weight 733.94), azithromycin (molecular weight 748.88), and clarithromycin (molecular weight 747.9). Note that although vancomycin (molecular weight 1485.74) is used orally for treatment of gastrointestinal infections, it is not absorbed into the body. Cyclosporin is the largest of the drugs listed above and illustrates a few features common to these drugs. Their cyclic nature reduces entropic loss upon binding, and the extended structure allows for enhanced binding. Cyclosporin has toroidal flexibility, allowing it to bring its polar side-chains into the interior so that the outside is nonpolar, which allows for transfer across membranes. Likewise, the drug is in structural equilibrium with its polar conformer, allowing for binding to its target.
As promising as macrocycle and synthetic peptide mimetics are for lead drug candidates, it is not trivial to use synthetic chemistry to generate the diversity required for high affinity binding to extended binding sites in target proteins. Two groups have sought to address this issue using DNA encoded approaches with evolutionary selection. In the first approach, a functional group is attached to a long DNA barcode sequence containing multiple zip-codes (Halpin, D. R. et al., PLoS Biol 2(7):E173 (2004); Halpin, D. R. et al., PLoS Biol 2(7):E174 (2004); Halpin, D. R. et al., PLoS Biol 2(7):E175 (2004)). The molecules are equilibrated with a set of columns (e.g., 10 columns), containing beads with complementary zip-code sequences. DNA hybridization captures library members containing the complementary zip-code sequence on their DNA tag. The library members are eluted into separate new chambers and reacted with a bifunctional moiety (for example, a protected amino acid residue) that corresponds to the given zip-code. The library members are then re-pooled and rerouted to the next series of columns. This process was repeated through several rounds to generate 10⁶ pentapeptides. After only two rounds of translation, selection with an antibody to the pentapeptide enkephalin, and amplification, the library converged on enkephalin and slight variants. A potential disadvantage of this approach is the need for DNA encryption strands of 200 or more bases. In the second approach, a bifunctional group is attached to a DNA template sequence containing adjacent zipcode sequences (Calderone, C. T. et al., Angew Chem Int Ed Engl 44(45):7383-7386 (2005); Sakurai, K. et al., J Am Chem Soc 127(6):1660-1661 (2005)). The DNA sequence serves as a template for adding bifunctional moieties to one end of the bifunctional group on the DNA tag.
Each bifunctional moiety (for example, a protected amino acid residue) is attached to a complementary zip-code DNA molecule, which hybridizes on the DNA template containing the original bifunctional group. This hybridization increases the local concentration of the reactant to such an extent that it can drive synthesis to very high yields. This method does not require split-pooling techniques. If 4 sets of 10 bifunctional moieties each are added, this will result in 10,000 pharmacophores in the library. At the end of the synthesis, the last amino acid residue may be reacted with the other end of the original bifunctional group to create a circular pharmacophore. In this version, the identity of the pharmacophore is defined by the zipcode sequences in the DNA template. It may be identified by PCR amplification and sequencing. Further, the PCR amplicons may serve as starting templates for a new round of translation, selection, and amplification, allowing for application of evolutionary principles to synthesize high affinity binding elements. However, the range of pharmacophores synthesized by the above two approaches is still several orders of magnitude below the diversity and affinity achieved by just a single CDR loop from an antibody molecule.
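The library size quoted above follows from multiplying the number of choices at each position, and the correspondence between a DNA template and its pharmacophore can be pictured as a lookup table. The following is a minimal sketch of that bookkeeping only. The integer zip-codes, the building-block labels such as "P0-B3", and the function names are hypothetical placeholders introduced here; no chemistry or hybridization is modeled.

```python
from itertools import product
from math import prod

N_POSITIONS = 4  # sets of bifunctional moieties (from the text)
N_CHOICES = 10   # moieties per set (from the text)

# Hypothetical codebook: CODEBOOK[position][zipcode] -> building-block label.
CODEBOOK = [
    {z: f"P{pos}-B{z}" for z in range(N_CHOICES)}
    for pos in range(N_POSITIONS)
]


def library_size(choices_per_position):
    """Number of distinct products when positions are chosen independently."""
    return prod(choices_per_position)


def translate(template):
    """Map a template (one zip-code per position) to its building blocks."""
    return tuple(CODEBOOK[pos][z] for pos, z in enumerate(template))


# Enumerate every possible template and translate it.
templates = list(product(range(N_CHOICES), repeat=N_POSITIONS))
products = {translate(t) for t in templates}

print(library_size([N_CHOICES] * N_POSITIONS))  # 10000
print(len(products))                            # 10000
print(translate((3, 0, 7, 9)))  # ('P0-B3', 'P1-B0', 'P2-B7', 'P3-B9')
```

Because each template maps to exactly one product, reading the zip-codes back (by sequencing, in the scheme described above) is enough to identify the pharmacophore.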
Several groups have investigated the ability of small molecules to interact with each other or encircle other small molecule targets; these are known as “guest-host” interactions or artificial receptors. However, these compounds are not suitable, because they are not of low enough molecular weight or interact under non-physiological conditions or would be too reactive with other intracellular molecules.
A common approach to designing artificial receptors is to construct a “molecular tweezer”, consisting of a two-armed structure joined by a conformationally restricted linker, such that the two arms point in the same direction (analogous to a tweezer). These “host” structures are often designed with a dye or on a bead, and then screened for binding of the “guest”, most often a tri-peptide, again with either a dye or on a bead (Shao et al., J Org. Chem. 61:6086-6087 (1996); Still et al., Acc. Chem. Res. 29:155-163 (1996); Cheng et al., J. Am. Chem. Soc. 118:1813-1814 (1996); Jensen et al., Chem. Eur. J. 8:1300-1309 (2002)). In a variation of this theme, binding of the peptide displaces a quenched fluorescent group from the host pocket, thus creating a fluorescent signal upon binding (Chen et al., Science 279:851-853 (1998); Iorio et al., Bioorganic & Medicinal Chem. Lett. 11:1635-1638 (2001)). Rigid diketopiperazine backbone receptors with tri-peptide arms have demonstrated both tight binding and how small structural changes in the backbone significantly reduce that binding (Wennemers et al., Chem. Eur. J. 7:3342-3347 (2001); Conza et al., J. Org. Chem. 67:2696-2698 (2002); Wennemers et al., Chem. Eur. J. 9:442-448 (2003)). Unsymmetrical tweezer and one-armed receptor hosts have been designed to mimic vancomycin binding of an L-Lys-D-Ala-D-Ala tripeptide guest (Shepard et al., Chem. Eur. J. 12:713-720 (2006); Schmuck et al., Chem. Eur. J. 12:1339-1348 (2006)). Other host-guest systems include naphthalene-spaced tweezers and cyanobenzene derivatives (Schaller et al., J. Am. Chem. Soc. 129:1293-1303 (2007)). In some of the examples above, the selection was performed in organic solvents, and, in all cases, at least one of the entities had a molecular weight in excess of 400 and often in excess of 800. Thus, these examples would not be suitable for lead molecules.
Another approach to designing low molecular weight affinity binders is to use phage display. This approach was used to find 9- to 13-mer peptides that bind fluorescent dyes; however, only one of these retained sufficient affinity to bind a dye when resynthesized outside the context of the phage protein (Rozinov et al., Chemistry & Biology 5:713-728 (1998); Marks et al., Chemistry & Biology 11:347-356 (2004)). Other groups have used phage display to design synthetic 8- to 12-mer peptides that bind biotin (Saggio et al., Biochem. J. 293:613-616 (1993)), camptothecin (Takakusagi et al., Bioorganic & Medicinal Chem. Lett. 15:4850-4853 (2005)), as well as doxorubicin and other hydrophobic cancer drugs (Popkov et al., Eur. J. Biochem. 251:155-163 (1998)). In all these cases, the fluorescent dye or similarly hydrophobic guest moiety is held in place by a pocket composed of hydrophobic amino acids, and additional residues may then provide further stability. Since the peptides have molecular weights ranging from about 900 to about 1500, they are too large to be suitable as lead molecules.
Thus, there is a need to design new small molecules that associate with good affinities for one another under physiological conditions. Further there is a need to design such small molecules to bind to biological macromolecules with improved affinity and specificity and influence their structure, function, processing, degradation and role in signal transduction and cellular responses. The present invention is directed to overcoming this deficiency in the art.