This invention relates to methods of identifying drugs which can mediate the biological activity of a target protein.
Protein Binding and Biological Activity
Many of the biological activities of proteins are attributable to their ability to bind specifically to one or more binding partners (ligands), which may themselves be proteins or other biomolecules.
When the binding partner of a protein is known, it is relatively straightforward to study how the interaction of the binding protein and its binding partner affects biological activity. Moreover, one may screen compounds for the ability of the compound to competitively inhibit the formation of the complex, or to dissociate an already formed complex. Such inhibitors are likely to affect the biological activity of the protein, at least if they can be delivered in vivo to the site of the interaction.
If the binding protein is a receptor, and the binding partner an effector of the biological activity, then the inhibitor will antagonize the biological activity. If the binding partner is one which, through binding, blocks a biological activity, then an inhibitor of that interaction will, in effect, be an agonist.
The residues whose functional groups participate in the ligand-binding interactions together form the ligand binding site, or paratope, of the protein. Similarly, the functional groups of the ligand which participate in these interactions together form the epitope of the ligand.
In the case of a protein, the binding sites are typically relatively small surface patches. The binding characteristics of the protein may often be altered by local modifications at these sites, without denaturing the protein.
While it is possible for a chemical reaction to occur between a functional group on a protein and one on a ligand, resulting in a covalent bond, protein ligand binding normally occurs as a result of the aggregate effects of several noncovalent interactions. Electrostatic interactions include salt bridges, hydrogen bonds, and van der Waals forces.
What is called the hydrophobic interaction is actually the absence of hydrogen bonding between nonpolar groups and water, rather than a favorable interaction between the nonpolar groups themselves. Hydrophobic interactions are important in stabilizing the conformation of a protein and thus indirectly affect ligand binding, although hydrophobic residues are usually buried and thus not part of the binding site.
Peptides have been found to bind proteins at the same sites as those by which the proteins interact with other proteins, macromolecules and biologically significant substances, e.g., nucleic acids, lipids and enzyme substrates. The first examples of this property were in the publications of several groups who showed that there is a single peptide binding site on the biotin binding protein streptavidin. This is the same site responsible for biotin binding, and these peptides compete with biotin for binding to this site (Biochemistry 34: 15430-15435 (1995) Screening of cyclic peptide phage libraries identifies ligands that bind streptavidin with high affinities, L. B. Giebel, R. T. Cass, D. L. Milligan, D. C. Young, R. Arze and C. R. Johnson; Gene 128: 59-65 (1993) An M13 phage library displaying random 38-amino-acid peptides as a source of novel sequences with affinity to selected targets, B. K. Kay, N. B. Adey, Y. S. He, J. P. Manfredi, A. H. Mataragnon and D. M. Fowlkes; Nature 354: 82-4 (1991) A new type of synthetic peptide library for identifying ligand-binding activity Septou, et al.; Proc Natl Acad Sci USA 92: 5426-5430 (1995) Library of libraries: approach to synthetic combinatorial library design and screening of “pharmacophore” motifs, I. Saggio and R. Laufer; Biochem J 293 (Pt 3): 613-6 (1993) Biotin binders selected from a random peptide library expressed on phage, I. Saggio and R. Laufer). Many other examples exist; for instance, Smith demonstrated that peptides displayed on phage which bound to ribonuclease S had a specific consensus motif and that these PLs were antagonistic to ribonuclease activity, implying that the peptides and the RNA were bound by the same ligand binding site (Gene 128: 37-42 (1993) A ribonuclease S-peptide antagonist discovered with a bacteriophage display library, G. P. Smith, D. A. Schultz and J. E. Ladbury).
Another example is from the binding of peptide ligands to cell surface integrins (Biochemistry 34: 3948-3955 (1995) Peptide ligands for integrin alpha v beta 3 selected from random phage display libraries, J. M. Healy, O. Murayama, T. Maeda, K. Yoshino, K. Sekiguchi and M. Kikuchi; J Cell Biol 124: 373-80 (1994) Isolation of a highly specific ligand for the alpha 5 beta 1 integrin from a phage display library, E. Koivunen, B. Wang and E. Ruoslahti). Peptides obtained in this way clearly mimic natural protein:protein interactions, as in the case of the proteins MDM2 and p53 (Bottger et al. Identification of novel mdm2 binding peptides by phage display, Oncogene, 13:2141-7 (1996)). However, it has not hitherto been appreciated that this phenomenon is sufficiently common that it might be exploited in identifying inhibitors of the interaction of a protein with an unknown binding partner. Nor have others explained just how to take advantage of this phenomenon for that purpose.
Traditional Drug Screening
In traditional drug screening, natural products (especially those used in folk remedies) were tested for biological activity. The active ingredients of these products were purified and characterized, and then synthetic analogues of these “drug leads” were designed, prepared and tested for activity. The best of these analogues became the next generation of “drug leads”, and new analogues were made and evaluated.
Both natural products and synthetic compounds could be tested for just a single activity, or tested exhaustively for any biological activity of interest to the tester. Testing was originally carried out in animals; later, less expensive and more convenient model systems, employing isolated organ, tissue, or cell cultures, membrane extracts or purified receptors, were developed for some pharmacological evaluations.
These methods have many disadvantages. Many of these approaches require large amounts of chemical compound to test, especially testing in whole animals and isolated organs. Since the quantity of a given compound within a collection of potential medicinal compounds is limited, this requires one to limit the number of screens executed.
Also, it is inherently difficult to establish structure/activity relationships (SAR) among compounds tested using whole animals, isolated organs and cultured cells. This is because the actual molecular target of any given compound's action may be quite different from that of other compounds scoring positive in the assay. By testing a battery of compounds on a very specific target, one can correlate the action of various chemical residues with the quantitative activity and use that information to focus one's search for active compounds among certain classes of compounds or even direct the synthesis of novel compounds having a composite of the properties shared by the active compounds tested.
Another disadvantage to whole animal, organ and cell based screening is that certain limitations may prevent an active compound from being scored as such. For instance, an inability to pass through the cellular membrane may prevent a potent inhibitor, within a tested compound library, from acting on the activated oncogene ras, giving a spurious negative score in a cell proliferation assay. However, if it were possible to test ras in an isolated system, that potent inhibitor would be scored as a positive compound and contribute to the establishment of a relevant SAR. Subsequent chemical modifications could then be carried out to optimize the compound structure for membrane permeability.
The overwhelming disadvantage to the receptor based methods for screening compounds is that they require a priori knowledge about the activity of the receptor and its biological ligand. If, through genetic mapping of a disease locus, one determines that a particular gene product is responsible for the disease, and one lacks knowledge about the gene's biochemical function because it is not a previously known receptor or enzyme, then it is very difficult to establish an assay with the methods previously known.
The present invention circumvents all these problems.
Combinatorial Libraries
Libraries of thousands, even millions, of random oligopeptides have been prepared by chemical synthesis (Houghten et al., Nature, 354:84-6(1991)), or gene expression (Marks et al., J Mol Biol, 222:581-97(1991)), displayed on chromatographic supports (Lam et al., Nature, 354:82-4(1991)), inside bacterial cells (Colas et al., Nature, 380:548-550(1996)), on bacterial pili (Lu, Bio/Technology, 13:366-372(1990)), or phage (Smith, Science, 228:1315-7(1985)), and screened for binding to a variety of targets including antibodies (Valadon et al., J Mol Biol, 261:11-22(1996)), cellular proteins (Schmitz et al., J Mol Biol, 260:664-677(1996)), viral proteins (Hong and Boulanger, Embo J, 14:4714-4727(1995)), bacterial proteins (Jacobsson and Frykberg, Biotechniques, 18:878-885(1995)), nucleic acids (Cheng et al., Gene, 171:1-8(1996)), and plastic (Siani et al., J Chem Inf Comput Sci, 34:588-593(1994)).
Libraries of proteins (Ladner, U.S. Pat. No. 5,223,409 (Ser. No. 07/664,989, filed Mar. 1, 1991)), peptoids (Simon et al., Proc Natl Acad Sci USA, 89:9367-71(1992)), nucleic acids (Ellington and Szostak, Nature, 346:818(1990)), carbohydrates, and small organic molecules (Eichler et al., Med Res Rev, 15:481-96(1995)) have also been prepared or suggested for drug screening purposes.
Sparks, et al., Nature Biotechnology, 14:741 (June 1996) used an SH3 domain-binding peptide isolated from a phage-displayed random peptide library to screen a 16-day mouse embryo cDNA expression library for proteins with SH3-domains. This process is referred to as “COLT” (cloning of ligand targets). These proteins, some of which were not previously known, may then be used as binding targets in screening peptide libraries for additional SH3-domain-binding ligands.
The chemistry of peptide libraries is quite similar to that of many of the natural macromolecules involved in biological processes, and thus these libraries are rich in structures that mimic the natural ones which interact with the target protein. In addition, the variants are composed of linear polymers such that each actually represents a sliding window of many differing chemical constituents. For instance, if a given macromolecular interaction is based on the side chains of four amino acids within a binding peptide, then a 13 amino acid peptide has 10 potential combinations of residues which may bind; therefore a library of 10^8 members has about 10^9 4-mer permutations. This, combined with the ease of producing and screening exceptionally large and diverse peptide libraries, provides the incentive to use peptide combinatorial libraries for the initial identification and probing of protein functional domains.
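The sliding-window arithmetic in the preceding paragraph can be checked with a short sketch. It assumes that the four interacting residues are contiguous, which is one reading of the "sliding window" language. The function names and the example sequence are hypothetical illustrations rather than part of the disclosure; only the 13-residue length, the 4-residue window, and the library of ten to the eighth power members come from the text.

```python
def window_count(peptide_length: int, window: int) -> int:
    """Number of contiguous windows of the given size in a linear peptide."""
    if window > peptide_length:
        return 0
    return peptide_length - window + 1


def sliding_windows(peptide: str, window: int) -> list[str]:
    """List every contiguous sub-sequence of the given size."""
    return [peptide[i:i + window]
            for i in range(window_count(len(peptide), window))]


# Figures taken from the text: 13-mer peptides, 4-residue binding motif.
windows_per_peptide = window_count(13, 4)

# Library of 10^8 members, each contributing that many 4-mer windows.
total_windows = 10**8 * windows_per_peptide

# Hypothetical 13-residue sequence, used only to show the windows themselves.
example = sliding_windows("ACDEFGHIKLMNP", 4)
```

Under the contiguity assumption each 13-mer contributes 13 - 4 + 1 windows, which is where the tenfold gain over the nominal library size comes from.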
Unfortunately, peptides per se have limited utility for use as therapeutic entities. They are costly to synthesize, unstable in the presence of proteases and in general do not transit cellular membranes. Other classes of compounds have better properties for drug candidates. However, historically, acquiring chemical compound libraries has been a barrier to the entry of smaller firms into the drug discovery arena. Due to the large quantity of chemical required for testing on whole animals and even on cells in culture, it was a given that whenever a compound was synthesized it should be done in fairly large quantity. Thus, there was a synthesis and purification throughput of less than 50 compounds per chemist per year. Large companies maintained their immensely valuable collections as trade barriers. However, with the downsizing of targets to the molecular level and the automation of screens, the quantity of a given compound necessary for an assay has been reduced to very small amounts. These changes have opened the door for the utilization of so-called combinatorial chemistry libraries in lieu of the traditional chemical libraries. Combinatorial chemistry permits the rapid and relatively inexpensive synthesis of large numbers of compounds in the small quantities suitable for automated assays directed at molecular targets. Numerous small companies and academic laboratories have successfully engineered combinatorial chemical libraries with a significant range of diversity (reviewed in Doyle, 1995, Gordon et al, 1994a, Gordon et al, 1994b).
We have developed a systematic means for development of drug discovery screens for numerous targets. One of the special advantages of this system is that the high throughput screens are essentially identical for similar and dissimilar targets, bypassing the need to develop distinct assays for biochemically diverse targets. This is desirable for several reasons. First and foremost, one is never certain how useful a specific target is for therapeutic intervention. It is not until active compounds have been isolated and tested that one can truly “validate” a molecular target. Thus it makes sense to choose as many targets as practical, establish screens for each and then validate each target pharmacologically using the identified compounds. Second, for many potential targets one may not be aware of a biochemical activity that can be used to establish molecular assays. Many potential targets can be proposed based upon the results of genetic experimentation rather than biochemical data. This has been the case for viruses due to ease of subcloning and mutagenic analysis and, now, with the outpouring of human genetic data, shall be true in many other disease areas. The challenge is to go from genetic data to development of useful drug screens.
DGI Technologies, WO96/04557 corresponding to Blume, U.S. Pat. No. 6,010,861, “Target Specific Screens and Their Use for Discovering Small Organic Molecular Pharmacophores”, suggests first screening a library composed of mutated variable domains of antibodies (V-H, V-L, or single chain antibodies, which are V-H and V-L domains joined by a peptide linker) for domains which bind the target. Preferably, the parental variable domains have a solved 3D structure (p. 44).
The targets of principal interest to DGI are cell-surface receptors (pp. 149-50). They speculate that an antibody library will survey the entire surface of the target (pp. 5, 39). They are particularly interested in finding antibodies which bind a target receptor protein at a site other than the receptor's endogenous ligand binding site (p. 4).
The antibodies of interest to DGI are those which are both T+ (bind the target) and A+ (activate the target, or are capable of activating the target when combined with another ligand) (pp. 42, 14).
DGI prefers to sequence the T+A+ antibodies, predict their 3D structure on the basis of the known structure of the parental antibody, and design small organic pharmacophores which mimic the binding conformation of the CDRs of the antibody (pp. 78-91). However, it does contemplate that one could use labeled T+A+ antibodies in competitive binding assays to screen “chemical libraries” for binding activity (pp. 91-93, 42). The chemical libraries contemplated are the kind available from Aldrich and Kodak (p. 12), which are not combinatorial libraries. That is, they are merely accretive collections of biologically active compounds. While some compounds may be related, because they came from a single research program, the collection as a whole is a hodgepodge.
All references, including any patents or patent applications, cited in this specification are hereby incorporated by reference. No admission is made that any reference constitutes prior art. The discussion of the references states what their authors assert and applicants reserve the right to challenge the accuracy and pertinency of the cited documents.
The present invention relates to a method of identifying drugs which can mediate the biological activity of a target protein via inhibition of binding of the target protein to a binding partner. Unlike prior methods, it does not require that the natural binding partner be used as a reagent, or even that it have been characterized. The need for the natural binding partner is obviated by the use of complementary combinatorial libraries.
Applicants screen a first combinatorial library for binding to the target protein. Preferably, this library is a biopolymer library, and, more preferably, an amplifiable library. Applicants then screen a second (complementary) library (preferably combinatorial in nature) for the ability to inhibit the binding of one or more of the target binding ligands in the first library to the target protein. The members of this library are typically small organic compounds, and are more suitable as drugs or drug leads than the compounds of the first library.
The successful inhibitors are candidate antagonists of one or more of the biological activities of the target protein.
Applicants believe that those members of a combinatorial library, especially a biopolymer library, which bind to a target protein having a biologically significant binding activity will bind preferentially to the sites at which the target protein interacts with the natural binding partners which mediate its biological activity, as opposed to randomly, with equal probability, over the entire surface of the target protein. If so, then the target-binding members of the library in question can be used as surrogates for an unknown or unavailable natural binding partner in screening a second combinatorial library (the “complementary library”), which need not be a biopolymeric library, for members which can inhibit the complexing of target protein to its natural binding partner. The surrogate library is preferably an amplifiable (peptide or nucleic acid) library.
The active sites of proteins, by which they interact with other molecules, may consist of one or more concavities (depressions) and/or convexities (protuberances) on the surface of the protein. Generally speaking, oligopeptides, oligonucleotides, and other small organic molecules, are most likely to bind to a protein by fitting, in whole or in part, into a concavity on the surface of the protein.
Preferably, the peptides of the first library are of 5-50 amino acids. Peptides of at least five amino acids length are sufficiently large to bind to an active site with a reasonably high activity. Peptides larger than about 50 amino acids are generally too large to fit into an active site, and also are more cumbersome to synthesize.
The surrogate peptides of the present invention are superior to antibodies for use as surrogate ligands in the identification of pharmaceutically useful small organic compounds. Structural data from diverse sources indicate that the binding sites of antibodies are deep pockets in the surface of antibodies, and are therefore less suitable than oligopeptides for binding to concavities on the surface of targets. Such concavities are the probable binding sites of small ligands.
Unlike peptide ligands, antibodies do not appear to discriminate the endogenous ligand binding sites from the remainder of a target protein. Hence, they are a poor choice for a surrogate ligand.
For the foregoing reasons, when the first combinatorial library is composed of peptides, the peptides do not comprise an antibody-like domain. Thus, the peptides cannot be antibodies, single chain antibodies, or isolated variable heavy or light domains thereof. This exclusion applies both to naturally occurring antibodies, and to mutant antibodies which retain the normal structure of an antibody variable domain. The term “antibody-like domain” refers to a peptide having the normal structure of an antibody variable domain, as defined at col. 15, lines 59-68 of Ladner, U.S. Pat. No. 5,403,484.
The term “library” generally refers to a collection of chemical or biological entities which can be screened simultaneously for a property of interest. (They may be screened sequentially, if desired, but simultaneous screening is more efficient.) Typically, they are related in origin, structure, and/or function.
The term “combinatorial library” refers to a library in which the individual members are either systematic or random combinations of a limited set of basic elements, the properties of each member being dependent on the choice and location of the elements incorporated into it. Typically, the members of the library are at least capable of being screened simultaneously. Randomization may be complete or partial; some positions may be randomized and others predetermined, and at random positions, the choices may be limited in a predetermined manner. The members of a combinatorial library may be oligomers or polymers of some kind, in which the variation occurs through the choice of monomeric building block at one or more positions of the oligomer or polymer, and possibly in terms of the connecting linkage, or the length of the oligomer or polymer, too. Or the members may be nonoligomeric molecules with a standard core structure, like the 1,4-benzodiazepine structure, with the variation being introduced by the choice of substituents at particular variable sites on the core structure.
The ability of one or more members of such a library to recognize a target molecule is termed “Combinatorial Recognition”.
In a “simple combinatorial library”, all of the members belong to the same class of compounds (e.g., peptides) and can be synthesized simultaneously. A “composite combinatorial library” is a mixture of two or more simple libraries, e.g., DNAs and peptides. The number of component simple libraries in a composite library will, of course, normally be smaller than the average number of members in each simple library, as otherwise the advantage of a library over individual synthesis is small.
A biased combinatorial library is one in which, at one or more positions in the library member, only one of the possible basic elements is allowed for all members of the library, i.e., the biased positions are invariant.
The term “amplifiable combinatorial library” refers to a library in which the individual members, after being found to bind to a target, may be amplified in vivo or in vitro, using elements already present in the library as starting materials. There are two classes of amplifiable members. First, nucleic acids may be amplified in vivo through natural replicative processes, or in vitro through techniques such as polymerase chain reaction (PCR). Second, peptides, when presented on phage, or otherwise associated with an encoding nucleic acid, may be amplified indirectly by in vivo or in vitro amplification of the associated nucleic acid encoding the peptide, the amplified nucleic acid being expressed to produce the peptide.
The term “biopolymeric library” refers to a library composed of peptides (together with peptoids), nucleic acids, and/or oligosaccharides. (It is not necessary that they be composed of naturally occurring amino acids, bases, or sugars, respectively.) However, because of the greater complexity of carbohydrate synthesis, peptides and nucleic acids are of greater interest.
A “panel of combinatorial libraries” is a collection of different (although possibly overlapping) and separately screenable simple or composite combinatorial libraries. A “panel” differs from a composite library in that the component simple libraries have not been mixed together, that is, they may still be screened separately.
A “structural panel” is a panel as defined above where there is some structural relationship between the member libraries. For example, one could have a panel of 20 different biased peptide libraries where, in each library, the middle residue is held constant as a given amino acid, but, in each library the constant residue is different, so, collectively, all 20 possible genetically encoded amino acids are explored by the panel.
A “scanning residue library” refers to a panel of biased combinatorial peptide libraries prepared such that the position of the constant residue shifts from one library to the next. For example, in library 1, residue 1 is held constant as a particular residue AA, in library 2, residue 2 is, and so forth through two or more (usually all) positions of the peptide.
One may have structured panels of libraries in which one may define subpanels, too. For example, in one subpanel, the middle residue AA1 may be the same for all libraries, but the libraries also have a constant residue AA2 which is scanned through all other residue positions.
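The structural panel and scanning residue library described above can be sketched as lists of library templates. In this sketch the letter "X" marks a randomized position, and the peptide length of 7 is an arbitrary choice; both are assumptions made for illustration, as are the function names.

```python
# The 20 genetically encoded amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# "X" is an assumed placeholder for a randomized (variable) position.
VARIABLE = "X"


def biased_template(length: int, position: int, constant: str) -> str:
    """Template for one biased library: one invariant residue, rest random."""
    return VARIABLE * position + constant + VARIABLE * (length - position - 1)


def structural_panel(length: int, position: int) -> list[str]:
    """One biased library per amino acid, all fixing the same position."""
    return [biased_template(length, position, aa) for aa in AMINO_ACIDS]


def scanning_residue_panel(length: int, constant: str) -> list[str]:
    """One biased library per position, the constant residue shifting along."""
    return [biased_template(length, i, constant) for i in range(length)]


# Middle residue (0-based index 3 of 7) held constant, varied across libraries.
middle_panel = structural_panel(7, 3)

# Constant alanine scanned through every position of a 7-mer.
scan_panel = scanning_residue_panel(7, "A")
```

A subpanel of the kind mentioned above could be built by fixing one position first and then scanning a second constant residue through the remaining positions.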
A library screening program is a program in which one or more libraries (e.g., a structured panel of biased peptide libraries) are screened for activity. The libraries may be screened in parallel, in series, or both. In serial screening, the results of one screening may be used to guide the design of a subsequent library in the series.
The size of a library is the total number of molecules in it, whether they be the same or different. The diversity of a library is the number of different molecules in it. “Diversity” does not measure how different the structures of the library are; the degree of difference between two structures is referred to here as “disparity” or “dispersion”. The “disparity” is quantifiable in some respects, e.g., size, hydrophilicity, polarity, thermostability, etc. The average sampling frequency of a library is the ratio of size to diversity. The sampling frequency should be over the detection limit of the assay in order to assure that all members are screened.
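The three quantities defined above (size, diversity, and average sampling frequency) can be computed from a list of library members, as in the following minimal sketch. The function name and the toy six-member library are hypothetical and chosen only to make the arithmetic visible.

```python
from collections import Counter


def library_stats(members: list[str]) -> tuple[int, int, float]:
    """Return (size, diversity, average sampling frequency) of a library.

    size      = total number of molecules, duplicates included
    diversity = number of different molecules
    frequency = size / diversity
    """
    counts = Counter(members)
    size = sum(counts.values())
    diversity = len(counts)
    if diversity == 0:
        raise ValueError("library is empty")
    return size, diversity, size / diversity


# Toy library: three copies of one peptide, two of another, one of a third.
toy_library = ["ACD", "ACD", "ACD", "ACE", "ACE", "GHK"]
size, diversity, frequency = library_stats(toy_library)
```

In a real library the copy number per member is rarely uniform, so the average hides members that fall below the assay's detection limit.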
The combinatorial libraries usually will have a diversity of at least 10^3 different structures. Preferably, the initial, surrogate-generating library is of high diversity, e.g., preferably at least about 10^6, more preferably at least about 10^9 different members. While a peptide library is preferred, a library composed of a different class of compounds (e.g., peptoids or nucleic acids) is acceptable if there would be a detectable preference for binding the activity-mediating binding sites of the target protein.
The complementary library need not be, and preferably is not, a peptide library and it may be of lower overall diversity. It may be screened against all of the surrogate peptides; or only against selected ones. The screenings may be individual or collective. Often, the members of the complementary library will be less specific in their binding to the paratopes of the target protein than are the members of the first library, possibly because their surface area is smaller and offers fewer opportunities for favorable (or unfavorable) interactions with other molecules. A preferred complementary library is a benzodiazepine library.
The degree of complex-inhibitory activity of the members of the complementary library may be quantified by means of a labeled surrogate peptide and an insolubilized target protein. Either the amount of labeled surrogate peptide is fixed, and the amount of complementary compound varied, or, more preferably, the amount of labeled surrogate peptide is varied and the amount of complementary compound held constant. The greater the activity of the complementary compound, the less labeled surrogate peptide will be in the solid phase (i.e., complexed to the target protein) and the more will be in the liquid phase (i.e., uncomplexed). The amount of label in either phase is then measured and correlated with the amount of the variable component. Conventional methods of screening libraries for binding molecules do not lend themselves to quantification of the degree of affinity.
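One simple way to turn the solid-phase and liquid-phase label measurements into a single figure is a percent-inhibition calculation, sketched below. This is an illustrative assumption about how the readout might be summarized, not a procedure stated in the text; the function names and the label counts are hypothetical.

```python
def fraction_bound(solid_phase_label: float, liquid_phase_label: float) -> float:
    """Fraction of labeled surrogate peptide complexed to the immobilized target."""
    total = solid_phase_label + liquid_phase_label
    if total <= 0:
        raise ValueError("total label must be positive")
    return solid_phase_label / total


def percent_inhibition(bound_without: float, bound_with: float) -> float:
    """Reduction in bound surrogate caused by the complementary compound."""
    if bound_without <= 0:
        raise ValueError("control binding must be positive")
    return 100.0 * (1.0 - bound_with / bound_without)


# Hypothetical label counts (arbitrary units), for illustration only.
control = fraction_bound(solid_phase_label=800.0, liquid_phase_label=200.0)
treated = fraction_bound(solid_phase_label=200.0, liquid_phase_label=800.0)
inhibition = percent_inhibition(control, treated)
```

Repeating the calculation across a series of concentrations of the varied component would give the correlation between label and variable component described above.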
It is possible that some of the target protein binding members of the first library will not bind the target protein at the site bound by the natural binding partner which mediates the biological activity of interest, or will bind at that site but still not have an effect similar to that of the natural binding partner, i.e., that these nominal surrogates are not true surrogates for the natural binding partner. However, as long as one or more of the identified members are true surrogates, if all of the nominal surrogates are used in screening the complementary library, then one necessarily will screen for inhibitors of the binding of the true surrogates to the target protein, too.
To reduce the number of “false hits” generated (i.e., compounds which inhibit the binding of a false surrogate to the target protein, or which inhibit binding of a true surrogate but at the wrong site), one may first test the nominal surrogates in a suitable biological system, for the ability to interact with the target protein so as to mediate its biological activity of interest (or at least a related activity that is evaluatable in that biological system). Then only those nominal surrogates which are active in this model system are used in screening the complementary library.
It is expected that most of the compounds of the complementary library which inhibit the complexing of the surrogate peptide to the target protein will achieve this inhibition by binding to the target protein in such a manner as to block its interaction with the surrogate peptide. While it is theoretically possible that the complementary compound will bind to the surrogate peptide instead of the target protein, this interaction is likely to be weak, since most oligopeptides do not have a stable conformation.
It is, of course, a simple matter to distinguish inhibitory compounds which bind the target protein from those which bind the surrogate peptide by use of either the target protein or surrogate peptide alone, in labeled or immobilized form, as an assay or affinity separation reagent.