The present invention relates to methods of identifying hot-spot residues of a member of a receptor-ligand binding pair of interest. The invention further provides methods of using information about receptor hot-spots for a receptor-ligand pair of interest to guide the identification of compounds that functionally bind to such regions of the receptor in a manner that mimics the ligand.
Protein-protein interactions such as those observed in many receptor-ligand complexes are typically mediated by large binding surfaces comprising ten to thirty contact amino acid residues on each protein of the complex (Clackson and Wells, 1995, Science 267:383-386). Recently, it has been suggested for some binding pairs that the predominant contribution to the overall free energy of binding of the complex is due to only a few residues within each binding surface (Clackson and Wells, 1995, supra).
For example, recent studies by Wells and collaborators have characterized the binding surfaces of the complex formed between human growth hormone and its cellular receptor (Clackson and Wells, 1995, supra; Pearce et al., 1996, Biochemistry 35:10300-10307; Wells, 1996, Proc. Natl. Acad. Sci. USA 93:1-6). X-ray crystallographic studies had shown that approximately thirty amino acid side chains of the hormone contact approximately thirty amino acid side chains of the receptor. An array of mutant human growth hormone proteins and mutant receptor proteins was prepared to estimate the energetic contribution of each amino acid of the contact surface to the overall binding interaction. The study measured the binding affinities of various complexes between mutant hormones and native receptors and also between native hormones and mutant receptors.
Surprisingly, only a small number of amino acid residues within the extensive binding surfaces accounted for most of the binding energy. Overall, fewer than half of the contact residues contributed measurably to the binding interaction. Eight receptor residues and eleven hormone residues accounted for over 75% of the free energy of binding of the hormone-receptor complex. Within each group of residues on either member of the binding pair, a limited number of "hot-spot" amino acids contributed most to the total free energy of binding. Surrounding each "hot-spot" was a group of several amino acids that contributed at lesser levels, and this group was in turn followed by a larger set of residues that contributed at even lower levels to the total free energy of binding. Most significantly, the "hot-spot" residues on the receptor binding surface directly contacted the complementary "hot-spot" amino acids on the hormone's binding surface. Subsequent studies in other systems have also shown that the total binding affinity between a receptor and a ligand is mediated mainly by small and complementary sets of "hot-spot" residues within the binding surfaces of the receptor and ligand (Wells, 1996, Science 273:449-450; Smith-Gill, 1994, Res. Immunol. 145:67-70). Some systems even suggest that multiple, non-contiguous "hot-spots" are possible.
While the methods of Wells and others can provide detailed information about residues involved in binding interactions for a variety of receptor-ligand pairs, they suffer from serious drawbacks. For example, for a given receptor-binding protein pair, large panels of mutant proteins must be prepared in order to identify the thermodynamically significant or "hot-spot" residues. To do so, each residue of the receptor or ligand must be mutated individually to measure its contribution to the overall free energy of binding.
Ideally, when a three-dimensional structure of the receptor-ligand complex is available, the mutations can be limited to only those amino acids comprising the binding surfaces of each member of the pair, approximately thirty residues for each member in a typical interaction. Thus, even in such an "ideal" situation, where information about both the receptor and ligand is desired, a total of about sixty mutant proteins must be prepared (one for each residue in the binding surface of the receptor and one for each residue in the binding surface of the ligand), and the binding energy of each mutant protein for its native binding partner determined.
In a more typical situation where no three-dimensional structure is available for either binding partner, mutants must in principle be prepared for every residue of both proteins. Thus, even in an ideal situation the methods are labor-intensive and expensive. In a less than ideal situation, the time and expense of preparing the mutants and measuring binding affinities oftentimes would be comparable to that for obtaining a three-dimensional structure of the complex.
In addition, the methods do not necessarily accurately identify the hot-spot residues. Site-directed mutagenesis of individual amino acids does not simply remove an amino acid side chain from a protein. Often, mutation of a residue to, for example, an alanine, introduces local and perhaps global structural perturbations in the mutant protein. These structural changes may affect the binding interactions between the mutant protein and its ligand. As a consequence, a comparison of the binding affinities of the mutant and native proteins is unlikely to be an accurate measurement of the relative contribution that residue makes to the overall free-energy of binding. In fact, in the studies of Clackson and Wells, the sum of the apparent free energy contributions of the mutated amino acids exceeded the known free energy of binding of the native receptor-native ligand complex by a factor of two (see, Clackson and Wells, 1995, supra). Clackson and Wells concluded that the excess measured binding energy was due to mutant-induced structural changes.
Thus, it would be highly desirable to have available methods for identifying hot-spot amino acid residues for receptor-ligand pairs which do not suffer from the above-described limitations. In particular, it would be highly desirable to have available methods for identifying receptor-ligand hot-spot residues which are fast and inexpensive, and which further permit the identification of hot-spot residues for a native receptor-ligand pair without having to mutate either the receptor or ligand.
Small compounds that bind to, and thereby antagonize (antagonists) or activate (agonists) receptors of therapeutic importance are of considerable commercial value. However, to date the ability to rapidly and easily identify such small compounds, particularly those that are able to interact with protein receptors that have large polypeptide or protein ligands, is limited. In most instances, such compounds are obtained through the laborious process of rational drug design, which usually requires detailed knowledge about the three-dimensional structure of the receptor-ligand co-complex or other structure-function information. Rational drug design generally involves designing compounds based on available structure-function information on a compound-by-compound basis and screening the individual compounds in biological assays to identify those compounds which produce a desired biological activity. Usually, detailed structure-function or three-dimensional structural information for the receptor-compound complex is obtained so that the design hypotheses can be verified.
Typically, the detailed structure-function information necessary to design and assess the compounds is obtained from NMR or x-ray crystallographic studies with the receptor-ligand co-complex. Both of these technologies require large quantities of pure co-complex, expensive, specialized equipment and highly skilled technicians, making such structural information extremely costly and time consuming to obtain. In addition, while the detailed structural information obtained from these methods can oftentimes identify those residues of each member of the binding pair that are involved in the binding interaction, neither of these methods has been used to identify the individual residues which make significant contributions to the overall free energy of binding of the receptor-ligand complex; in fact, x-ray crystallography simply cannot generate this detailed information. Moreover, to date, rational drug design methodologies have proven inadequate for designing receptor agonists, which are an increasingly important class of pharmaceutical targets. Induction of receptor activation by agonists requires more than simple receptor binding. Agonist activity usually results from receptor conformational change or receptor subunit oligomerization that is induced by ligand binding. Agonist ligands must be capable of binding to precisely the correct subregion(s) of the receptor and in the correct manner required for induction of these receptor changes. When agonists appropriately bind to receptors, a portion of the binding energy is employed to produce these receptor structural changes. Presently available structure determination technologies (x-ray crystallography, NMR) have great difficulty in identifying and localizing the parts of the receptor that participate in such ligand-induced structural changes, and in identifying the mechanisms by which they are induced by agonist binding.
Recently, "non-rational" combinatorial library methodologies have been developed, in part to ameliorate the drawbacks of rational drug design. In these non-rational methods, large libraries of compounds are synthesized and screened for the ability to bind a target receptor of interest. The structures of the identified compounds are elucidated and used as lead compounds for subsequent rounds of library synthesis and screening or as lead structures for rational drug design.
While these methods have the allure of being able to randomly or semi-rationally sample large sectors of "conformation space" in a relatively short period of time, they suffer from many drawbacks. For example, in order to identify a lead compound from a purely random library, millions, and oftentimes trillions, of compounds must be synthesized and screened. Thus, while theoretically pharmaceutical leads can be identified from a purely random library, in practice the library methods are merged with rational design methods to create "focused libraries" that have a higher probability of containing a suitable lead structure. Thus, even these library methodologies rely, to some extent, on structure-function information.
Moreover, while compounds which bind a target receptor can be identified, no information is provided about where on the receptor the compound binds. Whether the compound binds the receptor at or near the same site as the receptor's natural ligand is simply unknown. Combinatorial library methodologies are also inadequate for identifying receptor agonists for substantially the same reasons discussed above.
The ability to identify where a candidate compound of interest binds a target receptor of interest without having to obtain expensive and laborious high resolution three-dimensional structures of the receptor, ligand or receptor-ligand co-complex would be highly desirable. Of particular interest would be having the ability to identify candidate compounds which bind a receptor at the same site as the receptor's natural ligand or to identify compounds that act as receptor agonists. Current methods for achieving these goals are inadequate. Accordingly, these are objects of the present invention.
These and other shortcomings in the art are overcome by the present invention, which in one aspect provides a method for identifying hot-spot residues for one or both members of a receptor-ligand pair of interest. In the method of the invention, the rates of exchange between individual hydrogens of one or both members of the receptor-ligand pair and solvent hydrogens are determined for the member in both the bound and unbound states. The individual hydrogens are correlated to specific amino acid residues within the primary sequence of the member, thereby providing hydrogen exchange rates for hydrogens on specifically identified amino acid residues comprising the member. From these exchange rates, those residues which individually contribute at least about 10-20% of the overall free energy of binding of the receptor-ligand complex are identified as constituting the "hot-spot" residues for that member of the receptor-ligand pair.
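By way of non-limiting illustration, the selection step described above can be sketched in Python. The sketch assumes, purely for illustration, that the contribution of a residue to binding is estimated as RT ln(k_unbound/k_bound) from the exchange rates measured in the unbound and bound states. That conversion, the function name, the residue labels and the numerical values are hypothetical and are not taken from the description above. The default cutoff of 0.10 corresponds to the lower end of the "about 10-20%" range.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def find_hot_spots(rates, total_binding_energy, temperature_k=273.15,
                   threshold=0.10):
    """Return residues whose estimated contribution meets the threshold.

    rates: dict mapping a residue label to (k_unbound, k_bound), the
        exchange rates of that residue's hydrogen in the unbound and
        bound states (same units for both).
    total_binding_energy: overall free energy of binding in kcal/mol;
        only its magnitude is used.
    threshold: minimum fraction of the overall free energy of binding.
    """
    hot = []
    for residue, (k_unbound, k_bound) in rates.items():
        if k_unbound <= 0 or k_bound <= 0:
            raise ValueError("exchange rates must be positive")
        # ASSUMPTION (not from the text): per-residue energy is taken
        # as RT * ln(k_unbound / k_bound).
        energy = R_KCAL * temperature_k * math.log(k_unbound / k_bound)
        if energy / abs(total_binding_energy) >= threshold:
            hot.append(residue)
    return sorted(hot)


# Hypothetical residues and rates, for illustration only.
example_rates = {"R43": (1.0, 0.001), "E44": (1.0, 0.9)}
print(find_hot_spots(example_rates, total_binding_energy=-12.0))
```

In this sketch only the residue whose exchange is slowed a thousand-fold on binding passes the cutoff; a residue whose rate is essentially unchanged does not.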
The method can be used with or without the aid of three-dimensional structure or other structure-function information about the particular receptor, ligand or receptor-ligand pair. The only requirement is that the primary amino acid sequence of the binding member of interest be known.
Furthermore, in general, the method does not preclude application to finding hot-spot residues in a non-native, or mutated receptor. Such an approach could find utility in, for example, establishing structure-function relationships, particularly because mutants often have unpredictable conformational changes with respect to the native receptor.
The methods described herein provide significant advantages over currently available methodologies for identifying receptor and/or ligand hot-spot residues. For example, according to the methods of the invention, hot-spot residues can be simply, rapidly and inexpensively identified for each member of a native receptor-ligand complex. Thus, unlike the methods described in the art, which require mutation of the receptor and/or ligand, the method of the invention permits, for the first time, reliable identification of hot-spot residues of receptors and/or ligands in their truly native, biologically relevant conformations.
In another aspect, the invention provides a method for identifying compounds that interact with a target receptor of interest in a manner similar to a known ligand for that receptor; i.e., the invention provides a method of identifying compounds which act as binding mimics of known receptor ligands. According to the method of the invention, receptor hot-spot residues for a receptor-ligand pair of interest are first identified. Candidate compounds are then screened with the receptor to identify those candidate compounds that interact with at least one, and preferably at least a majority or even more, of the receptor hot-spot residues. Those candidate compounds that interact with at least one receptor hot-spot residue are selected as small molecule binding "mimics" of the known ligand.
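The selection rule just described can be illustrated with a short Python sketch. It assumes that, for each candidate compound, the set of receptor residues with which the compound interacts has already been determined by some means; the function name, compound names and residue labels are hypothetical. A compound is selected if it interacts with at least one hot-spot residue, and the selected compounds are ranked by the fraction of hot-spot residues they cover, with a flag marking those that cover a majority.

```python
def select_mimics(hot_spots, contacts_by_compound):
    """Select and rank candidate compounds by hot-spot coverage.

    hot_spots: iterable of receptor hot-spot residue labels.
    contacts_by_compound: dict mapping a compound name to the residues
        it was found to interact with.
    Returns a list of (name, fraction_covered, covers_majority) tuples,
    best coverage first, ties broken by name.
    """
    hot = set(hot_spots)
    if not hot:
        raise ValueError("at least one hot-spot residue is required")
    selected = []
    for name, contacted in contacts_by_compound.items():
        shared = hot & set(contacted)
        if shared:  # interacts with at least one hot-spot residue
            fraction = len(shared) / len(hot)
            selected.append((name, fraction, len(shared) > len(hot) / 2))
    selected.sort(key=lambda item: (-item[1], item[0]))
    return selected


# Hypothetical data, for illustration only.
hot = {"R43", "E44", "W104", "W169"}
candidates = {
    "compound_A": {"R43", "W104", "W169"},
    "compound_B": {"E44"},
    "compound_C": {"K10"},
}
print(select_mimics(hot, candidates))
```

Ranking by coverage is one possible way to express the stated preference for compounds that interact with a majority of the hot-spot residues; other orderings would serve equally well.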
For any given receptor-ligand pair, the receptor hot-spot residues can be determined by any of several techniques, including, for example, the methods described herein, hydrogen exchange with heavy hydrogen (deuterium or tritium) in conjunction with nuclear magnetic resonance (NMR) or mass spectroscopy, tritium exchange in conjunction with radioactivity measurements and in conjunction with site-directed mutagenesis. The precise binding interactions between the receptor and candidate compounds can also be characterized by any of these methods to identify compounds that "mimic" the essential receptor binding activity of the known ligand.
Each candidate compound may be first assayed for a particular biological activity of interest, such as the ability to bind, antagonize or agonize the receptor, to identify those candidate compounds which exhibit the desired biological activity, and then the active candidate compounds further screened to identify those that interact with a receptor hot-spot residue. Alternatively, the candidate compounds may be directly screened to identify those that interact with a receptor hot-spot residue without first assaying for biological activity.
Selection of the initial candidate compounds to be screened can be guided by a structural model of the ligand or the receptor-ligand complex, by available structure-function information for the receptor-ligand complex, or by completely random processes. The initial candidate compounds can be synthesized individually or can be synthesized using any of the well-known combinatorial library methodologies. Candidate compounds, and in particular combinatorial libraries of candidate compounds, that are first screened for biological activity, particularly binding activity, can be screened in pools in a batch-wise fashion using any of the numerous screening methods described in the art to identify biologically active compounds for further analysis according to the invention.
Ideally, the method of the invention is used iteratively to identify ligand mimics. In the iterative method, the results of one round of screening guide the selection of candidate compounds for the next round of screening. For example, the structures of candidate compounds identified in a first round of hot-spot screening can be determined and these structures used to guide the synthesis of new or modified candidate compounds. After several rounds of synthesis and screening, compounds are identified that precisely mimic the known ligand; i.e., compounds are identified that precisely interact with those receptor hot-spot residues contacted by the known ligand.
The methods of the invention for identifying compounds which bind a receptor provide significant advantages over currently available techniques. Because the preferred methods utilize sensitive hydrogens on the amino acid side-chain or backbone as reporters of local changes in environment, the methods are able to identify compounds that interact with a target receptor of interest at the most biologically significant residues: those receptor residues that contribute most to binding the receptor's natural ligand. Quite significantly, the methods do not require detailed structure-function or three-dimensional structural information for the receptor, ligand, receptor-ligand co-complex or receptor-compound co-complex, though interpretation of results is enhanced by the availability of such information. The only requirement is that the primary amino acid sequence of the receptor of interest be known. Thus, the methods of the invention permit the identification of "ligand mimics", particularly small compound ligand mimics of the essential binding function of the natural ligand, with unprecedented ease.
Moreover, the methods provide insights into the nature of the ligand-binding induced receptor structural changes required for agonist activity. This, combined with the ability to focus small molecule design to known ligand-defined hot-spots, allows ready identification of compounds that function as receptor agonists, a feat that is simply unprecedented in the art.
Thus, in a final aspect the invention provides methods of identifying compounds, particularly small organic compounds, which act as agonists for a target receptor of interest. According to the method of the invention, hydrogen exchange techniques as previously described are used to identify three distinct classes of receptor amino acid residues for a receptor-agonist pair of interest:
(i) those that reside in the receptor-agonist binding surface;
(ii) those that participate in conformational changes as a consequence of receptor-agonist binding; and
(iii) those that participate in binding-induced receptor sub-unit oligomerization.
The receptor is then screened with candidate compounds using the hydrogen exchange techniques described herein to identify those which interact with at least one receptor amino acid residue from class (ii) or (iii), above. The candidate compound is considered to interact with one of these receptor residues if the receptor residue has a protection factor of at least one upon formation of a receptor-candidate compound complex. The candidate compounds may be screened for biological activity prior to screening in the methods of the invention, as previously described.
The methods may be used iteratively in a manner similar to that described above to identify compounds which interact with a majority of the receptor amino acid residues of class (ii) or (iii).
As used herein, the following terms shall have the following meanings:
"Ligand:" refers to any molecule which is capable of binding a receptor. The ligand may be a small organic compound, or a large molecule such as a polypeptide or protein.
"Receptor:" refers to any molecule capable of binding a ligand. Thus, as used herein receptor refers not only to molecules generally recognized as belonging to the class of binding molecules designated as "receptors," such as, for example, receptors for cytokines, growth factors, chemokines, hormone receptors, adhesion receptors, apoptosis receptors, etc., but is also intended to include any molecule which can bind another molecule, including for example, an antibody or an enzyme.
The receptor may be identical in primary amino acid sequence to a naturally occurring receptor, or it may be a functionally active fragment, mutant or derivative thereof. The fragment, mutant or derivative may have the same or different binding or other biological characteristics relative to the parental protein.
Receptors of particular interest include integral membrane proteins, as they are difficult to crystallize for study by X-ray diffraction. Receptors too large to study by NMR methods, e.g., those larger than about 50 kDa, are also of special interest, particularly if they cannot be characterized as a composite of two or more separately analyzable domains. Examples of receptors of particular interest include integrins (which are large integral membrane proteins), cell surface receptors for growth factors (including cytokine receptors), "seven-spanners," selectin, and cell surface receptors of the immunoglobulin superfamily (e.g., ICAM-1).
As will be appreciated by those of skill in the art, the designation of "receptor" and "ligand" is somewhat arbitrary. Typically the term "receptor" is conferred upon the larger of the two binding partners, or upon the member of the binding pair which is a protein. For purposes of the present invention, the expressions "receptor" and "ligand" do not carry such limitations; they merely each refer to one member of a pair of binding molecules.
"Protein:" as used herein includes, mutatis mutandis, polypeptides, oligopeptides and derivatives thereof, including, by way of example and not limitation, glycoproteins, lipoproteins, phosphoproteins and metalloproteins. The essential requirement for a molecule to be considered a protein is that it feature one or more peptide (-HNC(O)-) bonds, as the amide hydrogen of the peptide bond (as well as those on the side chains of certain amino acid residues) has certain properties which permit it to be analyzed by hydrogen exchange.
"Agonist:" as used herein refers to a ligand that, upon binding to a receptor, triggers activation of a chemical signaling cascade that results in a definable change in the behavior or physical or biological state of a cell.
"Antagonist:" as used herein refers to a molecule that, by virtue of binding to a receptor or agonist, is able to block the cell-activating influence of the agonist, and which itself does not result in substantial activation of the cell.
"Labile Hydrogen:" is a general description of any hydrogen that may be exchanged with a solvent hydrogen under the conditions of the exchange experiment. One category of such hydrogens is the amide hydrogens found on the protein or peptide backbone. Exchange of these hydrogens, as described herein, is referred to as "amide hydrogen exchange". Another category comprises hydrogens bound to any carbon on the amino acids comprising the protein, including backbone and side chains that may temporarily be rendered labile under specific experimental conditions. Exchange of these hydrogens, as described herein, is referred to as "alkyl hydrogen exchange."
"Conditions of Slowed Hydrogen Exchange:" refers to conditions in which the rate of labile hydrogen exchange at residue atoms freely exposed to solvent is reduced substantially, i.e., enough to perform the various manipulations necessary to handle the samples without undue or unacceptable loss of solvent-exposed labile hydrogens.
Conditions of slowed alkyl hydrogen exchange are easily, almost trivially achieved as alkyl exchange occurs only when particular experimental conditions temporarily induce it.
Amide hydrogen exchange, however, is always occurring, to some extent, with a rate that is a function of factors including temperature, pH and solvent composition. The rate is decreased three-fold for each 10° C. drop in temperature. Hence use of temperatures close to 0° C. is preferred. In water, the minimum amide hydrogen exchange rate is at a pH of 2-3. As conditions diverge from the optimum pH, the amide hydrogen exchange rate increases, typically by 10-fold per pH unit increase or decrease away from the minimum. Use of high concentrations of a polar, organic cosolvent shifts the pH minimum to higher pH, potentially as high as pH 6 and perhaps, with the right solvent, even higher.
By way of example, at pH 2.7 and 0° C., the typical half-life of a tritium or deuterium label at an amide position freely exposed to solvent water is about 70 minutes. Preferably, the slowed conditions of the present invention result in a half-life for a freely exposed amide hydrogen of at least 10 minutes, more preferably at least 60 minutes.
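The rules of thumb given above can be combined into a rough estimate of the half-life of a freely exposed amide label in water, as in the following Python sketch. The sketch is an approximation under stated assumptions: it treats pH 2.7 as the pH of minimum exchange (the text gives the minimum only as pH 2-3), applies the three-fold-per-10° C. and 10-fold-per-pH-unit factors multiplicatively, and ignores any cosolvent effect. The function names and parameter names are hypothetical.

```python
def estimated_half_life_min(temp_c, ph, ref_half_life_min=70.0,
                            ref_temp_c=0.0, ph_min=2.7):
    """Estimate the half-life (minutes) of a freely exposed amide label.

    Reference point from the text: about 70 minutes at pH 2.7, 0 deg C.
    ASSUMPTIONS: pH 2.7 is taken as the exchange-rate minimum, and the
    temperature and pH factors are simply multiplied together.
    """
    # Rate changes three-fold for each 10 deg C change in temperature.
    temp_factor = 3.0 ** ((temp_c - ref_temp_c) / 10.0)
    # Rate rises about 10-fold per pH unit away from the minimum.
    ph_factor = 10.0 ** abs(ph - ph_min)
    return ref_half_life_min / (temp_factor * ph_factor)


def meets_slowed_exchange_preference(temp_c, ph, minimum_min=10.0):
    """True if the estimated half-life is at least `minimum_min` minutes."""
    return estimated_half_life_min(temp_c, ph) >= minimum_min


for temp_c, ph in [(0.0, 2.7), (10.0, 2.7), (0.0, 3.7)]:
    half_life = estimated_half_life_min(temp_c, ph)
    print(f"{temp_c:4.1f} deg C, pH {ph}: about {half_life:.1f} min")
```

Under these assumptions, raising the temperature by 10° C. or moving one pH unit from the minimum shortens the estimated half-life from about 70 minutes to roughly 23 minutes or 7 minutes, respectively, the latter falling below the preferred 10-minute minimum.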