1. Field of the Invention
The invention generally relates to the analysis of ligand binding to a macromolecule. In particular, the invention provides a method for tracing the path of a ligand binding signal through the three-dimensional structure of a macromolecule by determining the stability of the residues of the macromolecule in the presence and absence of the ligand.
2. Background of the Invention
The process of developing new drugs has been revolutionized by the advent of molecular biology and sophisticated computer technology. Approaches now focus on molecular modeling of virtual molecules in an attempt to predict appropriate drug candidates, the use of combinatorial chemistry to generate a large number of potential compounds, and high throughput screening techniques to identify which of the large number of compounds display affinity for the target molecule. Currently, the timeline for the development of a new drug is 12 to 15 years. Statistics show that, for each new drug that actually goes to market, approximately 5,000 compounds are initially screened, approximately 5 are eventually deemed suitable for human clinical trials, and only one of the five will gain FDA approval. The entire process can cost as much as $500 million, and candidates which fail late in the development process incur huge, unrecoverable expenses. Therefore, methodologies which enhance the ability of investigators to predict which compounds are likely to possess specific desired characteristics and elicit the desired effects in target molecules are much sought after in the industry. It is especially desirable if the methodologies can be applied early in the developmental process and thus avoid fruitless developmental endeavors.
Many drug candidates are small molecules which bind to proteins and regulate their function. Frequently, the proteins, especially enzymes, possess allosteric regulatory sites located relatively distant from their active site. The initial interaction of a ligand (e.g. an inhibitor, hormone, substrate, agonist, etc.) may occur at the local, regulatory binding site and typically involves only a few residues. However, the effects of binding are often propagated to the remote active site in the protein and may ultimately involve many residues. Ligand binding to the regulatory site may ultimately activate or inhibit the protein by affecting the ability of the distal site to be binding competent towards an interacting partner in, for example, a signaling cascade. Binding of the ligand to the regulatory site may stabilize or destabilize the distal binding site, therefore affecting an entire molecular or cellular pathway. The chain of events initiated by initial ligand binding thus provides the basis for fundamental biological phenomena such as allosteric regulation, signal transduction and structural stability modification. Whatever the functional expression of the interaction, a necessary condition for proper functioning of the regulatory switch is the coupling (i.e. the atomic "wiring") between the regulatory and active sites.
While much progress has been made with respect to the characterization of ligand binding events, it is still not possible to predict with certainty the outcome of binding at the molecular level. Standard methods of analysis include protein computational analysis, such as molecular mechanics or molecular dynamics. These methods are typically designed to identify a single conformation of a molecule (or molecule-ligand complex) that is predicted to be the most energetically favorable and therefore the most likely to represent the true conformation of the molecule or complex. The methods are generally based on the analysis of only one molecule or complex at a time in the calculations. By performing bond rotations over different dihedral angles, these methods generate a large number of conformations, usually in a sequential manner. The energy of each conformation is computed by using different search or minimization algorithms, and the conformation with the lowest energy is identified.
However, during the past decade, it has become evident that this approach to the study of molecular conformation is inadequate. This is because, in reality, a protein (or other molecule) actually exists in the native state as an ensemble of many thermodynamically available conformational states of varying degrees of population, rather than as a single discrete state. This is in part because the energy of stabilization of the structure of a protein is not evenly distributed throughout the molecule. Proteins are instead characterized by the occurrence of multiple independent local folding/unfolding events, i.e. proteins lack global cooperativity. The degree of population of any given conformational state of the many which are possible is governed by statistical thermodynamics: those states that are the most thermodynamically (energetically) favorable are the most populated. The probability that a protein (or a part of a protein) occupies a given conformation is determined by the Gibbs energy difference between conformations, i.e. the frequency with which a given molecule will "visit" a given conformation is dictated by how energetically favorable the conformation is. More energetically favorable conformations are visited more frequently. The entire system exists in a state of dynamic equilibrium with individual molecules depopulating and repopulating all available conformations. At any one time, the statistical distribution of all molecules within the mix will also accord with the Gibbs energy difference between conformations, in that the more energetically favorable conformational states will be more highly populated.
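The relationship between Gibbs energy and state population described above can be illustrated with a minimal Python sketch. It is not taken from the specification: the function name `state_probabilities`, the choice of kJ/mol for the energies, the default temperature of 298.15 K, and the three example energies are all assumptions made for illustration. Each energy is taken relative to a common reference state, and a lower value means a more favorable conformation.

```python
import math

# Gas constant in kJ/(mol*K); the unit choice is an assumption of this sketch.
R_KJ = 8.314462618e-3


def state_probabilities(delta_g, temperature=298.15):
    """Return the Boltzmann population of each state.

    delta_g: Gibbs energies (kJ/mol) of the states, all relative to the
             same reference state.
    """
    rt = R_KJ * temperature
    # Shift by the minimum energy so the exponentials cannot overflow;
    # the shift cancels out when the weights are normalized.
    g_min = min(delta_g)
    weights = [math.exp(-(g - g_min) / rt) for g in delta_g]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-state ensemble: the lowest-energy state dominates,
# but the higher-energy states remain populated to a smaller degree.
example_populations = state_probabilities([0.0, 5.0, 20.0])
```

The populations always sum to one, and two states of equal Gibbs energy are equally populated, which is the statistical behavior the paragraph above describes.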
This realization, while gratifying, has also complicated the thermodynamic analysis of protein-ligand binding events. Conventional methodology which does not take into account the statistical distribution of molecular conformations is clearly inadequate. It would be highly advantageous to have available methods for the predictive analysis of ligand binding which are based on the thermodynamic assessment of the multiple conformational states of a molecule. Such methods would facilitate the design and selection of promising drug candidates at an early stage of the drug development process.
It is an object of this invention to provide a computer assisted computational method for creating and displaying a model of a molecule in which, in the model, the residues of the molecule that are affected by the binding of a ligand of interest to the molecule are highlighted. Highlighting of the affected residues permits the visualization of the path of propagation of the binding signal throughout the structure of the molecule.
The method involves the input of the three-dimensional coordinates of the molecule into the programmed computer, generating an ensemble of about 20,000 to 200,000 (or more) partially folded conformational states of the molecule, and determining the Gibbs energy (ΔG) of each conformational state. The binding competent conformational states are then identified and the ΔG values for those states are modified using the equation
\[
\Delta G_i = \Delta G_i^0 - RT \ln \frac{1 + K_{a,i}[X]}{1 + K_{a,0}[X]}
\]
where $\Delta G_i^0$ is the Gibbs energy in the absence of the ligand X (i.e. the Gibbs energy of the state as calculated in Equation 2); $K_{a,0}$ is the binding constant of the ligand to a reference (in this case, the native or template conformation); and $K_{a,i}$ is the binding constant of the ligand to a given binding competent state i.
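The ligand correction to the Gibbs energy of a binding competent state can be written as a short Python sketch. The function name `ligand_adjusted_delta_g`, the argument names, and the units (kJ/mol for energies, M for the ligand concentration, 1/M for the binding constants, 298.15 K as the default temperature) are assumptions for illustration and are not specified in the text.

```python
import math

# Gas constant in kJ/(mol*K); the unit choice is an assumption of this sketch.
R_KJ = 8.314462618e-3


def ligand_adjusted_delta_g(delta_g0, ka_i, ka_ref, ligand_conc,
                            temperature=298.15):
    """Gibbs energy of state i in the presence of ligand X.

    delta_g0:    Gibbs energy of state i without ligand (kJ/mol).
    ka_i:        binding constant of the ligand to state i (1/M).
    ka_ref:      binding constant of the ligand to the reference state (1/M).
    ligand_conc: free ligand concentration [X] (M).
    """
    rt = R_KJ * temperature
    ratio = (1.0 + ka_i * ligand_conc) / (1.0 + ka_ref * ligand_conc)
    return delta_g0 - rt * math.log(ratio)
```

Two limiting cases follow directly from the equation: at zero ligand concentration the Gibbs energy is unchanged, and a state that binds the ligand exactly as tightly as the reference also keeps its original Gibbs energy, because the logarithm of the ratio is zero. A state that binds more tightly than the reference is stabilized relative to it.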
The probability of each state in the absence and in the presence of ligand is then calculated from the Gibbs energy data, and a residue level stability constant (κ) in the absence and in the presence of ligand is calculated for each residue of the molecule. κ values in the absence and in the presence of ligand are compared. Those residues in which κ is different in the presence vs the absence of ligand are those which are affected by the binding of ligand. A single representation of the molecule in which those affected residues are highlighted is generated and displayed. By visually observing the highlighted residues on the displayed molecule, it is possible to trace a path of propagation of the ligand-binding induced signal through the body of the molecule. Because κ values are numerical quantities, it is also possible to quantitate the degree of the effect of ligand binding on the individual residues.
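The comparison of residue level stability constants can be sketched in Python as follows. This passage does not give the formula for the stability constant, so the sketch assumes one plausible definition: for each residue, the summed probability of the states in which that residue is folded divided by the summed probability of the states in which it is not. The function names, the `threshold` parameter, and the four-state, three-residue ensemble with its probabilities are hypothetical illustration values, not data from the specification.

```python
import math


def residue_stability_constants(states, probabilities):
    """Assumed definition: P(residue folded) / P(residue not folded).

    states:        one tuple of booleans per state, True where the
                   residue is folded in that state.
    probabilities: population of each state, in the same order.
    """
    constants = []
    for j in range(len(states[0])):
        p_folded = sum(p for s, p in zip(states, probabilities) if s[j])
        p_unfolded = sum(p for s, p in zip(states, probabilities) if not s[j])
        constants.append(p_folded / p_unfolded if p_unfolded > 0 else math.inf)
    return constants


def affected_residues(free, bound, threshold=0.1):
    """Indices of residues whose constant changes by more than `threshold`
    on a natural-log scale between the ligand-free and ligand-bound sets."""
    hits = []
    for j, (kf, kb) in enumerate(zip(free, bound)):
        if kf == kb:
            continue  # unchanged, including the case where both are infinite
        if math.isinf(kf) or math.isinf(kb) or kf == 0 or kb == 0:
            hits.append(j)  # change too large to express as a finite log ratio
        elif abs(math.log(kb / kf)) > threshold:
            hits.append(j)
    return hits


# Hypothetical ensemble: 4 states, 3 residues.
states = [(True, True, True), (False, True, True),
          (True, True, False), (False, False, False)]
p_free = [0.70, 0.10, 0.10, 0.10]   # assumed populations without ligand
p_bound = [0.70, 0.02, 0.18, 0.10]  # assumed populations with ligand

kappa_free = residue_stability_constants(states, p_free)
kappa_bound = residue_stability_constants(states, p_bound)
highlighted = affected_residues(kappa_free, kappa_bound)
```

In this toy ensemble the first residue is stabilized and the third is destabilized by the shift in populations, while the second is unchanged, so only the first and third would be highlighted on the displayed structure. The signed log ratio for each residue gives a numerical measure of the size of the effect.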