The present invention relates to a method for calculating the binding free energy for interactions between biomolecules. More particularly, the present invention relates to a method that employs computational neural networks to discover quantum mechanical features of enzyme active site transition states, as well as quantum mechanical features required for binding of putative enzyme inhibitors. The present method is also applicable to discovering quantum mechanical features required for the binding of a potential ligand to a biological receptor. Computer-readable media may be incorporated with information enabling the method of the present invention to be performed on a general-purpose computer.
Enzymatically catalyzed reactions are characterized by geometric and electrostatic distortions of a substrate molecule into a transition state. The formation and stabilization of these transition states by enzymes are accompanied by rate enhancements on the order of $10^{10}$ to $10^{15}$ relative to the uncatalyzed reaction. It is thought that an enzyme binds the transition state of a substrate molecule more tightly than either the substrate or the product. As a result, chemically stable molecules that mimic the substrate transition state should be potent inhibitors of enzyme activity.
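As a rough worked illustration of the scale involved, the rate enhancement can be converted into an equivalent binding free energy. This assumes the usual thermodynamic-cycle argument, not spelled out here, that the transition state is bound more tightly than the substrate by a factor roughly equal to the rate enhancement. The symbol $E$ for the enhancement factor and the temperature of 298 K are assumptions made only for this illustration.

$$
\Delta\Delta G \approx -RT\ln E, \qquad RT\ln 10 \approx (1.987\times10^{-3}\ \text{kcal mol}^{-1}\,\text{K}^{-1})(298\ \text{K})(2.303) \approx 1.36\ \text{kcal/mol}
$$

$$
E = 10^{10}:\ \Delta\Delta G \approx -13.6\ \text{kcal/mol}, \qquad E = 10^{15}:\ \Delta\Delta G \approx -20.5\ \text{kcal/mol}
$$

Under these assumptions, a perfect transition state mimic could bind on the order of 14 to 20 kcal/mol more favorably than the substrate, which is why such mimics are expected to be potent inhibitors.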
The de novo design of transition state inhibitors requires accurate models of the enzyme-stabilized transition state. Advances in theory and computational chemistry have produced good models of stable molecules and enzymatic transition states from kinetic isotope effect experiments. However, the development of computational and theoretical methods of prediction of the binding constant of a putative inhibitor, prior to synthesis, would facilitate the discovery of novel inhibitors. These methods could be used to search chemical libraries for transition state mimics.
There is a long history of the development of methods to predict the binding of biological agents to enzymes or receptors. The methods generally fall into one of two categories. The first is the use of docking or molecular dynamics studies to investigate the interactions of substrates with a variety of biological molecules, such as enzymes or receptor sites. The second is the use of Quantitative Structure Activity Relations (hereinafter "QSARs"), which usually investigate the properties of potential therapeutic agents in the absence of their biological target. Each of these methods has advantages and disadvantages.
Docking and molecular dynamics studies rest on the idea that physical laws of motion or static interaction can be applied to biological systems to predict the strength of interaction of a substrate with a complex biological molecule. In general, biological macromolecules and their substrates form a system too large for ab initio quantum chemical methods to be used to generate electronic potential energies, and so parametrized classical force fields are employed on which classical mechanics simulations can be run. This is a massive technology with an equally huge literature. A variety of algorithms have been developed which allow efficient integration of Newton's equations. In addition, Monte Carlo methods are of critical importance in this field. There have also been recent advances in mixing quantum and classical mechanics, such as the surface hopping methods in chemical physics. Because even this calculation is challenging for complex systems, static methods, which treat parts of a system as dielectric continua, have been employed. These approaches have been employed in docking studies, in which substrates are virtually oriented and bound to an active site of an enzyme or other biomolecule.
As important as these approaches are, there are still difficulties in applying them to drug design and analysis. First, there are a great many approximations inherent in the development of force field and dielectric continua models. Though these methods have been studied for many years, it is still difficult to know which approximations are appropriate. Second, these calculations are difficult to perform, even when the approximations are appropriately employed, and require significant computer resources. As a result, the use of such docking studies in searches of libraries of potential candidate drugs for interactions with a particular target enzyme is impractical. Third, a structure for the biomolecule of interest is required for these docking studies. If, for example, only a DNA sequence is available, these methods cannot be employed.
The second method for predicting biological activity, QSARs, focuses on the substrate molecule. The concept assumes that there is a database of experimental evidence from which inferences can be drawn as to the effectiveness of other molecules. Specific properties of substrate molecules, such as hydrophobicity, the presence of certain groups, steric parameters, etc., are empirically fit to experimentally determined biological activities. The assumption is that once this fitting is appropriately performed, an accurate prediction of biological activity of a previously untested molecule can be made by examining the molecule for the same properties. These predictions have become quite sophisticated, and the structural parameters used have been expanded to include quantum mechanical features of substrates such as electrostatic potential surfaces. For example, point-by-point comparison of quantum mechanical electrostatic potentials on molecular van der Waals surfaces has been used to predict inhibition constants for transition state inhibitors for the reactions catalyzed by AMP deaminase and AMP nucleosidase. While these approaches are powerful, there are a number of difficulties. First, one must be able to identify a specific feature that determines bioactivity. When multiple features are involved, it is difficult to determine how the interactions of these features affect bioactivity. Second, a single QSAR may not be able to identify bioactivity trends when a number of different mechanisms are present (as where, for example, an enzyme protonates some substrates but not others). Finally, in addition to the practical prediction of bioactivity among libraries of potentially bioactive compounds, it is desirable to develop a theoretical approach that can help identify the features of the candidate molecules that are important, and thereby help to elucidate unknown mechanisms in the bioactivity.
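The empirical fitting step of a QSAR can be illustrated with a minimal sketch, assuming the simplest possible model: a linear least-squares fit of a few descriptors to measured activities. The descriptor values, the activities, and the function names `fit_qsar` and `predict_qsar` are hypothetical and chosen only for illustration; real QSARs may use other descriptors and other functional forms.

```python
import numpy as np

# Hypothetical descriptor table: one row per molecule,
# columns = [hydrophobicity, steric parameter]. Values are synthetic.
descriptors = np.array([
    [0.5, 1.0],
    [1.0, 2.0],
    [1.5, 1.0],
    [2.0, 3.0],
    [2.5, 0.5],
])
# Synthetic "measured" activities generated from a known linear rule,
# so the fit can be checked against it.
activity = 1.0 + 2.0 * descriptors[:, 0] - 0.5 * descriptors[:, 1]


def fit_qsar(x, y):
    """Least-squares fit of activity = c0 + c1*d1 + c2*d2 + ..."""
    design = np.column_stack([np.ones(len(x)), x])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_qsar(coef, x):
    """Predict the activity of an untested molecule from its descriptors."""
    return coef[0] + np.asarray(x) @ coef[1:]


coef = fit_qsar(descriptors, activity)
new_pred = predict_qsar(coef, [1.2, 1.5])
```

The sketch also shows the first difficulty noted above: the fit can only use the descriptors that were chosen in advance, so a feature that was never tabulated cannot influence the prediction.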
The present invention is based on prior work on molecular similarity measures which compare electrostatic potentials on the van der Waals surfaces of two different molecules. Two different molecules having similar electrostatic potentials have been found to have similar binding properties. Therefore, strong electrostatic similarity to an experimentally determined transition state results in strong binding, and a powerful transition state inhibitor. The similarity measure is defined as follows:

$$
S_e = \frac{\sum_{i=1}^{n_A}\sum_{j=1}^{n_B}\epsilon_i^A\,\epsilon_j^B\exp\left(-\alpha r_{ij}^2\right)}{\sum_{i=1}^{n_A}\sum_{j=1}^{n_A}\epsilon_i^A\,\epsilon_j^A\exp\left(-\alpha r_{ij}^2\right)\times\sum_{i=1}^{n_B}\sum_{j=1}^{n_B}\epsilon_i^B\,\epsilon_j^B\exp\left(-\alpha r_{ij}^2\right)}
$$
wherein $\epsilon_i^A$ is an electrostatic potential on molecule A at position i, $\epsilon_i^B$ is an electrostatic potential on molecule B at position i, $r_{ij}$ is a distance between points i and j on a surface, and $\alpha$ is a decay constant employed so that points that are very far apart do not strongly affect the similarity measure. The similarity measures were first applied to transition state inhibitors for the reactions catalyzed by AMP deaminase, adenosine deaminase, and AMP nucleosidase. Transition state structures for each enzyme were obtained by kinetic isotope effect experiments (Kline et al., J. Biol. Chem. 269:22385-22390 (1994); Ehrlich et al., Biochem. 33:8890-8896 (1994)). Electrostatic potentials were then calculated for the transition states, the substrates, and the putative inhibitors. These were obtained using the GAUSSIAN 94 quantum chemistry package (from Gaussian, Inc., Pittsburgh, Pa.). Minimal basis sets (STO-3G) were used in the initial studies, and the results were thereafter confirmed using higher order basis sets. The molecules were oriented with respect to each other to maximize geometric overlap, and the electrostatic similarities were calculated.
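The similarity measure can be sketched numerically as follows. This is a minimal illustration, assuming that each molecule is supplied as an array of surface-point coordinates plus an array of electrostatic potentials at those points, and that the molecules have already been oriented for maximum overlap. The function names `overlap` and `similarity` and the default value of `alpha` are hypothetical. The denominator is taken as the plain product of the two self-overlap sums, following the formula as written; if a square-root normalization were intended, only the last line of `similarity` would change.

```python
import numpy as np


def overlap(eps_a, xyz_a, eps_b, xyz_b, alpha):
    """Double sum over i, j of eps_a[i] * eps_b[j] * exp(-alpha * r_ij**2)."""
    diff = xyz_a[:, None, :] - xyz_b[None, :, :]   # shape (nA, nB, 3)
    r2 = np.sum(diff * diff, axis=-1)              # squared distances r_ij**2
    return float(eps_a @ np.exp(-alpha * r2) @ eps_b)


def similarity(eps_a, xyz_a, eps_b, xyz_b, alpha=1.0):
    """Electrostatic similarity S_e between molecules A and B."""
    cross = overlap(eps_a, xyz_a, eps_b, xyz_b, alpha)
    self_a = overlap(eps_a, xyz_a, eps_a, xyz_a, alpha)
    self_b = overlap(eps_b, xyz_b, eps_b, xyz_b, alpha)
    return cross / (self_a * self_b)


# Tiny hypothetical example: one surface point per molecule at the same place.
xyz = np.array([[0.0, 0.0, 0.0]])
s_ab = similarity(np.array([2.0]), xyz, np.array([3.0]), xyz)
s_ba = similarity(np.array([3.0]), xyz, np.array([2.0]), xyz)
```

Because every surface point enters the sums with the same weight, the measure has no way to emphasize the part of the surface that actually contacts the active site.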
For these simple reactions, the calculated numerical similarity was a reasonable predictor of binding free energy. FIG. 1 shows a plot of experimentally determined binding free energy versus electrostatic similarity (Se) for the AMP nucleosidase reaction and for three transition state inhibitors. The three inhibitors fall reasonably close to a line connecting the points (binding free energy versus Se) for the substrate AMP and for the transition state. Similarly, FIG. 2 shows a plot of experimentally determined binding free energy versus electrostatic similarity for adenosine deaminase. One transition state inhibitor, 1,6-dihydropurine ribonucleoside, is a significant outlier, but the remaining results are quite good.
Despite these encouraging initial results, it soon became apparent that the similarity measure does not accurately predict all cases of inhibitor-enzyme binding. The reason is that Se treats all points on the van der Waals surface equivalently. It is thus quite possible to have a perfect configuration for binding in the region of an inhibitor molecule that interacts with the active site, but have significant differences remote from this site. As a result, the similarity measure would predict weaker binding than the actual binding free energy indicates. Similarly, a molecule that initially looks very different from a transition state inhibitor could be changed electrostatically by its interaction with the enzyme, e.g., by protonation, to a form that might have a high binding free energy. Again, the similarity measure would predict weaker binding than what would be measured experimentally. Furthermore, there are many reasonable algebraic similarity measures that may be applied in each case, and choosing the most appropriate measure would require extensive computational resources.
An artificial neural network is a computer algorithm which, during a training process, can learn features of input patterns and associate these with an output. After the learning phase is completed, the trained network enables the computer to predict an output for a pattern not included in the training process. Neural networks have been used in a small number of cases to study biological activity prediction. For example, Kohonen self-organizing maps have been used to transform the three-dimensional surface of biomolecules to a two-dimensional projection (Gasteiger et al., J. Am. Chem. Soc. 116:4608-4620 (1994)). Similarly, the molecular electrostatic potential at the van der Waals surface has been collapsed onto a series of 12 autocorrelation coefficients, and these were used in a neural network (Wagener et al., J. Am. Chem. Soc. 117:7769-7775 (1995)). In both these cases, potentially useful data were discarded as the three-dimensional surface information was converted to a two-dimensional representation. Neural networks have also been used to predict the mode of action of chemotherapeutic agents (Weinstein et al., Stem Cells 12:13-22 (1994)). Finally, neural networks have been used to predict biological activity from discrete QSAR descriptions of molecular structure (So and Richards, J. Med. Chem. 35:3201-3207 (1992)). However, this approach fails if the correct QSAR is not selected.
It is therefore desirable to have a method that can accurately predict binding free energy for a wide variety of potential inhibitors. It is also desirable to have a method for determination of binding free energy that identifies those regions of a potential inhibitor or other bioactive molecule that are especially important in binding, and thereby helps elucidate unknown binding features. Furthermore, it is desirable to have a method for determining binding free energy that would adjust itself in each case to the form most suited to that particular case.
Accordingly, it is an object of the present invention to overcome the limitations of the prior art.
It is another object of the present invention to provide a method for determining the free energy of binding of a substrate of known structure to an enzyme.
It is another object of the present invention to provide a method for determining the free energy of binding of an inhibitor of known structure to an enzyme.
It is another object of the present invention to provide a method for determining the free energy of binding of a ligand to a receptor.
It is another object of the present invention to provide a computer-readable medium encoded with information that enables a general purpose computer to perform the method of the present invention.
Briefly stated, a new method to analyze and predict the binding energy for enzyme-transition state inhibitor interactions is presented. Computational neural networks are employed to discover quantum mechanical features of transition states and putative inhibitors necessary for binding. The method is able to generate its own relationship between the quantum mechanical structure of the inhibitor and the strength of binding. Feed-forward neural networks with back propagation of error can be trained to recognize the quantum mechanical electrostatic potential at the entire van der Waals surface, rather than a collapsed representation, of a group of training inhibitors and to predict the strength of interactions between the enzyme and a group of novel inhibitors. The experimental results show that the neural networks can predict with quantitative accuracy the binding strength of new inhibitors. The method is in fact able to predict the large binding free energy of the transition state when trained with less tightly bound inhibitors. The present method is also applicable to prediction of the binding free energy of a ligand to a receptor. The application of this approach to the study of transition state inhibitors and ligands would permit evaluation of chemical libraries of potential inhibitory, agonistic, or antagonistic agents. The method is amenable to incorporation in a computer-readable medium accessible by general-purpose computers.
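A feed-forward network trained by back propagation of error can be sketched in a few lines. This is a minimal illustration only, assuming one hidden layer with tanh units, a linear output, a mean-squared-error loss, and plain gradient descent; none of these choices, nor the layer size, learning rate, or epoch count, is taken from the text. The training data are synthetic stand-ins for surface electrostatic potentials and scaled binding free energies, and the names `train_network` and `predict` are hypothetical.

```python
import numpy as np


def train_network(x, y, n_hidden=4, lr=0.02, epochs=3000, seed=0):
    """Train a one-hidden-layer feed-forward network by back propagation."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (x.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.5, n_hidden)
    b2 = 0.0
    losses = []
    for _ in range(epochs):
        # Forward pass: hidden tanh layer, then linear output.
        h = np.tanh(x @ w1 + b1)
        pred = h @ w2 + b2
        err = pred - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: propagate the error gradient layer by layer.
        g_pred = 2.0 * err / len(y)
        g_w2 = h.T @ g_pred
        g_b2 = g_pred.sum()
        g_h = np.outer(g_pred, w2) * (1.0 - h ** 2)  # tanh derivative
        g_w1 = x.T @ g_h
        g_b1 = g_h.sum(axis=0)
        # Gradient-descent update.
        w1 -= lr * g_w1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
    return (w1, b1, w2, b2), losses


def predict(params, x):
    """Predicted (scaled) binding free energy for each row of x."""
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2


# Synthetic stand-in data: 5 training molecules, 8 surface points each.
data_rng = np.random.default_rng(1)
x_train = data_rng.normal(size=(5, 8))
y_train = 0.3 * x_train[:, :3].sum(axis=1)  # hypothetical scaled energies
params, losses = train_network(x_train, y_train)
```

Each input row holds the potential at every retained surface point, so the network receives the full surface rather than a collapsed summary and is free to weight some regions more heavily than others.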
According to an embodiment of the present invention, a method for determining the free energy of binding of a potential ligand to a receptor comprises the steps of: obtaining, for each of two or more actual receptor ligands, at least one of a structure and a free energy of binding to the receptor, such that each of the two or more actual receptor ligands has a known structure and a known free energy of binding to the receptor; orienting the structures of the two or more actual receptor ligands for maximum geometric coincidence with each other; determining an electrostatic potential at each of more than one point on a van der Waals surface of each of the actual receptor ligands; thereafter, mapping each of the electrostatic potentials of each of the actual receptor ligands onto a geometric surface of one of the two or more actual receptor ligands, each of the two or more actual receptor ligands being thereby described by an identical surface geometry but a different electrostatic potential surface, and each of the electrostatic potentials being described by positional information relating the electrostatic potentials to the geometric surface; thereafter, inputting the electrostatic potentials, the positional information, and the known free energy of binding of one of the two or more actual receptor ligands into a neural network; thereafter, training the neural network until the neural network predicts the free energy of binding of the one of the two or more actual receptor ligands; repeating the steps of inputting and training for each of the remaining actual receptor ligands to produce a trained network; thereafter, determining a potential ligand electrostatic potential at each of more than one point on a van der Waals surface of the potential ligand, the potential ligand having a known structure and an unknown free energy of binding to the receptor; orienting the structure of the potential ligand for maximum geometric coincidence with the structures of the two or more actual receptor ligands; thereafter, mapping each of the electrostatic potentials of the potential ligand onto a geometric surface of one of the two or more actual receptor ligands, the potential ligand having a surface geometry identical to that of the two or more actual receptor ligands, but a different electrostatic potential surface, and each of the electrostatic potentials of the potential ligand being described by positional information relating the electrostatic potentials to the geometric surface; thereafter, inputting the electrostatic potentials and the positional information of the electrostatic potentials of the potential ligand into the trained network; and using the trained network to calculate a free energy of binding of the potential ligand to the receptor.
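The mapping step recited above can be sketched as follows. This is a minimal illustration, assuming a nearest-neighbor rule: each point on the chosen reference surface takes the potential of the closest point on the molecule's own van der Waals surface after the molecules have been oriented. The nearest-neighbor rule, the function name `map_to_reference`, and the coordinates and potentials are assumptions for illustration; the text does not specify how the mapping is carried out.

```python
import numpy as np


def map_to_reference(ref_xyz, mol_xyz, mol_eps):
    """Assign to each reference-surface point the potential of the nearest
    point on the molecule's own surface (assumed nearest-neighbor mapping)."""
    diff = ref_xyz[:, None, :] - mol_xyz[None, :, :]  # shape (n_ref, n_mol, 3)
    d2 = np.sum(diff * diff, axis=-1)                 # squared distances
    nearest = np.argmin(d2, axis=1)                   # index of closest point
    return mol_eps[nearest]


# Hypothetical reference surface with two points.
ref_xyz = np.array([[0.0, 0.0, 0.0],
                    [1.0, 0.0, 0.0]])
# Hypothetical molecule with three surface points and their potentials.
mol_xyz = np.array([[0.9, 0.0, 0.0],
                    [0.1, 0.0, 0.0],
                    [5.0, 5.0, 5.0]])
mol_eps = np.array([1.0, 2.0, 3.0])

mapped = map_to_reference(ref_xyz, mol_xyz, mol_eps)
```

After this step every molecule is described by a vector of the same length and in the same point order, so the position of a value in the vector carries the positional information, and the vectors for all training ligands can be stacked row by row as network input.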
According to another embodiment of the present invention, a method for determining the free energy of binding of a potential ligand to a receptor comprises the steps of: obtaining a structure for the potential ligand; orienting structures of two or more actual receptor ligands for the receptor for maximum geometric coincidence with each other, each of the two or more actual receptor ligands having a known structure and a known free energy of binding to the receptor; determining an electrostatic potential at each of more than one point on a van der Waals surface of each of the actual receptor ligands; thereafter, mapping each of the electrostatic potentials of each of the actual receptor ligands onto a geometric surface of one of the two or more actual receptor ligands, each of the two or more actual receptor ligands being thereby described by an identical surface geometry but a different electrostatic potential surface, and each of the electrostatic potentials being described by positional information relating the electrostatic potentials to the geometric surface; thereafter, inputting the electrostatic potentials, the positional information, and the known free energy of binding of one of the two or more actual receptor ligands into a neural network; thereafter, training the neural network until the neural network predicts the free energy of binding of the one of the two or more actual receptor ligands; repeating the steps of inputting and training for each of the remaining actual receptor ligands to produce a trained network; thereafter, determining a potential ligand electrostatic potential at each of more than one point on a van der Waals surface of the potential ligand, the potential ligand having an unknown free energy of binding to the receptor; orienting the structure of the potential ligand for maximum geometric coincidence with the structures of the two or more actual receptor ligands; thereafter, mapping each of the electrostatic potentials of the potential ligand onto a geometric surface of one of the two or more actual receptor ligands, the potential ligand having a surface geometry identical to that of the two or more actual receptor ligands, but a different electrostatic potential surface, and each of the electrostatic potentials of the potential ligand being described by positional information relating the electrostatic potentials to the geometric surface; thereafter, inputting the electrostatic potentials and the positional information of the electrostatic potentials of the potential ligand into the trained network; and using the trained network to calculate a free energy of binding of the potential ligand to the receptor.
According to another embodiment of the present invention, a computer-readable medium comprises computer-readable information, the information capable of interacting with a computer to produce an output, the output being a calculated free energy of binding of a potential ligand to a receptor, the output being calculated by: orienting structures of two or more actual receptor ligands for maximum geometric coincidence with each other, each of the two or more actual receptor ligands having a known structure and a known free energy of binding to the receptor; determining an electrostatic potential at each of more than one point on a van der Waals surface of each of the actual receptor ligands; thereafter, mapping each of the electrostatic potentials of each of the actual receptor ligands onto a geometric surface of one of the two or more actual receptor ligands, each of the two or more actual receptor ligands being thereby described by an identical surface geometry but a different electrostatic potential surface, and each of the electrostatic potentials being described by positional information relating the electrostatic potentials to the geometric surface; thereafter, inputting the electrostatic potentials, the positional information, and the known free energy of binding of one of the two or more actual receptor ligands into a neural network; thereafter, training the neural network until the neural network predicts the free energy of binding of the one of the two or more actual receptor ligands; repeating the steps of inputting and training for each of the remaining actual receptor ligands to produce a trained network; thereafter, determining a potential ligand electrostatic potential at each of more than one point on a van der Waals surface of the potential ligand, the potential ligand having a known structure and an unknown free energy of binding to the receptor; orienting the structure of the potential ligand for maximum geometric coincidence with the structures of the two or more actual receptor ligands; thereafter, mapping each of the electrostatic potentials of the potential ligand onto a geometric surface of one of the two or more actual receptor ligands, the potential ligand having a surface geometry identical to that of the two or more actual receptor ligands, but a different electrostatic potential surface, and each of the electrostatic potentials of the potential ligand being described by positional information relating the electrostatic potentials to the geometric surface; thereafter, inputting the electrostatic potentials and the positional information of the electrostatic potentials of the potential ligand into the trained network; and using the trained network to calculate a free energy of binding of the potential ligand to the receptor.
According to another embodiment of the present invention, a method for determining a free energy of binding of a potential transition-state inhibitor to an enzyme comprises the steps of: obtaining, for each of two or more enzyme substrates or inhibitors, at least one of a structure and a free energy of binding to the enzyme, such that each of the two or more enzyme substrates or inhibitors has a known structure and a known free energy of binding to the enzyme; orienting the structures of the two or more enzyme substrates or inhibitors for maximum geometric coincidence with each other; determining an electrostatic potential at each of more than one point on a van der Waals surface of each of the enzyme substrates or inhibitors; thereafter, mapping each of the electrostatic potentials of each of the enzyme substrates or inhibitors onto a geometric surface of a transition state inhibitor, each of the enzyme substrates or inhibitors being thereby described by an identical surface geometry but a different electrostatic potential surface, and each of the electrostatic potentials being described by positional information relating the electrostatic potentials to the geometric surface of the transition state inhibitor; thereafter, inputting the electrostatic potentials, the positional information, and the known free energy of binding of one of the two or more enzyme substrates or inhibitors into a neural network; thereafter, training the neural network until the neural network predicts the free energy of binding of the one of the two or more enzyme substrates or inhibitors; repeating the steps of inputting and training for each of the remaining enzyme substrates or inhibitors to produce a trained network; thereafter, determining a potential transition-state inhibitor electrostatic potential at each of more than one point on a van der Waals surface of the potential transition-state inhibitor, the potential transition-state inhibitor having a known structure and an unknown free energy of binding to the enzyme; orienting the structure of the potential transition-state inhibitor for maximum geometric coincidence with the structures of the two or more enzyme substrates or inhibitors; thereafter, mapping each of the electrostatic potentials of the potential transition-state inhibitor onto a geometric surface of one of the two or more enzyme substrates or inhibitors, such that the potential transition-state inhibitor has a surface geometry identical to that of the two or more enzyme substrates or inhibitors, but a different electrostatic potential surface, and each of the electrostatic potentials of the potential transition-state inhibitor is described by positional information relating the electrostatic potentials to the geometric surface of the two or more enzyme substrates or inhibitors; thereafter, inputting the electrostatic potentials and the positional information of the electrostatic potentials of the potential transition-state inhibitor into the trained network; and using the trained network to calculate a free energy of binding of the potential transition-state inhibitor to the enzyme.
According to another embodiment of the present invention, a method for determining the free energy of binding of a potential transition-state inhibitor to an enzyme comprises the steps of: obtaining a structure for the potential transition-state inhibitor; orienting structures of two or more enzyme substrates or inhibitors for the enzyme for maximum geometric coincidence with each other, each of the two or more enzyme substrates or inhibitors having a known structure and a known free energy of binding to the enzyme; determining an electrostatic potential at each of more than one point on a van der Waals surface of each of the enzyme substrates or inhibitors; thereafter, mapping each of the electrostatic potentials of each of the enzyme substrates or inhibitors onto a geometric surface of one of the two or more enzyme substrates or inhibitors, each of the two or more enzyme substrates or inhibitors being thereby described by an identical surface geometry but a different electrostatic potential surface, and each of the electrostatic potentials being described by positional information relating the electrostatic potentials to the geometric surface; thereafter, inputting the electrostatic potentials, the positional information, and the known free energy of binding of one of the two or more enzyme substrates or inhibitors into a neural network; thereafter, training the neural network until the neural network predicts the free energy of binding of the one of the two or more enzyme substrates or inhibitors; repeating the steps of inputting and training for each of the remaining enzyme substrates or inhibitors to produce a trained network; thereafter, determining a potential transition-state inhibitor electrostatic potential at each of more than one point on a van der Waals surface of the potential transition-state inhibitor, the potential transition-state inhibitor having an unknown free energy of binding to the enzyme; orienting the structure of the potential transition-state inhibitor for maximum geometric coincidence with the structures of the two or more enzyme substrates or inhibitors; thereafter, mapping each of the electrostatic potentials of the potential transition-state inhibitor onto a geometric surface of one of the two or more enzyme substrates or inhibitors, the potential transition-state inhibitor having a surface geometry identical to that of the two or more enzyme substrates or inhibitors, but a different electrostatic potential surface, and each of the electrostatic potentials of the potential transition-state inhibitor being described by positional information relating the electrostatic potentials to the geometric surface; thereafter, inputting the electrostatic potentials and the positional information of the electrostatic potentials of the potential transition-state inhibitor into the trained network; and using the trained network to calculate a free energy of binding of the potential transition-state inhibitor to the enzyme.
According to another embodiment of the present invention, a computer-readable medium comprises computer-readable information, the information capable of interacting with a computer to produce an output, the output being a calculated free energy of binding of a potential transition-state inhibitor to an enzyme, the output being calculated by: orienting structures of two or more enzyme substrates or inhibitors for maximum geometric coincidence with each other, each of the two or more enzyme substrates or inhibitors having a known structure and a known free energy of binding to the enzyme; determining an electrostatic potential at each of more than one point on a van der Waals surface of each of the enzyme substrates or inhibitors; thereafter, mapping each of the electrostatic potentials of each of the enzyme substrates or inhibitors onto a geometric surface of one of the two or more enzyme substrates or inhibitors, each of the two or more enzyme substrates or inhibitors being thereby described by an identical surface geometry but a different electrostatic potential surface, and each of the electrostatic potentials being described by positional information relating the electrostatic potentials to the geometric surface; thereafter, inputting the electrostatic potentials, the positional information, and the known free energy of binding of one of the two or more enzyme substrates or inhibitors into a neural network; thereafter, training the neural network until the neural network predicts the free energy of binding of the one of the two or more enzyme substrates or inhibitors; repeating the steps of inputting and training for each of the remaining enzyme substrates or inhibitors to produce a trained network; thereafter, determining a potential transition-state inhibitor electrostatic potential at each of more than one point on a van der Waals surface of the potential transition-state inhibitor, the potential transition-state inhibitor having a known structure and an unknown free energy of binding to the enzyme; orienting the structure of the potential transition-state inhibitor for maximum geometric coincidence with the structures of the two or more enzyme substrates or inhibitors; thereafter, mapping each of the electrostatic potentials of the potential transition-state inhibitor onto a geometric surface of one of the two or more enzyme substrates or inhibitors, the potential transition-state inhibitor having a surface geometry identical to that of the two or more enzyme substrates or inhibitors, but a different electrostatic potential surface, and each of the electrostatic potentials of the potential transition-state inhibitor being described by positional information relating the electrostatic potentials to the geometric surface; thereafter, inputting the electrostatic potentials and the positional information of the electrostatic potentials of the potential transition-state inhibitor into the trained network; and using the trained network to calculate a free energy of binding of the potential transition-state inhibitor to the enzyme.
Additional advantages of the present invention will be apparent from the description which follows.