The present invention relates to a method and a program for evaluating shape similarity between two molecules.
Logical molecular design techniques based on three-dimensional quantitative structure-activity relationship (3D-QSAR) analysis and pharmacophore mapping are used to design drug molecules having a desired physiological activity. In these methods, existing drugs are superimposed on each other in a three-dimensional virtual space according to appropriate rules, and statistical processing is performed on the superimposed drugs using the partial least squares (PLS) method with latent variables, a neural network (NN) method, a genetic algorithm (GA), or the like. Features relating the activity to various parameters of the drugs are thus extracted. Graphical display of the results allows visual recognition of the parts of the molecular structure that contribute to the activity (e.g., a functional group or stereo-structure) and gives a cue for molecular design. These methods are also applied to predicting the activity of a newly designed molecule.
Techniques for superimposing molecules are important in 3D-QSAR and pharmacophore searching. Two kinds of methods are conventionally used: improving the degree of superimposition by gradually moving the molecules to be compared so that the root mean square distance between corresponding atoms or functional groups is minimized; and sequentially searching for the best superimposition using an evaluation function (molecular similarity).
However, in the method of superimposing atoms or functional groups, a researcher's subjective view is unavoidable, although the superimposing operation itself takes little time. The method of automatically extracting the functional groups using a computer poses a problem: the selection of the types and the number of functional groups to be superimposed involves software-dependent arbitrariness and the researcher's subjective view. On the other hand, the method using an evaluation function, although ideal as a technique for superimposing molecules, poses another problem in that it requires prolonged calculation time.
For example, a molecular similarity evaluation method, proposed by Carbo et al., of comparing the electron densities of molecules, requires prolonged calculation time. For this reason, another molecular similarity evaluation method using comparison of the molecular electrostatic potential (MEP) or molecular shape is proposed.
In a molecular shape similarity evaluation method that has been proposed by Meyer et al. and disclosed in the Journal of Computer-Aided Molecular Design, 5 (1991), 427-439, "Similarity of Molecular Shape", molecular similarity is evaluated using the following equation:

$$S_{AB} = \frac{C}{(T_A \cdot T_B)^{1/2}}$$

where T_A and T_B represent the volumes of two molecules A and B, respectively, and C represents the volume of the part common to molecules A and B when they are superimposed on each other. When molecules A and B are completely superimposed on each other, S_AB takes its maximum value of +1. When molecules A and B have no common part, S_AB is 0. Each volume is calculated by counting, among grid points generated at regular intervals, the grid points that fall inside the van der Waals radii of each molecule (for T_A and T_B) or of both molecules (for C).
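The grid-counting evaluation described above can be sketched as follows. This is a minimal illustration, not the implementation of Meyer et al.: the function names, the representation of an atom as an `(x, y, z, radius)` tuple, and the bounding-box grid are assumptions made for the example.

```python
import itertools
import math

def inside(point, atoms):
    """Return True if the point lies within the radius of any atom (x, y, z, r)."""
    return any(math.dist(point, atom[:3]) <= atom[3] for atom in atoms)

def grid_similarity(mol_a, mol_b, spacing=0.2):
    """Estimate S_AB = C / sqrt(T_A * T_B) by counting regularly spaced grid points."""
    atoms = mol_a + mol_b
    # Bounding box that encloses the spheres of both molecules (assumed layout).
    lo = [min(a[i] - a[3] for a in atoms) for i in range(3)]
    hi = [max(a[i] + a[3] for a in atoms) for i in range(3)]
    axes = [
        [lo[i] + k * spacing for k in range(int((hi[i] - lo[i]) / spacing) + 1)]
        for i in range(3)
    ]
    t_a = t_b = c = 0
    for point in itertools.product(*axes):
        in_a = inside(point, mol_a)
        in_b = inside(point, mol_b)
        t_a += in_a           # grid points inside molecule A
        t_b += in_b           # grid points inside molecule B
        c += in_a and in_b    # grid points common to both molecules
    return c / math.sqrt(t_a * t_b)
```

Every grid point is tested against every atom, so the cost grows with the inverse cube of the grid spacing; this is the reason a fine grid makes the evaluation slow.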
However, the molecular similarity evaluation using grid points requires prolonged calculation time. Extremely fine grids (typically 0.2 Å separation) are required to obtain accurate results. Therefore, a high-speed computer, such as a workstation, is required to obtain an optimum model of superimposed molecules.
In order to address this problem, Richards et al. have calculated a Gaussian function of the distance between an atom in a template molecule and an atom in a molecule to be compared, and have applied the function to the calculation of molecular similarity. See Perspectives in Drug Discovery and Design, 9/10/11: 321-338, 1998, "Explicit Calculation of 3D Molecular Similarity". This method can evaluate molecular similarity considerably faster than the method by Meyer et al. However, in the calculation of molecular similarity, the atomic distance between every atom in one molecule and every atom in the other molecule must be calculated. Therefore, the calculation still requires considerable time.
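To illustrate why a pairwise Gaussian evaluation still scales with the product of the atom counts, the following sketch sums a Gaussian of the interatomic distance over every atom pair. It is not the formulation of Richards et al.: the functional form `exp(-alpha * d^2)`, the value of `alpha`, the Carbo-style normalization, and all names are assumptions chosen only for the example.

```python
import math

def gaussian_overlap(mol_a, mol_b, alpha=0.5):
    """Sum exp(-alpha * d^2) over every atom pair; cost is len(mol_a) * len(mol_b)."""
    return sum(
        math.exp(-alpha * math.dist(a, b) ** 2)
        for a in mol_a
        for b in mol_b
    )

def gaussian_similarity(mol_a, mol_b, alpha=0.5):
    """Normalize the cross overlap by the self overlaps (assumed Carbo-style index)."""
    o_ab = gaussian_overlap(mol_a, mol_b, alpha)
    o_aa = gaussian_overlap(mol_a, mol_a, alpha)
    o_bb = gaussian_overlap(mol_b, mol_b, alpha)
    return o_ab / math.sqrt(o_aa * o_bb)
```

Here each molecule is a list of `(x, y, z)` coordinates. No grid is needed, but an exponential must still be evaluated for every pair of atoms each time the molecules are moved.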
As mentioned above, the conventional molecular similarity evaluation methods require prolonged calculation time. Therefore, it has been difficult to obtain the molecular similarity repeatedly and to obtain an optimum superimposition model based on the results, using such methods as simplex optimization. The method of superimposing molecules so that one functional group is within a constant distance from the other functional group is effective especially when the pharmacophore is already clear. However, this method does not provide a good superimposition model for small molecules having a small number of functional groups.
The present invention, therefore, provides a method of easily evaluating molecular similarity at a high speed, a method of obtaining an optimum superimposition model by moving molecules so as to maximize the similarity, and a program for executing these methods.
In order to address the problems discussed above, the present invention provides a molecular similarity evaluation method of evaluating the similarity between a first molecule and a second molecule, comprising the steps of:
(a) obtaining an upper threshold and a lower threshold from one of a value specific to an atom included in the first molecule, a value specific to an atom included in the second molecule of which correlation with respect to the atom in the first molecule is to be evaluated, or another value obtained from these values;
(b) calculating the correlation between the atom in the first molecule and the atom in the second molecule, using the upper and lower thresholds; and
(c) evaluating the similarity between the first and second molecules based on the correlation obtained in the step (b).
In accordance with another aspect of the present invention, there is provided a molecular similarity evaluation method of obtaining the correlation between each atom included in a first molecule and each atom included in a second molecule based on the atomic distance therebetween and evaluating the similarity between the first and second molecules based on this correlation. The operation of obtaining the correlation between the atoms includes steps of:
(a) obtaining a value specific to an atom in the first molecule and/or an atom in the second molecule;
(b) obtaining an upper threshold from the specific value obtained in the step (a);
(c) obtaining a lower threshold from the specific value obtained in the step (a);
(d) obtaining a distance between the two atoms; and
(e) evaluating the correlation between the atom in the first molecule and the atom in the second molecule, using the atomic distance obtained in the step (d), the upper threshold, and the lower threshold.
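Steps (a) through (e) can be sketched for a single atom pair as follows. The text does not state how the correlation is derived from the distance and the two thresholds, so the piecewise-linear rule below is an assumption, as are the choice of the first-molecule atom's radius as the specific value, the constants 1.2 and 0.6, and all names.

```python
import math

# Assumed multipliers for deriving the thresholds from the specific value.
UPPER_LIMIT_CONSTANT = 1.2
LOWER_LIMIT_CONSTANT = 0.6

def pair_correlation(atom_1, atom_2):
    """Correlation between two atoms given as (x, y, z, radius) tuples."""
    specific = atom_1[3]                       # step (a): assumed to be atom_1's radius
    upper = UPPER_LIMIT_CONSTANT * specific    # step (b): upper threshold
    lower = LOWER_LIMIT_CONSTANT * specific    # step (c): lower threshold
    distance = math.dist(atom_1[:3], atom_2[:3])  # step (d): atomic distance
    # Step (e), assumed rule: full correlation inside the lower threshold,
    # none beyond the upper threshold, linear in between.
    if distance <= lower:
        return 1.0
    if distance >= upper:
        return 0.0
    return (upper - distance) / (upper - lower)
```

Under this assumed rule only comparisons and one division are needed per pair, with no exponential or grid, and pairs farther apart than the upper threshold contribute nothing.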
In accordance with a further aspect of the present invention, there is provided a method and program of evaluating similarity between a first molecule and a second molecule disposed in a three-dimensional coordinate system, comprising steps of:
(a) resetting a correlation coefficient;
(b) selecting one atom from a plurality of atoms included in the second molecule;
(c) selecting one atom from a plurality of atoms included in the first molecule;
(d) obtaining an upper threshold and a lower threshold from a value specific to both or one of the atoms selected in each of the steps (b) and (c);
(e) obtaining a distance between the atoms selected in each of the steps (b) and (c);
(f) obtaining a correlation coefficient based on the upper and lower thresholds obtained in the step (d), according to the atomic distance obtained in the step (e);
(g) determining whether or not any atom unselected from the plurality of atoms included in the first molecule exists, after completion of the step (f);
(h) when it is determined in the step (g) that any atom unselected from the plurality of atoms included in the first molecule exists, returning to the step (c), selecting another atom unselected from the plurality of atoms in the first molecule, executing the steps (d) through (f), and updating the correlation coefficient; and
(i) when it is determined in the step (g) that any atom unselected from the plurality of atoms included in the first molecule does not exist, returning to step (a), selecting another atom unselected from the plurality of atoms in the second molecule, executing the steps (c) through (h), and adding the correlation coefficient.
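The loop structure of steps (a) through (i) can be sketched as two nested loops. This is an illustration under stated assumptions, not the claimed implementation: "updating" and "adding" the correlation coefficient are both read as accumulation, the thresholds are taken from the first-molecule atom's radius with assumed constants 1.2 and 0.6, and the piecewise-linear `pair_correlation` rule and all names are hypothetical.

```python
import math

def pair_correlation(distance, lower, upper):
    """Assumed rule: 1 inside the lower threshold, 0 beyond the upper, linear between."""
    if distance <= lower:
        return 1.0
    if distance >= upper:
        return 0.0
    return (upper - distance) / (upper - lower)

def similarity_index(first, second, upper_const=1.2, lower_const=0.6):
    """Accumulate pair correlations over atoms given as (x, y, z, radius) tuples."""
    total = 0.0
    for atom_2 in second:                    # steps (b) and (i): next atom of molecule 2
        coefficient = 0.0                    # step (a): reset the correlation coefficient
        for atom_1 in first:                 # steps (c), (g), (h): next atom of molecule 1
            upper = upper_const * atom_1[3]  # step (d): thresholds from the specific value
            lower = lower_const * atom_1[3]
            distance = math.dist(atom_1[:3], atom_2[:3])  # step (e)
            coefficient += pair_correlation(distance, lower, upper)  # steps (f), (h)
        total += coefficient                 # step (i): add the coefficient
    return total
```

For two identical two-atom molecules whose atoms are 3 Å apart, the matching pairs each contribute 1 and the cross pairs contribute 0 under these assumptions, giving an index of 2.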
Preferably, in the step (d), the relative position of the first and second molecules is changed in a direction to improve the similarity index.
In accordance with a still further aspect of the present invention, there is provided a method and program of evaluating similarity of two molecules, including steps of:
(a) obtaining a three-dimensional coordinate of each atom included in first and second molecules;
(b) disposing the first and second molecules in a three-dimensional coordinate system;
(c) obtaining a similarity index of each atom included in the second molecule with respect to each atom included in the first molecule, comprising sub-steps of:
(c-0) resetting a correlation coefficient;
(c-1) selecting one atom from a plurality of atoms included in the second molecule;
(c-2) selecting one atom from a plurality of atoms included in the first molecule;
(c-3) obtaining an upper threshold and a lower threshold from values specific to both or one of the atoms selected in each of the sub-steps (c-1) and (c-2);
(c-4) obtaining a distance between the atoms selected in each of the sub-steps (c-1) and (c-2);
(c-5) obtaining a correlation coefficient according to the atomic distance obtained in the sub-step (c-4), based on the upper and lower thresholds obtained in the sub-step (c-3);
(c-6) determining whether or not any atom unselected from the plurality of atoms included in the first molecule exists, after completion of the sub-step (c-5);
(c-7) when it is determined in the sub-step (c-6) that any atom unselected from the plurality of atoms included in the first molecule exists, returning to the sub-step (c-2), selecting another atom unselected from the plurality of atoms in the first molecule, executing the sub-steps (c-3) through (c-6), and updating the correlation coefficient; and
(c-8) when it is determined in the sub-step (c-6) that any atom unselected from the plurality of atoms included in the first molecule does not exist, returning to the sub-step (c-0), selecting another atom unselected from the plurality of atoms in the second molecule, executing the sub-steps (c-2) through (c-7), and adding the correlation coefficient so as to obtain the similarity index; and
(d) changing the relative position of the first molecule and the second molecule in the three-dimensional coordinate system, after the completion of the step (c).
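Repeating steps (c) and (d) amounts to a search over relative positions. The sketch below uses a simple greedy translation search that keeps any move that raises the similarity index; this search strategy is a stand-in chosen for brevity, not a procedure taken from the text. Rotations are omitted, and the scoring rule, constants, and names are the same hypothetical assumptions as in a piecewise-linear threshold scheme.

```python
import math

def similarity_index(first, second, upper_const=1.2, lower_const=0.6):
    """Step (c), assumed scoring: sum of piecewise-linear pair correlations."""
    total = 0.0
    for x2, y2, z2, _ in second:
        for x1, y1, z1, r1 in first:
            upper, lower = upper_const * r1, lower_const * r1
            d = math.dist((x1, y1, z1), (x2, y2, z2))
            if d <= lower:
                total += 1.0
            elif d < upper:
                total += (upper - d) / (upper - lower)
    return total

def translate(mol, shift):
    """Return a copy of the molecule moved by the (dx, dy, dz) shift."""
    dx, dy, dz = shift
    return [(x + dx, y + dy, z + dz, r) for x, y, z, r in mol]

def superimpose(first, second, step=0.5, min_step=0.01):
    """Step (d), assumed search: move the second molecule while the index improves."""
    best = similarity_index(first, second)
    moves = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while step >= min_step:
        improved = False
        for move in moves:
            trial = translate(second, tuple(step * c for c in move))
            score = similarity_index(first, trial)
            if score > best:          # keep only moves that raise the index
                second, best, improved = trial, score, True
        if not improved:
            step /= 2                 # refine the step when no move helps
    return second, best
```

Because the index is re-evaluated for every trial position, the cost of one evaluation directly limits how many positions can be explored, which is why a cheap per-pair test matters for this kind of search.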
The atomic distance obtained in each of the above-mentioned aspects can be a distance along one of the axes of the Cartesian coordinate system.
For the value specific to an atom, one of the van der Waals radius, atomic radius, and covalent radius of the atom can be used.
Preferably, the upper threshold is obtained by multiplying a value specific to the atom by an upper limit constant, and the lower threshold is obtained by multiplying a value specific to the atom by a lower limit constant. Preferably, the upper limit constant is a value ranging from 1.0 to 1.3 and the lower limit constant is a value ranging from 0.5 to 0.8.