1. Field of the Invention
The present invention relates to providing molecular information, and more particularly to a molecular information providing system, a molecular information providing apparatus, a molecular information providing method, a method for controlling an information processing unit as the molecular information providing apparatus, a program for implementing the method in the information processing unit, a machine-readable storage medium storing the program, and a grid computing support device for computing molecular orbitals, in which an intermediate representation is generated from an atomic arrangement notation so that molecular information can be shared and high-precision information can be provided regardless of the format of the atomic arrangement notation received from a terminal unit.
2. Description of the Related Art
In recent chemical research, molecules having desired characteristics are often designed by using computer-aided quantum chemistry calculation to predict molecular characteristics. A variety of quantum chemistry computation methods for performing the molecular orbital computation are well known, both empirical and non-empirical, including CNDO, CNDO/S, INDO, MINDO, MINDO3, MINDO5, HF, and RHF. The molecular orbital computation generates molecular orbitals from atomic orbitals using the LCAO (Linear Combination of Atomic Orbitals) method: the coefficient matrix of the eigenvalue equation is transformed into diagonal form, with the molecular orbital energies as the diagonal elements, and the molecular orbital energies as eigenvalues and their corresponding eigenvectors are obtained by iterative computation. It is well known that the amount of this iterative computation increases greatly as the number of atoms increases, so that enormous computer resources, such as CPU time and memory, are required.
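For reference, the LCAO expansion and the matrix eigenvalue problem described above can be written in common textbook notation. The symbols below do not appear in the original text and are assumed here for illustration only.

```latex
\psi_i = \sum_{\mu=1}^{n} C_{\mu i}\,\chi_\mu ,
\qquad
\mathbf{F}\,\mathbf{C} = \mathbf{S}\,\mathbf{C}\,\boldsymbol{\varepsilon},
\qquad
\boldsymbol{\varepsilon} = \operatorname{diag}(\varepsilon_1, \ldots, \varepsilon_n)
```

Here \chi_\mu denotes the atomic orbitals, \psi_i the molecular orbitals, \mathbf{C} the coefficient matrix, \mathbf{S} the overlap matrix of the atomic orbitals, and \varepsilon_i the molecular orbital energies. In Hartree-Fock type methods the matrix \mathbf{F} itself depends on \mathbf{C}, which is why the equation is solved by iteration, and diagonalizing an n-by-n matrix costs on the order of n^3 operations, which is one source of the growth in computation as atoms are added.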
Examples of molecular orbital computation software that performs the molecular orbital computation by an empirical or non-empirical method and provides the results to the user include the MOPAC program package, which uses a semi-empirical computation method, and the GAUSSIAN (trademark) structure modeling program package, commercially available from GAUSSIAN, Inc. of Wallingford, Conn. In principle, a molecular orbital program is not limited in the number of atoms as long as hardware resources allow. In practice, however, a semi-empirical molecular orbital computation method such as MOPAC is applied to molecules having a relatively large number of atoms, whereas a non-empirical molecular orbital computation such as GAUSSIAN is often applied to molecules having a smaller number of atoms because of limited hardware resources. The smaller the number of atoms in the molecule, the more users are expected to perform the same computation for the same molecule.
For molecules having a large number of atoms, for which semi-empirical molecular orbital computation such as MOPAC has conventionally been the main approach, a molecule having a specific feature may already have been computed at some site, and knowledge about the desired characteristics of that molecule may have been accumulated on some computer somewhere in the world without being published.
Accordingly, if computation results obtained by molecular orbital computation such as GAUSSIAN or MOPAC are accumulated in a common database, a user can input a molecular structure and retrieve from the database, rapidly and accurately, the data having the same molecular structure as the input. Using such a database makes it possible to provide a more accurate result more rapidly than performing the computation with the limited computer resources of each terminal computer. Thus, if the analysis results of molecular orbital computation are shared, computer resources are saved and computation cost is reduced, making it possible to acquire promptly information such as the molecular structure and electronic structure obtained by molecular orbital computation, reactivity, medicinal effects, side reactions, and electrical, electronic, or optical characteristics. Beyond the molecular orbital method, if molecular data having characteristics associated with the molecular structure, for example in material design or analysis, are shared and retrieved at high precision, the labor of the user is reduced.
Further, the advantage of sharing the information in the database grows as more users gain access to it, typically in a grid computing environment. For example, it is said that half or more of the molecular orbital computation jobs run by users all over the world with the GAUSSIAN program package are substantially duplicated. It is therefore preferable to share computation results already obtained to make more effective use of computer resources.
In computation regarding shape or structure, besides computational chemistry (the molecular orbital method), there are methods in the field of computer graphics that compute a shape as a sequence of points, and such methods may be applied to the computation of molecules. However, in computational chemistry the atoms making up the sequence of points carry an atomic attribute, the atomic number (atomic weight), in addition to positional information, which causes another problem. For example, even if a molecule is asymmetric in shape, so that the shape alone yields no nearly multiple root, nearly multiple roots may still be recognized in the moment of inertia when the moment of inertia is computed from the molecular structure. In computational chemistry, the atomic number is an important value representing the bonds between elements, and it is not appropriate to change the atomic number arbitrarily for the sake of structure comparison.
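The effect of the atomic attribute on the moment of inertia can be illustrated with a minimal sketch. The point set and masses below are hypothetical values chosen for easy arithmetic, not a real molecule, and the function names are assumptions introduced here. Treated as a pure shape (unit weights), the four points have three distinct principal moments; with the assumed masses, two of the principal moments coincide, giving a multiple root.

```python
import math

def inertia_tensor(masses, coords):
    """Inertia tensor about the center of mass for point masses."""
    total = sum(masses)
    com = [sum(m * r[k] for m, r in zip(masses, coords)) / total for k in range(3)]
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, coords):
        d = [r[k] - com[k] for k in range(3)]
        r2 = sum(x * x for x in d)
        for i in range(3):
            for j in range(3):
                tensor[i][j] += m * ((r2 if i == j else 0.0) - d[i] * d[j])
    return tensor

def det3(a):
    """Determinant of a 3x3 matrix."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def sym3_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix in ascending order
    (closed-form trigonometric solution)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return sorted([a[0][0], a[1][1], a[2][2]])
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    r = max(-1.0, min(1.0, det3(b) / 2.0))  # clamp against round-off
    phi = math.acos(r) / 3.0
    e_max = q + 2.0 * p * math.cos(phi)
    e_min = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return sorted([e_min, 3.0 * q - e_max - e_min, e_max])

def principal_moments(masses, coords):
    """Principal moments of inertia (roots of the inertia tensor)."""
    return sym3_eigenvalues(inertia_tensor(masses, coords))

# Hypothetical point set: a rhombus in the xy-plane.
points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, -2.0, 0.0)]
shape_only = principal_moments([1.0, 1.0, 1.0, 1.0], points)
with_masses = principal_moments([1.0, 1.0, 0.25, 0.25], points)
print(shape_only)   # three distinct roots
print(with_masses)  # two roots coincide once the masses are included
```

The eigenvalue routine uses a closed-form solution for symmetric 3x3 matrices so that the sketch needs only the standard library; a numerical library routine would serve equally well.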
In the molecular orbital method, the molecular structure is denoted by an atomic arrangement notation as the general representation of the atomic arrangement, for example H6C6 for benzene, in which pairs of atomic symbol and number of atoms are arranged in order of atomic number. Accordingly, to compare molecular structures for use in molecular orbital computation, it is necessary to find in the database a molecule having the same atomic arrangement notation and a consistent molecular structure. More specifically, it is necessary to compare the coordinate values of each atom in molecules with the same atomic arrangement notation. However, comparing the coordinate values themselves is often meaningless, because the molecular structure can be represented in various input formats and coordinate systems and with a limited number of significant digits. The user acquires the positional coordinates of the atoms making up the molecule by various methods and then transforms them into a coordinate system of the user's choice, in most cases a Cartesian coordinate system, or into the atomic arrangement notation in a Z matrix format as will be described later, in order to perform the computation by the molecular orbital method. It is therefore required to transform the molecular structure into a representation system that is uniquely determined by the physical properties of the molecular structure and does not depend on the input format or coordinate system employed by the user.
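The notation described above can be sketched as follows. This is an illustrative sketch only: the function name and the small atomic number table are assumptions, and writing the count even when it is 1 is a convention chosen here because the text does not specify that case.

```python
from collections import Counter

# Hypothetical lookup table covering only the elements used in the examples.
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8}

def arrangement_notation(symbols):
    """Concatenate symbol/count pairs in ascending order of atomic number."""
    counts = Counter(symbols)
    ordered = sorted(counts, key=lambda s: ATOMIC_NUMBER[s])
    return "".join(f"{s}{counts[s]}" for s in ordered)

benzene = ["C"] * 6 + ["H"] * 6
ethanol = ["C", "C", "O"] + ["H"] * 6
dimethyl_ether = ["C", "O", "C"] + ["H"] * 6

print(arrangement_notation(benzene))  # H6C6
# Two different molecules can share one notation:
print(arrangement_notation(ethanol) == arrangement_notation(dimethyl_ether))  # True
```

Because ethanol and dimethyl ether produce the same string, the notation can only narrow the set of candidates; the structures themselves still have to be compared, as the paragraph above explains.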
For the molecular orbital method described above, there has been an attempt to avoid duplicating the molecular orbital computation for molecules that have already been analyzed, by comparing the input data specifying the molecular structure with the molecular structure data accumulated in the database and returning the stored analysis result. More specifically, the computation data is input interactively, and the molecular structure input as text is compared with the positional coordinates registered as text in the database by determining whether or not the two coincide at the text level.
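The limitation of a text-level comparison can be shown with a small sketch. The record format (one atom per line: symbol, x, y, z) and all function names are assumptions made for illustration; the original does not specify the text format used in the database.

```python
def texts_coincide(record_a, record_b):
    """Text-level match: identical strings after trimming outer whitespace."""
    return record_a.strip() == record_b.strip()

def parse_record(record):
    """Parse 'symbol x y z' lines into (symbol, (x, y, z)) tuples."""
    atoms = []
    for line in record.strip().splitlines():
        symbol, x, y, z = line.split()
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms

def coordinates_coincide(record_a, record_b, tol=1e-6):
    """Numeric match: same symbols and coordinates within a tolerance."""
    a, b = parse_record(record_a), parse_record(record_b)
    if len(a) != len(b):
        return False
    for (sym_a, xyz_a), (sym_b, xyz_b) in zip(a, b):
        if sym_a != sym_b:
            return False
        if any(abs(p - q) > tol for p, q in zip(xyz_a, xyz_b)):
            return False
    return True

stored = "H 0.000 0.000 0.000\nH 0.000 0.000 0.740"
query = "H 0.0 0.0 0.0\nH 0.0 0.0 0.74"    # same geometry, fewer digits
shifted = "H 1.0 0.0 0.0\nH 1.0 0.0 0.74"  # same molecule, translated along x

print(texts_coincide(stored, query))          # False
print(coordinates_coincide(stored, query))    # True
print(coordinates_coincide(stored, shifted))  # False
```

The first pair fails at the text level only because of digit formatting. The translated copy fails even the numeric comparison although it describes the same molecule, which is the dependence on the user's coordinate system that a coordinate-independent representation is meant to remove.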
Although the above retrieval method is well known, a computation result obtained by the molecular orbital method has many kinds of parameters in various combinations. When the residuals of the atomic coordinates in the molecular structure are calculated sequentially on a text basis and the molecule is identified by the sum of the residuals, the determination rests on the total positional deviation of the input. Hence, when a plurality of candidate molecules with the same total positional deviation are selected, it must be decided which structure to select, with round-off error in the computer also a factor. The retrieved results may therefore be presented graphically to the user for determination. However, if the user makes the determination graphically, the precision of selection is degraded and some uncertainty remains in the selection. Therefore, when molecular orbital computation is performed by grid computing, there has been a need for a packaging method that compares molecular structures using a representation system reflecting the molecular structure more clearly than sequential text-based comparison, so as to simplify understanding of the molecular structure and provide information promptly and precisely.
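The ambiguity of selecting by the sum of residuals can be illustrated with a minimal sketch. The coordinates, candidate names, and function names below are hypothetical, and the residual is assumed here to be the per-atom Euclidean distance; the original does not state the exact residual definition.

```python
import math

def residual_sum(query, candidate):
    """Sum over atoms of the distance between query and candidate positions."""
    return sum(
        math.sqrt(sum((q - c) ** 2 for q, c in zip(q_atom, c_atom)))
        for q_atom, c_atom in zip(query, candidate)
    )

def best_candidates(query, candidates, tol=1e-9):
    """Names of all candidates whose residual sum ties for the minimum."""
    scores = {name: residual_sum(query, coords) for name, coords in candidates.items()}
    best = min(scores.values())
    return sorted(name for name, score in scores.items() if abs(score - best) <= tol)

query = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
candidates = {
    "A": [(0.0, 0.0, 0.0), (1.25, 0.0, 0.0)],     # one atom off by 0.25
    "B": [(0.125, 0.0, 0.0), (1.0, 0.125, 0.0)],  # two atoms off by 0.125 each
    "C": [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],      # one atom off by 1.0
}

print(best_candidates(query, candidates))  # ['A', 'B']
```

Candidates A and B differ from the query in different ways yet have the same total deviation, so the sum alone cannot choose between them. The tolerance `tol` is itself a judgment call: with round-off error, structures that should tie may not tie exactly, and structures that should differ may fall within the tolerance.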