The atoms that make up a protein gradually change positions, which means that the structure (or “conformation”) of a protein changes over time. A technique used for computer simulation of such structural changes in a protein is called “molecular dynamics” (MD) simulation.
In an MD simulation, the forces applied to the respective atoms in an initial arrangement by the other atoms are calculated, and the movements of the atoms subjected to such forces are calculated based on Newton's equation of motion. By doing so, the arrangement of the atoms after a certain time has elapsed from the initial arrangement is calculated. By repeating such calculation using a computer, it is possible to reproduce structural changes of a protein, which is useful, for example, in functional analysis of proteins.
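The repeated force-and-motion calculation described above can be illustrated with a minimal sketch. It assumes a toy system of two atoms on one axis joined by a harmonic spring and uses the velocity Verlet integration scheme. The names `pair_forces`, `md_step`, and `run_md`, the spring constant `k`, and the rest length `r0` are hypothetical and chosen only for illustration. Actual MD software uses far more detailed force fields.

```python
def pair_forces(positions, k=1.0, r0=1.0):
    """Forces on two atoms joined by a harmonic spring (toy force field, an assumption)."""
    x0, x1 = positions
    stretch = (x1 - x0) - r0
    f = k * stretch  # a stretched bond pulls atom 0 toward +x and atom 1 toward -x
    return [f, -f]


def md_step(positions, velocities, masses, dt, force_fn):
    """Advance the system by one time step using velocity Verlet."""
    forces = force_fn(positions)
    # Half-step velocity update from the current forces (Newton: a = F / m).
    half_v = [v + 0.5 * dt * f / m for v, f, m in zip(velocities, forces, masses)]
    # Full-step position update.
    new_pos = [x + dt * hv for x, hv in zip(positions, half_v)]
    # Recompute forces at the new positions and finish the velocity update.
    new_forces = force_fn(new_pos)
    new_vel = [hv + 0.5 * dt * f / m for hv, f, m in zip(half_v, new_forces, masses)]
    return new_pos, new_vel


def run_md(positions, velocities, masses, dt, n_steps, force_fn=pair_forces):
    """Repeat the step calculation and record the arrangement after every step."""
    trajectory = [list(positions)]
    for _ in range(n_steps):
        positions, velocities = md_step(positions, velocities, masses, dt, force_fn)
        trajectory.append(list(positions))
    return trajectory


if __name__ == "__main__":
    traj = run_md([0.0, 1.5], [0.0, 0.0], [1.0, 1.0], dt=0.01, n_steps=100)
    print(len(traj), traj[-1])
```

The list returned by `run_md` is a time series of atomic coordinates, that is, a small-scale counterpart of the arrangements calculated at each elapsed time.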
A variety of methods have been proposed to reproduce structural changes in a protein by way of an MD simulation. One example of such a method is called “OFLOOD”, in which structural changes in a protein are sampled based on the detection of outliers.
With OFLOOD, clustering is performed on time series data of atomic coordinates (or “trajectories”) generated by MD simulation. Here, a “trajectory” is a group of atomic coordinates in a protein that can change over time. When OFLOOD is used, out of the protein structures included in a trajectory, a protein structure not included in any of the clusters is acquired as an “outlier”. OFLOOD then executes a short-time MD simulation once again, starting from the outlier. By doing so, it is possible to analyze functional changes in biomolecules with sufficient consideration given to protein structures that rarely occur.
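The outlier-acquisition step described above can be sketched as follows. This is not the FlexDice algorithm used by OFLOOD. It substitutes a much simpler grid-density rule as a stand-in, in which a structure counts as an outlier when the grid cell containing it holds fewer than `min_count` structures. The function name `find_outliers` and the parameters `cell_size` and `min_count` are hypothetical.

```python
from collections import Counter


def find_outliers(frames, cell_size=1.0, min_count=3):
    """Return indices of frames lying in sparsely populated grid cells.

    Each frame is a tuple of values along the analysis dimensions.
    Grid-density stand-in for clustering (an assumption, not FlexDice).
    """
    # Map every frame to the integer grid cell that contains it.
    cells = [tuple(int(c // cell_size) for c in frame) for frame in frames]
    counts = Counter(cells)
    # Frames in cells below the density threshold belong to no cluster.
    return [i for i, cell in enumerate(cells) if counts[cell] < min_count]


if __name__ == "__main__":
    frames = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (0.4, 0.3), (5.2, 5.1)]
    print(find_outliers(frames))  # prints [4]
```

In a sampling scheme of the kind described above, the structures at the returned indices would serve as the starting arrangements for the next round of short-time MD simulations.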
Note that a clustering method called “FlexDice” is used for the clustering performed during OFLOOD. FlexDice is a clustering method that gathers data elements in dense regions that are separated by sparse regions in a multi-dimensional data space in real time.
There are a variety of other technologies for analyzing protein structures. As one example, an analysis method has been proposed that decomposes protein fluctuations into uncorrelated modes and extracts the relevant modes with slow relaxation that lead to large-scale structural changes in a protein. A method that detects collective motions in proteins by analyzing independent subspaces has also been proposed. A method for efficiently generating and screening protein libraries for optimized proteins with desirable biological functions has also been proposed. A technique that makes it possible to efficiently find functional peptides is also conceivable. In addition, a method that saves, searches, compares, and analyzes complex carbohydrates by expressing the complex carbohydrates using simple linear codes is conceivable.
See, for example, the following documents:
Japanese Laid-Open Patent Publication No. 2010-88451;
Japanese Laid-Open Patent Publication No. 2010-222300;
Japanese National Publication of International Patent Application No. 2004-505334;
Ryuhei Harada, Tomotake Nakamura, Yu Takano, and Yasuteru Shigeta, “Protein Folding Pathways Extracted by OFLOOD: Outlier FLOODing Method”, Journal of Computational Chemistry, Jan. 15, 2015, Volume 36, Issue 2, pages 97-102;
Tomotake Nakamura, Yoko Kamidoi, Shinichi Wakabayashi, and Noriyoshi Yoshida, “FlexDice: A Fast Clustering Method for Large High Dimensional Data Sets”, Journal of Information Processing Society of Japan, Database, Vol. 46, No. SIG 18, pp. 40-49, December, 2005;
Yusuke Naritomi and Sotaro Fuchigami, “Slow dynamics in protein fluctuations revealed by time-structure based independent component analysis: The case of domain motions”, The Journal of Chemical Physics 134, 065101, 2011 Feb. 14; and
Shun Sakuraba, Yasumasa Joti, and Akio Kitao, “Detecting coupled collective motions in protein by independent subspace analysis”, The Journal of Chemical Physics 133, 185102, 2010 Nov. 14.
When carrying out structural analysis of a protein using a technique such as OFLOOD, unless appropriate reaction coordinates (dimensions) are selected as the analysis coordinates, it will not be possible to extract structural changes in the protein under study. As examples, a dimension used in analysis may be the coordinate of a specific atom on a specific axis or may be the distance between two specified atoms. In the past, when performing structural analysis of proteins, relevant dimensions that are known to a certain extent from experience have been used.
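The two kinds of dimension mentioned above can be illustrated with a short sketch. It assumes that each frame of a trajectory is a list of (x, y, z) tuples, one per atom. The helper names `atom_coordinate`, `atom_distance`, and `project` are hypothetical and serve only to show how a trajectory is reduced to the selected analysis coordinates.

```python
import math


def atom_coordinate(frame, atom, axis):
    """Coordinate of one atom on one axis (axis 0 = x, 1 = y, 2 = z)."""
    return frame[atom][axis]


def atom_distance(frame, a, b):
    """Euclidean distance between two specified atoms."""
    return math.dist(frame[a], frame[b])


def project(trajectory, dimension_fns):
    """Reduce each frame to a tuple of values, one per selected dimension."""
    return [tuple(fn(frame) for fn in dimension_fns) for frame in trajectory]


if __name__ == "__main__":
    trajectory = [
        [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
        [(0.0, 0.0, 0.0), (6.0, 8.0, 0.0)],
    ]
    dims = [
        lambda f: atom_coordinate(f, 1, 0),  # x coordinate of atom 1
        lambda f: atom_distance(f, 0, 1),    # distance between atoms 0 and 1
    ]
    print(project(trajectory, dims))  # prints [(3.0, 5.0), (6.0, 10.0)]
```

Any clustering or outlier detection applied afterwards sees only the projected values, which is why a structural change that does not appear along the selected dimensions cannot be extracted.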
However, there is also the possibility that other relevant dimensions will exist in addition to the relevant dimensions that are already known. At present, there is no effective method of finding such unknown relevant dimensions. To effectively analyze structural changes of proteins, it is important to perform structural analysis of proteins having reliably selected the relevant dimensions.
The above problem relating to the difficulty in selecting the relevant dimensions to be used in structural analysis of a substance is not limited to proteins and also applies to structural analysis of any substance with a changing structure (for example, biomolecules other than proteins).