The present invention relates to methods for determining and refining three dimensional structures of molecules, and more particularly to a method for determining and refining three dimensional structures of molecules by non-linear recursive filtering of chemical information and structural observations.
Accurate knowledge of the three dimensional structure of macro-molecules is essential for modeling their biological functions and, thus, for structure-based drug design. Nuclear Magnetic Resonance (hereinafter referred to as NMR) technology makes it possible to measure quantitative characteristics of macro-molecules in solution. At the same time, this poses new challenges for the problem of converting the NMR data into a three dimensional structure. It is well understood that NMR data, for example Nuclear Overhauser Effect (hereinafter referred to as NOE) intensities, have a complicated nonlinear structure arising from multiple-spin effects, also known as the spin diffusion problem. Simplifying that information by attempting to un-couple spins leads to loss of accuracy. Direct refinement of the three dimensional structure against the NOE intensities, however, poses significant computational difficulties for large molecules, even within the current limit on the size of molecules accessible by NMR experiments. Also, the complexity of the measurement model is aggravated by various deficiencies in the data, such as experimental noise, uncertainties in the assignments of proton resonances, uncertainties in relaxation parameters, and contributions from dynamical effects.
Traditionally, the problem of obtaining three dimensional structures from pre-processed experimental NMR data is divided into the distinct steps of determination and refinement. During determination, an initial family of structures that approximately satisfy covalent and experimental restraints is generated. Each member of the structural family is then refined against the experimental data to increase the final accuracy. This conceptual division is convenient because the global functional that represents the optimization target for the entire structure has multiple minima. Historically, the algorithmic methods used during the determination step were distinct from the methods used during the refinement step. However, the dividing line can be blurred if algorithms are able to systematically handle the multiple-minima problem.
In NMR applications, the method of structure optimization is substantially defined by how the NMR data is treated. A significant effort in determination and refinement techniques for NMR structures is aimed at utilizing more complete information from experimental NOE intensities. The resulting methods can be split into indirect and direct approaches. The indirect approach offers a better interpretation of NOE intensities in terms of distances than the simplest classification, in which distances are attributed to strong, medium and weak NOE peaks. The indirect approach is usually used at both the determination and refinement stages. Various methods offer ways to smooth distances by triangle bound inequalities or by more accurate, but computationally intensive, tetrangle inequalities. In the indirect approach, relaxation matrix calculations are also used to account for all magnetization transfers, also known as spin diffusion, resulting in more accurate distance representations. The target distances can be iteratively modified, manually or automatically, to match observed and calculated two dimensional NOE intensities. Another approach is based on the back-transformation of the matrix of intensities to obtain the relaxation matrix and, hence, the cross-relaxation rates. In practice, the experimental NOE matrix is incompletely known due to overlap and missing assignments. This problem can be circumvented by iterative construction of a "hybrid" NOE matrix composed of experimental and calculated intensities. A problem with indirect methods is the approximate nature of the distance "measurements", which do not account properly for spin diffusion effects.
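The effect of spin diffusion on distances derived from NOE intensities can be illustrated with a minimal sketch. It is not taken from any prior art program. It assumes a simplified relaxation model for a rigid, slowly tumbling molecule, in which the off-diagonal relaxation rates are -k/r^6, the diagonal rates are the sum of the pairwise rates with no external leakage, and the intensity matrix at mixing time tau is exp(-R tau). The constant K, the mixing time TAU, the three-proton geometry and all function names are hypothetical values chosen for illustration.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=40):
    """Taylor-series matrix exponential; adequate only for small, well-scaled matrices."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds m^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def relaxation_matrix(dist, k):
    """Assumed model: R_ij = -k / r_ij^6 off the diagonal, R_ii = sum of pairwise rates."""
    n = len(dist)
    rates = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sigma = k / dist[i][j] ** 6
                rates[i][j] = -sigma
                rates[i][i] += sigma
    return rates


def noe_intensities(dist, k, tau):
    """Full relaxation-matrix intensities A = exp(-R * tau)."""
    rates = relaxation_matrix(dist, k)
    return mat_exp([[-r * tau for r in row] for row in rates])


def isolated_pair_distance(intensity, k, tau):
    """Two-spin approximation A_ij ~ k * tau / r^6, inverted for r."""
    return (k * tau / intensity) ** (1.0 / 6.0)


# Three protons on a line: A-B and B-C are 2.5 Angstrom apart, A-C is 5.0 Angstrom.
DIST = [[0.0, 2.5, 5.0],
        [2.5, 0.0, 2.5],
        [5.0, 2.5, 0.0]]
K = 2.5 ** 6   # assumed constant, chosen so the 2.5 Angstrom rate is 1 per second
TAU = 0.3      # assumed mixing time in seconds

intensities = noe_intensities(DIST, K, TAU)
apparent_ac = isolated_pair_distance(intensities[0][2], K, TAU)
print("A-C intensity:", intensities[0][2])
print("apparent A-C distance:", apparent_ac, "true distance:", DIST[0][2])
```

Under these assumptions, magnetization relayed through the central proton B inflates the A-C cross peak, so the two-spin inversion reports an A-C distance noticeably shorter than the true 5.0 Angstrom. This is the kind of error that relaxation matrix calculations are meant to reduce.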
The direct approach incorporates NOE intensities directly as experimental constraints in powerful restrained dynamics simulations. This approach does not require the complete NOE matrix and provides a significant improvement in accuracy because it accounts directly for spin diffusion effects. The main problem with the direct approach is that computing the gradient of the experimental potential, i.e., the gradient of each NOE intensity, is very computationally intensive and must be done for each member of the structural family generated during the determination step and for each time step in the restrained dynamics simulation.
Modern estimation techniques such as Kalman filters can be considered as alternative and supplementary to traditional global optimization techniques used for structural determination and refinement. The Double-Iterated Kalman Filter (hereinafter referred to as DIKF) was implemented to generate a protein structure from distances derived from NOE data, with geometrical restraints being interpreted as additional data. This approach enables both structure determination and a definitive estimate of its uncertainty and thus provides significant additional knowledge and insight on those regions of the protein which correspond to non-unique conformations.
Kalman filter techniques provide important new features for interpreting the quality of a determined molecular structure. However, since the Kalman filter is an optimal, recursive filter based on the linearization of measurement non-linearities, a significant amount of information is lost when the filter is applied to data with a complicated, nonlinear structure, such as NOE intensities. In addition, large matrix data structures are required, and their straightforward manipulation severely limits the computational efficiency of these techniques when applied to large macro-molecules. Consequently, Kalman filters, including the DIKF, do not significantly improve the efficiency of the determination and refinement algorithms of prior art systems in terms of solving the multiple-minima problem.
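The linearization referred to above can be illustrated by a single measurement update of an extended Kalman filter for one distance observation between two atoms. This is a minimal sketch only and is not the DIKF itself, which iterates such updates. The function name, the six-component state layout (coordinates of the first atom followed by those of the second), and the numerical values are assumptions made for illustration.

```python
import math


def distance_update(state, cov, observed, variance):
    """One linearized Kalman update for a distance between two atoms.

    state    : [x1, y1, z1, x2, y2, z2], current coordinate estimates
    cov      : 6x6 symmetric covariance of the state (list of rows)
    observed : measured distance between the two atoms
    variance : variance of the distance measurement
    """
    n = len(state)
    diff = [state[i] - state[i + 3] for i in range(3)]
    predicted = math.sqrt(sum(c * c for c in diff))
    unit = [c / predicted for c in diff]

    # Jacobian of the distance with respect to the state (the linearization step).
    h = unit + [-c for c in unit]

    # P * H^T, innovation variance S, and gain K = P * H^T / S.
    ph = [sum(cov[i][j] * h[j] for j in range(n)) for i in range(n)]
    s = sum(h[i] * ph[i] for i in range(n)) + variance
    gain = [v / s for v in ph]

    innovation = observed - predicted
    new_state = [state[i] + gain[i] * innovation for i in range(n)]
    # P - K * H * P; uses the symmetry of cov, so (H * P)_j equals ph[j].
    new_cov = [[cov[i][j] - gain[i] * ph[j] for j in range(n)] for i in range(n)]
    return new_state, new_cov


def atom_distance(state):
    """Distance between the two atoms held in a six-component state."""
    return math.sqrt(sum((state[i] - state[i + 3]) ** 2 for i in range(3)))


# Two atoms estimated 4.0 apart, with unit coordinate variances (assumed values).
state0 = [0.0, 0.0, 0.0, 4.0, 0.0, 0.0]
cov0 = [[float(i == j) for j in range(6)] for i in range(6)]

# A distance observation of 3.0 with variance 0.01 (assumed values).
state1, cov1 = distance_update(state0, cov0, observed=3.0, variance=0.01)
print("updated distance:", atom_distance(state1))
print("updated x-variance of atom 1:", cov1[0][0])
```

The update moves both atoms toward the observed separation and reduces the coordinate variance along the interatomic direction, which is how such a filter yields an uncertainty estimate together with the structure. The covariance matrix grows with the square of the number of coordinates, which is consistent with the storage and manipulation cost noted above for large macro-molecules.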
There is a need for an improved method of determining and refining a three dimensional molecular structure, which utilizes non-linear processing techniques to solve the multiple-minima problem and which is computationally efficient.
It is therefore an object of the invention to provide an improved method of determining and refining a three dimensional molecular structure, which utilizes non-linear processing techniques to solve the multiple-minima problem in a computationally efficient manner.
Other objects and advantages of the present invention will become apparent upon consideration of the appended drawings and description thereof.