This invention relates to methods for determining the multi-dimensional topology of a system, and in particular, to methods for identifying critical points in the internal multi-dimensional topology of a system within a space.
Complex data sets such as scalar fields or relative density maps exist in virtually every area of science. Fluid dynamics, stress analysis, quantum physics, physical chemistry, molecular biology, and geology are but a few examples. These data contain information relating to the multi-dimensional topology of a structure, such as a molecule, the spatial arrangement of ore veins within a geological sample, etc., as the case may be. However, the accuracy of the information extracted is limited by the strength of the analysis applied to the data.
As an example, consider the 3-dimensional topology of the molecular structure of proteins. Although elucidation of the molecular structure of proteins is a fundamental goal of research in molecular biology, only a small fraction of the currently known proteins have been fully characterized. Crystallography plays a major role in current efforts to characterize and understand molecular structures and molecular recognition processes. Typically, a pure crystal of a protein residing in a volume under consideration is irradiated with X-rays to produce a diffraction pattern on a photographic film located on the opposite side of the crystal. Following a series of chemical and mathematical manipulations, a series of electron density maps is created, and these may be blended or combined to form an electron density map of a single protein molecule. The information derived from crystallographic studies provides a molecular scene, the starting point for analyses.
The determination of molecular structures from X-ray diffraction data is an exercise in image reconstruction from incomplete and/or noisy data. Molecular scene analysis is therefore concerned with the processes of reconstruction, classification and understanding of complex images. Such analyses rely on the ability to segment a representation of a molecule into its meaningful parts, and on the availability of a priori information, in the form of rules or structural templates, for interpreting the partitioned image.
A crystal consists of a regular (periodic) 3D arrangement of identical building blocks, termed the unit cell. A crystal structure is defined by the disposition of atoms and molecules within this fundamental repeating unit. A given structure can be solved by interpreting an electron density image of its unit cell content, generated, using a Fourier transform, from the amplitudes and phases of experimentally derived diffraction data.
An electron density map is a 3D array of real values that estimate the electron density at given locations in the unit cell; this information gives access to the structure of a protein. Strictly speaking, the diffraction experiment provides information on the ensemble average over all of the unit cells. Unfortunately, only the diffraction amplitudes can be measured directly from a crystallographic experiment; the necessary phase information for constructing the electron density image must be obtained by other means. This is the classic phase problem of crystallography. Current solutions to the phase problem for macromolecules rely on gathering extensive experimental data and on considerable input from experts during the image interpretation process.
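By way of illustration only, the Fourier synthesis and the phase problem described above can be sketched in Python using NumPy. The grid size, the two Gaussian "atoms", and the function names below are assumptions chosen for the example and are not taken from any particular crystallographic package or from the methods of the invention.

```python
import numpy as np

def density_from_structure_factors(amplitudes, phases):
    """Combine amplitudes and phases into complex structure factors and
    inverse-transform them into a real-space density map."""
    structure_factors = amplitudes * np.exp(1j * phases)
    return np.fft.ifftn(structure_factors).real

# Toy "unit cell": an 8 x 8 x 8 grid holding two Gaussian blobs (hypothetical atoms).
n = 8
axes = [np.arange(n)] * 3
grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)

def gaussian_blob(center, width=1.0):
    squared_distance = ((grid - np.array(center)) ** 2).sum(axis=-1)
    return np.exp(-squared_distance / (2.0 * width ** 2))

true_density = gaussian_blob((2, 2, 2)) + 0.5 * gaussian_blob((5, 4, 6))

# Forward transform. A real experiment would yield only the amplitudes.
structure_factors = np.fft.fftn(true_density)
amplitudes = np.abs(structure_factors)
phases = np.angle(structure_factors)

# Synthesis with the correct phases, and with the phases discarded (set to zero).
with_phases = density_from_structure_factors(amplitudes, phases)
without_phases = density_from_structure_factors(amplitudes, np.zeros_like(phases))

print("recovered with phases:   ", np.allclose(with_phases, true_density))
print("recovered without phases:", np.allclose(without_phases, true_density))
```

In this toy case the density is recovered only when the phases are supplied; the amplitudes alone do not reproduce the map.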
In contrast to small molecules (up to 150 or so independent, non-hydrogen atoms), the determination of protein structures (which often contain in excess of 3000 atoms) remains a complex task hindered by the phase problem. The initial electron density images obtained from crystallographic data for these macromolecules are typically incomplete and noisy. The interpretation of a protein image generally involves mental pattern recognition procedures where the image is segmented into features, which are then compared with anticipated structural motifs. Once a feature is identified, this partial structure information can be used to improve the phase estimates resulting in a refined (and eventually higher-resolution) image of the molecule. Despite recent advances in tools for molecular graphics and modeling, this iterative approach to image reconstruction is still a time-consuming process requiring substantial expert intervention. In particular, it depends on an individual's recall of existing structural patterns and on the individual's ability to recognize the presence of these motifs in a noisy and complex 3D image representation.
It is an object of the present invention to provide methods for determining the multi-dimensional topology of a system, such as a system within a space.
According to a broad aspect of the invention there is provided a method of determining the multi-dimensional topology of a substance within a volume, the method comprising the steps of:
a) acquiring a set of relative density values for the volume, each value for a given location within the volume;
b) interpolating a set of functions to generate a continuous relative density for the volume;
c) identifying critical points of the continuous relative density by using an eigenvector following method; and
d) associating critical points with one another by following a gradient path of the continuous relative density between the critical points.
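By way of illustration only, step (c) above might be sketched as follows in Python using NumPy. This is a simplified eigenvector-following iteration under stated assumptions, not necessarily the implementation contemplated by the invention: the continuous density is a hypothetical analytic stand-in (two unit-width Gaussians) rather than an interpolated map, and the function names, step cap, and tolerances are chosen for the example. At each iteration the gradient is projected onto the Hessian eigenvectors; the search walks uphill along some modes and downhill along the others, which near a critical point of the requested type reduces to a Newton step.

```python
import numpy as np

# Hypothetical stand-in for the continuous density of step (b):
# two unit-width Gaussians centred at (-1.5, 0, 0) and (1.5, 0, 0).
CENTERS = np.array([[-1.5, 0.0, 0.0], [1.5, 0.0, 0.0]])

def gradient(x):
    d = x - CENTERS                              # (2, 3) offsets from each centre
    w = np.exp(-0.5 * (d ** 2).sum(axis=1))      # (2,) Gaussian values
    return -(w[:, None] * d).sum(axis=0)

def hessian(x):
    d = x - CENTERS
    w = np.exp(-0.5 * (d ** 2).sum(axis=1))
    outer = d[:, :, None] * d[:, None, :]        # (2, 3, 3) outer products
    return (w[:, None, None] * (outer - np.eye(3))).sum(axis=0)

def follow_eigenvectors(start, n_descend, max_step=0.3, tol=1e-10, max_iter=200):
    """Search for a critical point with n_descend positive curvatures.

    n_descend = 0 targets a maximum; n_descend = 1 targets a saddle with
    one positive and two negative curvatures."""
    x = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        g = gradient(x)
        if np.linalg.norm(g) < tol:
            break
        eigenvalues, eigenvectors = np.linalg.eigh(hessian(x))  # ascending order
        g_modes = eigenvectors.T @ g                 # gradient in the eigenbasis
        curvature = np.maximum(np.abs(eigenvalues), 1e-8)
        direction = np.ones(3)                       # +1: uphill along the mode
        if n_descend > 0:
            direction[-n_descend:] = -1.0            # -1: downhill along the mode
        step = eigenvectors @ (direction * g_modes / curvature)
        length = np.linalg.norm(step)
        if length > max_step:                        # cap the step length
            step *= max_step / length
        x = x + step
    return x

peak = follow_eigenvectors([1.0, 0.2, 0.1], n_descend=0)
saddle = follow_eigenvectors([0.4, 0.3, -0.2], n_descend=1)
print("maximum near:", peak)
print("saddle near: ", saddle)
```

For this toy density the search with n_descend=0 settles on the peak near one Gaussian centre, and the search with n_descend=1 settles on the pass between the two peaks.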
According to another aspect of the invention, there is provided a method of determining the multi-dimensional topology of a volume from a set of relative density values for the volume, each value for a given location within the volume, the method comprising the steps of:
a) interpolating a set of functions to generate a continuous relative density for the volume;
b) identifying critical points of the continuous relative density by using an eigenvector following method; and
c) associating critical points with one another by following a gradient path of the continuous relative density between the critical points.
The invention further provides a method of determining the multi-dimensional topology of a volume, having a continuous relative density for the volume generated from a set of functions interpolating a set of acquired relative density values for the volume, each value for a given location within the volume, the method comprising the steps of:
a) identifying critical points of the continuous relative density by using an eigenvector following method; and
b) associating critical points with one another by following a gradient path of the continuous relative density between the critical points.
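By way of illustration only, the association of critical points by following a gradient path might be sketched as follows in Python using NumPy. The sketch assumes a hypothetical analytic density (two unit-width Gaussians) whose saddle point lies at the origin by symmetry, and it traces the path with a plain fixed-step Euler scheme; the function names and step size are assumptions, and a production implementation would likely use a more careful integrator.

```python
import numpy as np

# Hypothetical continuous density: two unit-width Gaussians on the x axis.
CENTERS = np.array([[-1.5, 0.0, 0.0], [1.5, 0.0, 0.0]])

def gradient(x):
    d = x - CENTERS
    w = np.exp(-0.5 * (d ** 2).sum(axis=1))
    return -(w[:, None] * d).sum(axis=0)

def hessian(x):
    d = x - CENTERS
    w = np.exp(-0.5 * (d ** 2).sum(axis=1))
    outer = d[:, :, None] * d[:, None, :]
    return (w[:, None, None] * (outer - np.eye(3))).sum(axis=0)

def follow_gradient_path(start, step=0.2, tol=1e-8, max_iter=2000):
    """Trace a gradient path uphill with fixed-step Euler integration."""
    path = [np.asarray(start, dtype=float)]
    for _ in range(max_iter):
        g = gradient(path[-1])
        if np.linalg.norm(g) < tol:      # arrived at a critical point
            break
        path.append(path[-1] + step * g)
    return np.array(path)

# The saddle of this toy density is at the origin (assumed known here).
saddle = np.zeros(3)
eigenvalues, eigenvectors = np.linalg.eigh(hessian(saddle))
uphill_axis = eigenvectors[:, -1]        # mode with the single positive curvature

# Leave the saddle in both directions along that mode and climb to a maximum.
paths = [follow_gradient_path(saddle + sign * 0.01 * uphill_axis) for sign in (1, -1)]
endpoints = [p[-1] for p in paths]
print("saddle", saddle, "is associated with maxima near", endpoints[0], "and", endpoints[1])
```

The two path endpoints identify the pair of maxima linked through the saddle, which is the association a representation of the topology could be built from.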
The invention further provides a method of identifying critical points in the multi-dimensional topology of a substance within a volume, the method comprising the steps of:
a) acquiring a set of relative density values for the volume, each value for a given location within the volume;
b) interpolating a set of functions to generate a continuous relative density for the volume; and
c) identifying critical points of the continuous relative density by using an eigenvector following method.
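By way of illustration only, one way to obtain a locally continuous density from gridded values, together with the gradient and Hessian needed for a critical point search, is sketched below in Python using NumPy. The text above does not specify which interpolating functions are used; the local quadratic least-squares fit over a 3 x 3 x 3 neighbourhood shown here is an assumption made for the example (strictly a fit rather than an exact interpolant), and the function name is hypothetical.

```python
import numpy as np

def fit_local_quadratic(values, index, spacing=1.0):
    """Fit a quadratic to the 3 x 3 x 3 samples around an interior grid point.

    Returns the fitted value, gradient and Hessian at that grid point."""
    i, j, k = index
    offsets, samples = [], []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            for dk in (-1, 0, 1):
                offsets.append((di * spacing, dj * spacing, dk * spacing))
                samples.append(values[i + di, j + dj, k + dk])
    x, y, z = np.array(offsets).T
    # Taylor basis: 1, x, y, z, x^2/2, y^2/2, z^2/2, xy, xz, yz
    design = np.column_stack([np.ones_like(x), x, y, z,
                              0.5 * x * x, 0.5 * y * y, 0.5 * z * z,
                              x * y, x * z, y * z])
    c, *_ = np.linalg.lstsq(design, np.array(samples), rcond=None)
    grad = c[1:4]
    hess = np.array([[c[4], c[7], c[8]],
                     [c[7], c[5], c[9]],
                     [c[8], c[9], c[6]]])
    return c[0], grad, hess

# Toy data: a known quadratic sampled on a 5 x 5 x 5 grid with unit spacing.
gx, gy, gz = np.meshgrid(np.arange(5.0), np.arange(5.0), np.arange(5.0), indexing="ij")
sampled = (1 + 2 * gx - gy + 0.5 * gz + gx ** 2 - 0.5 * gy ** 2
           + 0.25 * gz ** 2 + 0.3 * gx * gy - 0.7 * gy * gz)

value, grad, hess = fit_local_quadratic(sampled, (2, 2, 2))
print("gradient:", grad)
print("Hessian:\n", hess)
```

A point where the fitted gradient vanishes is a candidate critical point, and the fitted Hessian supplies the eigenvectors that an eigenvector following search would step along.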
According to a further aspect of the invention, a method is provided for determining the multi-dimensional topology of a system within a space, the method comprising the steps of:
a) acquiring a set of relative values for scalar properties of the space, each value for a given point within the space;
b) interpolating a set of functions to generate continuous relative values for the scalar properties;
c) identifying critical points of the continuous relative values by using an eigenvector following method; and
d) associating critical points with one another by following a gradient path of the continuous relative values between the critical points.
The invention also provides a method of determining the multi-dimensional topology of a system within a space from a set of relative values for scalar properties of the space, each value for a given point within the space, the method comprising the steps of:
a) interpolating a set of functions to generate continuous relative values for the scalar properties;
b) identifying critical points of the continuous relative values by using an eigenvector following method; and
c) associating critical points with one another by following a gradient path of the continuous relative values between the critical points.
The invention further provides a method of determining the multi-dimensional topology of a system within a space, having continuous relative values of scalar properties for the space generated from a set of functions interpolating a set of acquired relative values of the scalar properties, each value for a given location within the space, the method comprising the steps of:
a) identifying critical points of the continuous relative values by using an eigenvector following method; and
b) associating critical points with one another by following a gradient path of the continuous relative values between the critical points.
The invention further provides a method of identifying critical points in the multi-dimensional topology of a system within a space, the method comprising the steps of:
a) acquiring a set of relative values for properties of the space, each value for a given point within the space;
b) interpolating a set of functions to generate continuous relative values for the space; and
c) identifying critical points of the continuous relative values by using an eigenvector following method.
The methods of the invention also provide the ability to generate a representation of the topology according to the associated critical points.
According to the invention, in any of the above methods:
a) the substance may be a protein or a protein complex; and
b) the relative density values may be electron density values generated by X-ray diffraction through a crystal of the protein or protein complex.
According to the invention, the relative density can be the electron charge density resulting from the application of X-ray crystallography to proteins. The methods of the invention can also be applied to other systems that can be represented in terms of relative values of properties. For example, relative values of properties may be relative density or concentration of ore derived from core samples of mineral deposits in geological analysis. The samples provide the values of relative density or concentration at given points (in this case, locations) within the space (in this case, a volume) to be analyzed. The result of the application of the method of the invention in this case may be to find regions of low concentration (minima), high concentration (maxima), and other critical points of an ore body within the space, in order to define the ore body and/or its path.
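By way of illustration only, the distinction between minima, maxima, and other critical points can be sketched in Python using NumPy by examining the signs of the Hessian eigenvalues at a point where the gradient vanishes. The function name, the returned labels, and the tolerance are assumptions made for the example.

```python
import numpy as np

def classify_critical_point(hessian, tol=1e-8):
    """Label a critical point from the signs of its Hessian eigenvalues."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(hessian, dtype=float))
    if np.any(np.abs(eigenvalues) < tol):
        return "degenerate"              # a zero curvature: the type is undetermined
    n_negative = int((eigenvalues < 0).sum())
    n_total = len(eigenvalues)
    if n_negative == n_total:
        return "maximum"                 # e.g. a region of high concentration
    if n_negative == 0:
        return "minimum"                 # e.g. a region of low concentration
    return f"saddle ({n_negative} negative, {n_total - n_negative} positive curvatures)"

print(classify_critical_point(np.diag([-1.0, -2.0, -0.5])))
print(classify_critical_point(np.diag([1.0, 2.0, 0.5])))
print(classify_critical_point(np.diag([0.8, -0.6, -0.6])))
```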
As other examples, the systems may be the flow of a fluid over a surface, the reaction and/or folding of a protein or protein complex, an edge portion of a graphical image, one or more extremal points within a scalar field, or a trend or energy field within a set of data.
In further aspects, the invention provides separate means for carrying out each of the method aspects.