1. Field of the Invention
This invention relates to methods and systems for determining molecular structures using x-ray crystallography.
2. Description of Related Art
In x-ray diffraction crystallography, a crystalline form of the molecule under study is exposed to a beam of x-rays, and the intensity of diffracted radiation at a variety of angles from the angle of incidence is measured. The beam of x-rays is diffracted into a plurality of diffraction "reflections," each reflection representing a reciprocal lattice vector. From the diffraction intensities of the reflections, the magnitudes of a series of numbers, known as "structure factors," are determined. The structure factors in general are complex numbers, having a magnitude and a phase in the complex plane, and are defined by the electron distribution within the unit cell of the crystal.
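By way of illustration only, the dependence of a structure factor on the contents of the unit cell can be sketched in Python. The sketch assumes the conventional point-atom summation F(hkl) = sum over atoms j of f_j exp[2 pi i (h x_j + k y_j + l z_j)], with constant scattering factors and fractional coordinates. The function name `structure_factor` and the two-atom cell are hypothetical and are not taken from the text above.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """Sum point-atom contributions for one reciprocal lattice vector.

    hkl   : (h, k, l) Miller indices of the reflection
    atoms : list of (scattering_factor, (x, y, z)) with fractional coordinates
    """
    total = 0j
    for f_j, xyz in atoms:
        # Dot product h*x + k*y + l*z for this atom
        dot = sum(index * coord for index, coord in zip(hkl, xyz))
        total += f_j * cmath.exp(2j * math.pi * dot)
    return total

# Hypothetical two-atom cell (scattering factors treated as constants)
atoms = [(6.0, (0.0, 0.0, 0.0)), (8.0, (0.25, 0.25, 0.25))]

F = structure_factor((1, 0, 0), atoms)
amplitude = abs(F)                         # approximately 10.0
phase_deg = math.degrees(cmath.phase(F))   # approximately 53.13
print(round(amplitude, 3), round(phase_deg, 2))
```

Only `amplitude` is recoverable from a measured intensity; `phase_deg` is the quantity that must be determined by other means.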
The magnitudes of the complex numbers are relatively easy to experimentally determine from measured diffraction intensities of the various reflections. However, a map of electron density and/or atomic position within the unit cell of the crystal cannot be generated without determining the phases of the structure factors as well. Thus, the central problem in x-ray diffraction crystallography is the determination of phases for structure factors whose amplitudes are already known.
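The need for phases can be illustrated with a one-dimensional Fourier synthesis. The following Python sketch assumes a hypothetical unit cell containing a single point atom at fractional coordinate 0.3 and a density of the form rho(x) = sum over h of F(h) exp(-2 pi i h x). The function name `fourier_synthesis_1d` and all numerical values are illustrative assumptions.

```python
import cmath
import math

def fourier_synthesis_1d(structure_factors, n_points=100):
    """Return unnormalized density samples on [0, 1) from F(h), h >= 0."""
    density = []
    for i in range(n_points):
        x = i / n_points
        value = 0.0
        for h, f_h in structure_factors.items():
            term = f_h * cmath.exp(-2j * math.pi * h * x)
            # For h > 0 the Friedel mate F(-h) = conj(F(h)) adds the conjugate term
            value += term.real if h == 0 else 2.0 * term.real
        density.append(value)
    return density

atom_x = 0.3   # hypothetical atom position (fractional)
h_max = 5      # hypothetical resolution limit

# Structure factors with their true phases
true_f = {h: cmath.exp(2j * math.pi * h * atom_x) for h in range(h_max + 1)}
# Same amplitudes, but every phase set to zero
amplitudes_only = {h: complex(abs(f), 0.0) for h, f in true_f.items()}

rho_true = fourier_synthesis_1d(true_f)
rho_zero_phase = fourier_synthesis_1d(amplitudes_only)

peak_true = max(range(len(rho_true)), key=rho_true.__getitem__)
peak_zero = max(range(len(rho_zero_phase)), key=rho_zero_phase.__getitem__)
print(peak_true / 100, peak_zero / 100)
```

With the true phases the density peaks at the atom position, whereas the same amplitudes with zeroed phases place the peak at the origin, which is why amplitudes alone do not yield a usable map.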
In attempts to determine the structure of large biomolecules such as proteins, one of the most frequently used approaches to solve this problem is based on isomorphous replacement. In single isomorphous replacement (SIR) analysis, one or more heavy atoms are attached to the protein, creating a heavy atom derivative or isomorph of the protein. An analysis of the difference between the x-ray diffraction intensities from the native protein and from its heavy atom derivative can limit the phase of at least some structure factors to two plausible possibilities. For each structure factor, this SIR analysis results in a phase probability distribution curve which is typically substantially bimodal, with peaks positioned at the two most probable phases for that structure factor.
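The bimodal character of an SIR phase probability distribution can be sketched as follows. The Python example assumes a Gaussian lack-of-closure model of the Blow and Crick type, in which the probability of a trial native phase falls off with the mismatch between the observed derivative amplitude and the amplitude computed from the native amplitude, the trial phase, and a known heavy atom contribution. All amplitudes, the error estimate `sigma`, and the function names are hypothetical.

```python
import cmath
import math

def sir_phase_distribution(fp, fph, fh, sigma, n=360):
    """Phase probability on an n-point grid covering 0 to 360 degrees.

    fp    : observed native amplitude
    fph   : observed derivative amplitude
    fh    : complex heavy atom structure factor (assumed known)
    sigma : assumed standard error of the lack of closure
    """
    probs = []
    for k in range(n):
        phi = 2.0 * math.pi * k / n
        # Lack of closure: observed minus calculated derivative amplitude
        closure = fph - abs(fp * cmath.exp(1j * phi) + fh)
        probs.append(math.exp(-closure ** 2 / (2.0 * sigma ** 2)))
    total = sum(probs)
    return [p / total for p in probs]

def local_maxima(probs):
    """Indices of peaks on a circular grid."""
    n = len(probs)
    return [k for k in range(n)
            if probs[k] > probs[k - 1] and probs[k] >= probs[(k + 1) % n]]

# Hypothetical reflection: heavy atom contribution of 3 at phase 0 degrees
probs = sir_phase_distribution(fp=10.0, fph=11.0, fh=3.0 + 0j, sigma=1.0)
peaks = local_maxima(probs)
print(peaks)  # two peaks, near 78 and 282 degrees
```

The two peaks lie symmetrically about the heavy atom phase, and nothing in the single-derivative data distinguishes between them.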
To remove the ambiguity of which probability peak corresponds to the correct phase for each structure factor, a plurality of heavy atom derivatives can be used to generate a set of phase probability distribution curves for each structure factor. In this multiple isomorphous replacement (MIR) analysis, the probability distribution curves for a selected structure factor are mathematically combined such that the resulting phase value is consistent across all of the heavy atom derivatives for the selected structure factor. In essence, the resulting phase value common to the set of phase probability distribution curves corresponds to the correct phase of the structure factor. An alternative analysis, multiwavelength anomalous diffraction (MAD), has mathematical formalisms which are similar to those of MIR analysis. Aspects of these two procedures are described in Section 8.4, pages 255-267, of An Introduction to X-Ray Crystallography by Michael M. Woolfson, Cambridge University Press (1970, 1997). The complete content of the Woolfson textbook is hereby incorporated by reference in its entirety.
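One common way to combine the curves from several derivatives, assumed here for illustration, is to multiply them point by point and renormalize, which treats the errors of the derivatives as independent. In the Python sketch below, the helper `bimodal` is a hypothetical stand-in for a measured SIR curve (two equal von Mises shaped peaks), and the peak positions are invented values chosen so that the two derivatives share one peak at 80 degrees.

```python
import math

def bimodal(center_a, center_b, kappa=8.0, n=360):
    """Hypothetical SIR-like curve with two equal peaks (centers in degrees)."""
    probs = []
    for k in range(n):
        phi = math.radians(k * 360.0 / n)
        probs.append(
            math.exp(kappa * math.cos(phi - math.radians(center_a)))
            + math.exp(kappa * math.cos(phi - math.radians(center_b)))
        )
    total = sum(probs)
    return [p / total for p in probs]

def combine(distributions):
    """Pointwise product of phase distributions, renormalized to unit sum."""
    product = [1.0] * len(distributions[0])
    for dist in distributions:
        product = [a * b for a, b in zip(product, dist)]
    total = sum(product)
    return [p / total for p in product]

derivative_1 = bimodal(80, 280)   # ambiguous between 80 and 280 degrees
derivative_2 = bimodal(80, 150)   # ambiguous between 80 and 150 degrees

combined = combine([derivative_1, derivative_2])
best_phase = max(range(360), key=combined.__getitem__)
print(best_phase)  # close to the shared peak at 80 degrees
```

Only the phase supported by both derivatives survives the product; the peaks at 280 and 150 degrees are suppressed because each is improbable under the other derivative.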
The heavy atom derivative method is commonly used when the structure of the protein or other molecule(s) in the unit cell is wholly unknown. However, the preparation of heavy atom derivatives is slow and tedious, and it is not always possible to create enough heavy atom isomorphs to adequately reduce the phase ambiguity.
According to one aspect of the present invention, a method reduces the structure factor phase ambiguity corresponding to a selected reciprocal lattice vector. The method comprises generating an original phase probability distribution corresponding to a selected structure factor phase of the selected reciprocal lattice vector. The original phase probability distribution comprises a first structure factor phase ambiguity. The method further comprises combining the original phase probability distribution with a plurality of phase probability distributions of a plurality of structure factor phases of other reciprocal lattice vectors using a phase equation or inequality. The phase equation or inequality defines a mathematical relationship between the selected structure factor phase of the selected reciprocal lattice vector and the plurality of structure factor phases of other reciprocal lattice vectors. The method further comprises producing a resultant phase probability distribution for the selected structure factor phase of the selected reciprocal lattice vector. The resultant phase probability distribution comprises a second structure factor phase ambiguity which is smaller than the first structure factor phase ambiguity.
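A minimal sketch of this combination is given below in Python. The text above does not name a specific phase equation, so the sketch assumes, purely as an example, a triplet relation of the kind used in direct methods, phi(h) = phi(k) + phi(h-k) modulo 360 degrees, and treats it as exact. Under that assumption the distribution of the phase sum is the circular convolution of the two contributing distributions, and the resultant distribution is the product of that convolution with the original distribution. All function names and peak positions are hypothetical.

```python
import math

N = 360  # one-degree phase bins; index k stands for a phase of k degrees

def normalize(values):
    total = sum(values)
    return [v / total for v in values]

def two_peaks(center_a, center_b, kappa=20.0):
    """Hypothetical bimodal phase distribution (centers in degrees)."""
    out = []
    for k in range(N):
        phi = math.radians(k)
        out.append(
            math.exp(kappa * math.cos(phi - math.radians(center_a)))
            + math.exp(kappa * math.cos(phi - math.radians(center_b)))
        )
    return normalize(out)

def phase_sum_distribution(p, q):
    """Distribution of (phase_p + phase_q) mod 360 for independent phases."""
    out = [0.0] * N
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[(i + j) % N] += p_i * q_j
    return out

p_h = two_peaks(60, 220)    # original distribution for the selected phase
p_k = two_peaks(20, 130)    # distribution for phi(k)
p_hk = two_peaks(40, 250)   # distribution for phi(h-k)

# Distribution of phi(h) implied by the assumed relation phi(h) = phi(k) + phi(h-k)
implied = phase_sum_distribution(p_k, p_hk)
resultant = normalize([a * b for a, b in zip(p_h, implied)])

ratio_before = p_h[220] / p_h[60]             # about 1: two equally likely peaks
ratio_after = resultant[220] / resultant[60]  # much smaller than ratio_before
print(round(ratio_before, 3), round(ratio_after, 3))
```

In this example the peak near 60 degrees is reinforced because 20 + 40 = 60, while the peak at 220 degrees matches none of the implied sums, so the second ambiguity is smaller than the first. A practical implementation would likely weight the relation by its reliability rather than treating it as exact.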
According to another aspect of the present invention, a method defines a structure factor phase for a reflection derived from x-ray crystallography data. The method comprises generating a first probability distribution for the structure factor phase of the reflection. The method further comprises generating two or more additional probability distributions for the structure factor phases of other reflections. The method further comprises calculating a composite probability distribution for the structure factor phase of the reflection. The composite probability distribution is derived from the first probability distribution of the reflection and the two or more additional probability distributions of the other reflections.
According to another aspect of the present invention, the methods described herein are implemented on a computer readable medium having instructions stored thereon which cause a general purpose computer system to perform the methods described herein. According to another aspect of the present invention, a computer-implemented x-ray crystallography analysis system is programmed to perform the methods described herein.
According to another aspect of the present invention, a computer-implemented x-ray crystallography analysis system comprises a means for retrieving a first phase probability distribution corresponding to a selected structure factor phase of a selected reciprocal lattice vector. The system further comprises a means for retrieving a plurality of second phase probability distributions corresponding to other structure factor phases of other reciprocal lattice vectors. The system further comprises a means for combining the first phase probability distribution and the plurality of second phase probability distributions so as to produce a resultant phase probability distribution for the selected structure factor phase of the selected reciprocal lattice vector.
According to another aspect of the present invention, a method refines x-ray diffraction data. The method comprises combining structure factor phase probability distributions for different reciprocal lattice vectors so that the structure factor phase probability distribution for at least one of the reciprocal lattice vectors is more heavily weighted toward a phase value.