The search for new and better drugs motivates the need for faster and more accurate methods of molecular structure determination. Pharmaceutical companies need to discover and commercialize new pharmaceuticals quickly and efficiently. While the need for new drugs has increased in recent years, the productivity of drug discovery and development has not markedly improved. Despite a more than ten-fold increase in research and development spending by the pharmaceutical and biotechnology industries between 1976 and 1994, the number of new molecules approved by the FDA for pharmaceutical use remained relatively constant in the range of 12-30 per year throughout this period. The time spent on discovery, development, and commercialization of new pharmaceutical molecules has remained constant at 10-12 years per drug. There is a need to shorten the time and reduce the cost of drug discovery.
To better understand the problems in drug discovery, consider the current state of the art. Evidence from molecular biology points to the interaction of proteins with their targets as a fundamental biological mechanism underlying the normal state and numerous diseased states. For example, atherosclerosis (the principal cause of heart attack and stroke) and cancer, which are responsible for more than 50% of the mortality in the U.S., are triggered by the disordering of specific protein-protein recognition events. Early approaches to drug design depended on the chance observation of biological effects of a known compound or the screening of large numbers of exotic compounds, usually derived from natural sources, for any biological effects. The nature of the actual protein target was usually unknown. Characterizing diseases by the specific molecular recognition events involved presents opportunities for more rational drug discovery and development.
To date, however, rational drug design has met with only limited success. One approach to rational drug design is based on first determining the entire structure of the proteins involved, examining the structure for possible targets, and then predicting the structure of drug molecules likely to bind to the possible target. The physical structure in the drug molecule that binds to the target region on the protein or other ligand is called a pharmacophore. This approach requires that the locations of thousands of atoms in the protein be accurately determined before the drug design process begins. Recent structural studies on the potent anti-cancer drug 1843U89, an inhibitor of thymidylate synthetase (TS), indicate that the structure of the active site of TS is severely distorted upon binding of the drug (Weichsel et al., 1995, Nature Structural Biology 2:1095-1101; Stout et al., 1996, Structure 4:67-77). As this example indicates, even if the structure of the protein target TS were accurately characterized, rational drug design based on characterization of the targets would miss important potential drugs.
A more promising approach to rational drug design is based on diversity libraries, which are huge libraries of related molecules used to explore protein-target interactions (Clackson et al., 1994, Tibtech 12:173-184). From such a library only those members binding to the target of interest are selected. Two promising types of library are those that consist of small cyclic peptides, with 6 to 12 amino acids, and those that consist of small organic molecules. Methods are now available to create such libraries and to select library members that recognize a specific protein target (Goldman et al., 1992, BioTechnology 10:1557-1561). Once a diversity library is created, it is a source from which to select specific members that bind to targets of interest. Molecular biological methods can be used to identify, in a matter of days, either single molecules or small ensembles of molecules from these huge libraries that bind to targets with high affinity and specificity. For example, researchers at Arris Pharmaceutical (Giebel et al., 1995, Biochemistry 34:15430-15435; Katz, B. A., 1995, Biochemistry 34:15421-15429) and SmithKline Beecham Pharmaceuticals (Zhao et al., 1995, Nature Structural Biology 2:1131-1137) have screened diversity libraries to select binders for streptavidin (Giebel et al., 1995, Biochemistry 34:15430-15435). They found several high-affinity cyclic peptides. Further, the researchers postulate that upon determination of the peptide structures with crystallography, the high-affinity peptide ligands will lead to small organic ligands (Katz, B. A., 1995, Biochemistry 34:15421-15429).
Finding pharmaceutically promising small organics from molecules selected in a diversity-library screen involves determination of the structure followed by comparison against known structures of small organics through database searching. The step of finding small organics from an X-ray crystallographic structure by database search was done with the fibrinogen inhibitor REI-RGD34 (Zhao et al., 1995, Nature Structural Biology 2:1131-1137). Of the molecules returned from their database search, 11% were known binders to the target peptide. This search was based on a crystal structure of the inhibitor determined to 1 Å. As these examples indicate, the diversity methods can select molecules from a screen, and those molecules can be compared to suitable pharmaceutical leads if their molecular structure is determined with sufficient accuracy.
Translating the structure of a molecule selected from a screen to a small organic lead compound relies upon an accurate characterization of the molecular structure of the selected molecule. Experimental structure determination currently relies primarily upon X-ray crystallography and solution-state NMR (MacArthur et al., 1994, Trends in Biotechnology 12:149-153). X-ray crystallography depends on the interaction of electron clouds with X-rays to provide information on the location of every heavy atom in a crystal of interest. The accuracy of X-ray crystallography is 0.5-2.0 Å (1 Å = 10⁻⁸ cm). One of its principal drawbacks is the difficulty in obtaining the highly regular crystals necessary to achieve high resolution diffraction patterns. Although the art of crystal growing has improved, it remains insufficient. Many classes of biologically relevant proteins, for example transmembrane receptors, are highly resistant to crystallization. The difficulty is in part due to the existence of large hydrophilic and hydrophobic structural regions. Although an advance, co-crystallization of pairs of interacting species can still be difficult. Additional difficulties are the expense and time associated with obtaining high-quality crystals (MacArthur et al., 1994, Trends in Biotechnology 12:149-153), and uncertainty in whether or not the structures of the crystalline forms are representative of the in vivo conformations (Clore et al., 1994, Protein Science 3:372-390). The same problems arise when the target molecule is DNA or RNA.
Solution-state NMR, the second primary method, relies upon correlations between nuclear spins resulting from dipole-dipole interactions indirectly mediated by the electron clouds. High-resolution, multidimensional, solution-state NMR techniques are an attractive alternative to crystallography since they can be applied in situ (i.e., in an aqueous environment) to the study of small protein domains (Yu et al., 1994, Cell 76:933-945). Solution-state NMR has been successful at determining the structure of moderate-sized proteins and protein/ligand complexes to a 2 Å resolution, similar to that of X-ray methods (MacArthur et al., 1994, Trends in Biotechnology 12:149-153; Clore et al., 1994, Protein Science 3:372-390). The structure of a ligand (such as a pharmaceutical lead compound) bound to a protein is most efficiently determined when the protein is uniformly ¹³C and ¹⁵N labeled (at increased cost), and the binding occurs in the slow exchange limit (Clore et al., 1994, Protein Science 3:372-390). In this limit, a bound complex remains together long enough for resonances of the free and bound form of the ligand to be resolved. The slow exchange limit restricts the applicability of the technique to only those ligands that exhibit tight binding. Also, complex spectra make the analysis of the mutual correlations quite time consuming. Due to resolution problems resulting from increasingly complicated spectra and overlapping resonance lines, solution-state NMR has strict size limitations on the molecules that can be studied. The maximum size of a protein that is amenable to study is approximately 25 kDaltons (kD) (Clore et al., 1994, Protein Science 3:372-390). All of the above concerns and problems apply as well when the target molecules are DNA or RNA. In fact, a lack of resolution, along with a dearth of mutual correlations, limits the size of DNA and RNA molecules to less than 40 total bases.
The resolution obtainable by either crystallography or solution-state NMR has been inadequate for effective rational drug design, especially for the selection of a lead compound from databases of organic compounds. The resolution required to achieve both drug affinity and drug specificity, although not precisely known, is probably measured in fractions of an Å, or even 0.1 Å (MacArthur et al., 1994, Trends in Biotechnology 12:149-153). This accuracy appears to be beyond the capabilities of X-ray and solution-state NMR methods.
Improved methods of molecular structural determination, especially those applicable to drug discovery, have great utility. One such method that avoids some limitations associated with X-ray crystallography and solution-state NMR, and has significant advantages, is solid-state NMR, particularly dipolar-dephasing experiments such as rotational echo double resonance (REDOR) (Gullion et al., 1989, Journal of Magnetic Resonance 81:196-200; Gullion et al., 1989, Advances in Magnetic Resonance 13:57-83). Compared with crystallography, solid-state NMR has the advantage that it obtains high-resolution structural information from polycrystalline and disordered materials. This eliminates the need for the formation of highly regular crystals to achieve high resolution diffraction, and eliminates structural perturbations due to crystal packing forces. In contrast to solution-state NMR, which relies upon mutual correlations between nuclei from the indirect dipolar coupling (studied via the nuclear Overhauser effect, NOE) that fall off as 1/r⁶, solid-state NMR relies upon the direct dipolar coupling, which decreases as 1/r³, for the measurement of internuclear distances, where r is the internuclear distance. As a result, longer distances can be measured with solid-state NMR, and the distances measured have a higher degree of accuracy and precision. Furthermore, solid-state NMR is not strictly limited by the size of the complex resulting from the drug bound to a target molecule. In the solid-state NMR experiments, the size limitations are determined primarily by the quantity of the sample available, and the sensitivity of the NMR spectrometer.
One advantage of the REDOR transform technique over solution-state NMR measurement is the direct and accurate determination of the internuclear distance from a measured frequency. Solution-state NMR experiments rely upon the indirect measurement of the dipolar coupling for distance measurements. In solution-state NMR there is no direct relationship between an experimentally measured parameter and the distance. Instead, the strength of the coupling, as inferred from the Nuclear Overhauser effect (NOE), is related to a range of possible distances spanning a few Angstroms.
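The direct relationship between a measured dipolar frequency and the internuclear distance can be illustrated with a short sketch. The text does not state the formula; the sketch assumes the standard expression for the dipolar coupling constant, D = (μ0/4π)·|γI·γS|·ħ/(2π·r³) in Hz, and uses approximate literature values for the ¹³C and ¹⁵N gyromagnetic ratios. The function names are hypothetical and chosen only for illustration.

```python
import math

# Physical constants (SI). The gyromagnetic ratios are approximate
# literature values and are an assumption of this sketch.
MU0_OVER_4PI = 1e-7          # T^2 m^3 J^-1
HBAR = 1.054571817e-34       # J s
GAMMA_13C = 6.728284e7       # rad s^-1 T^-1
GAMMA_15N = -2.71261804e7    # rad s^-1 T^-1


def dipolar_coupling_hz(r_angstrom, gamma_i=GAMMA_13C, gamma_s=GAMMA_15N):
    """Magnitude of the dipolar coupling constant D (Hz) at distance r (angstroms)."""
    r_m = r_angstrom * 1e-10
    return MU0_OVER_4PI * abs(gamma_i * gamma_s) * HBAR / (2 * math.pi * r_m ** 3)


def distance_angstrom(d_hz, gamma_i=GAMMA_13C, gamma_s=GAMMA_15N):
    """Invert D = k / r^3 to recover the distance (angstroms) from D (Hz)."""
    k = MU0_OVER_4PI * abs(gamma_i * gamma_s) * HBAR / (2 * math.pi)
    return (k / d_hz) ** (1.0 / 3.0) * 1e10


if __name__ == "__main__":
    for r in (2.0, 4.0, 6.0):
        print(f"r = {r:.1f} A  ->  D = {dipolar_coupling_hz(r):.1f} Hz")
```

Because D scales as 1/r³, a relative error in the measured frequency translates into one third of that relative error in the distance, which is one reason a frequency measurement yields a tightly constrained distance rather than a range.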
For clarity, and to fit the experimental example presented in this invention, the discussion here will be limited to REDOR, although it can be generalized to other dipolar-dephasing (dipolar-recoupling) methods such as TEDOR (Hing et al., 1993, Journal of Magnetic Resonance, Series A 103:151-162; Hing et al., 1992, Journal of Magnetic Resonance 96:205-209), DRAMA (Tycko et al., 1990, Chemical Physics Letters 173:461-465; Tycko et al., 1993, Journal of Chemical Physics 98:932-943), DRAWS (Gregory et al., In 36th Experimental Nuclear Magnetic Resonance Conference; Boston, Mass., 1995; p 289), and MELODRAMA (Sun et al., 1995, Journal of Chemical Physics 102:702-707). REDOR is a high-resolution solid-state NMR technique for measuring the distance between heteronuclei in varied solid state materials. As applied to biological materials, this has primarily meant the distance between one ¹³C atom and one ¹⁵N atom (Marshall et al., 1990, Journal of the American Chemical Society 112:963-966; Garbow et al., 1993, Journal of the American Chemical Society 115:238-244). Because of the nature of nuclear magnetic interactions in the solid state, REDOR has the inherent ability to measure internuclear distances with a high degree of accuracy and precision. REDOR measurements are accurate to better than 0.05 Å when the ¹³C-¹⁵N distances are from 0 to 4 Å, and to better than 0.1 Å when the ¹³C-¹⁵N distances are from 4 to 6 Å.
REDOR data has an important drawback that has limited its applicability and utility: conventional analysis and processing allows the measurement of only one distance at a time from nuclear spins with degenerate chemical shifts. The reason for this drawback is that conventional REDOR data analysis relies upon a simplistic numerical calculation of the powder average of the disordered material. This analysis results in a universal curve (an example is shown in FIG. 2), having a shape related to the internuclear distance. Conventional analysis methods cannot separate overlapping universal curves. Therefore, in the conventional method of analysis, the simultaneous measurement of more than one distance from spins with degenerate chemical shifts is impractical.
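The conventional powder-average calculation behind the universal curve can be sketched as follows. This is an illustration under stated assumptions, not the analysis of this invention: it assumes the commonly used REDOR dephasing phase ΔΦ = 2√2·λ·sin(2β)·sin(α) for an isolated spin pair, where λ = Nc·Tr·D is the dimensionless product of the number of rotor cycles, the rotor period, and the dipolar coupling in Hz, and α, β are powder angles. The function name and grid sizes are hypothetical.

```python
import math


def redor_powder_average(lam, n_alpha=90, n_beta=90):
    """Numerically powder-averaged REDOR signal S/S0 for lam = Nc * Tr * D.

    Assumes an isolated spin pair and the dephasing phase
    2*sqrt(2)*lam*sin(2*beta)*sin(alpha); midpoint rule over both angles.
    """
    total = 0.0
    weight = 0.0
    for i in range(n_beta):
        beta = (i + 0.5) * math.pi / n_beta            # polar angle, 0..pi
        w = math.sin(beta)                             # solid-angle weight
        for j in range(n_alpha):
            alpha = (j + 0.5) * 2.0 * math.pi / n_alpha  # azimuth, 0..2*pi
            phase = 2.0 * math.sqrt(2.0) * lam * math.sin(2.0 * beta) * math.sin(alpha)
            total += w * math.cos(phase)
        weight += w * n_alpha
    return total / weight


if __name__ == "__main__":
    # Tabulate the dephasing Delta S / S0 = 1 - S/S0 against lam.
    for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"lam = {lam:.2f}  dS/S0 = {1.0 - redor_powder_average(lam):.4f}")
```

Every spin pair traces the same curve when plotted against λ, so the distance enters only through the scaling of the time axis. When two couplings contribute to the same resonance, the measured signal is a weighted sum of two such curves, which is the overlap that curve fitting of this kind cannot readily separate.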
A second area for improvement in dipolar-dephasing experiments is the elimination of the signal from uncoupled natural abundance nuclei that introduce complications into the processing of the data. One experiment that eliminates this natural abundance contribution is TEDOR. The analysis of TEDOR data, however, is cumbersome and limits its applicability. A method combining the ease of analysis of REDOR and the low background contribution of TEDOR would have clear utility.
A limitation of REDOR as it is currently practiced is the slow nature of the information gathering: usually only one distance at a time is measured between specifically labeled spins. Measurement of a number of distances, for instance at an active binding site in a molecule or a pharmacophore in a drug, is then both time consuming and expensive, even though each distance obtained by this method is accurate and precise. Garbow and Gullion (Garbow et al., 1991, Journal of Magnetic Resonance 95:442-445) have shown that these burdens can be reduced by the measurement of REDOR signals from chemically shifted nuclei. However, as the size of a molecule increases, this strategy is limited by degenerate chemical shifts.
In a conventional two-dimensional NMR experiment, both dimensions in the time-domain data set are functions of sines and cosines. A two-dimensional frequency-domain spectrum is generated by Fourier transformation of both dimensions. In the solid-state NMR dipolar-dephasing experiments, only the second dimension is a function of sines and cosines; the first dimension is a function of cylindrical Bessel functions of fractional order. Thus, the standard method of Fourier transforming both time-domain dimensions to get a two-dimensional frequency-domain spectrum is inappropriate for the dipolar-dephasing method. Previous two-dimensional solid-state NMR experiments (employing a Fourier transform in both dimensions) have measured the dipolar coupling versus the chemical shift, but these techniques are suitable only for strong dipolar couplings and not for the weak dipolar couplings measured in the dipolar-dephasing experiments.
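The Bessel-function character of the dephasing dimension can be made concrete with a small sketch. The text does not give the expression; the sketch assumes the closed form commonly quoted in the REDOR literature for an isolated spin pair, S/S0 = (√2·π/4)·J₁/₄(√2·λ)·J₋₁/₄(√2·λ), with λ = Nc·Tr·D. The Bessel functions are evaluated with a plain power series, which is adequate only for the moderate arguments used here; all function names are hypothetical.

```python
import math


def bessel_j(nu, x, terms=30):
    """Power-series J_nu(x) for non-integer nu and moderate x > 0."""
    half = x / 2.0
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k) * half ** (2 * k + nu) / (
            math.factorial(k) * math.gamma(k + nu + 1.0)
        )
    return total


def redor_closed_form(lam):
    """S/S0 as a product of Bessel functions of order +1/4 and -1/4 (assumed form)."""
    if lam == 0.0:
        return 1.0  # no dephasing; also avoids 0 ** (-1/4) in the series
    x = math.sqrt(2.0) * lam
    return math.sqrt(2.0) * math.pi / 4.0 * bessel_j(0.25, x) * bessel_j(-0.25, x)


if __name__ == "__main__":
    for lam in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(f"lam = {lam:.1f}  S/S0 = {redor_closed_form(lam):+.4f}")
```

Because the kernel is a product of fractional-order Bessel functions rather than a sine or cosine, a Fourier transform along this dimension does not map a single coupling onto a single sharp line, which is consistent with the point made above.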
Although the experiment of van Eck and Veeman (van Eck et al., 1994, Journal of Magnetic Resonance, Series A 109:250-252) is a three-dimensional experiment, they analyzed the data by Fourier transformation of the TEDOR dimension. Thus, their results lack the high resolution and straightforward interpretation inherent in the three-dimensional method put forth in this invention.
Information relating to molecular structure is clearly valuable, as are methods and devices for generating it. Solid-state NMR is a method of great utility for generating such information in the form of precise distances for selected pairs of nuclei, yet the state-of-the-art methods for analyzing the signals generated by solid-state NMR experiments all have certain limitations. There is therefore a need for a new type of analysis method that: produces high-precision information from experimental time-domain data in the form of frequency-domain spectra; produces high-precision distance information from the time-domain data; suppresses noise in the time-domain experimental data; provides an internal consistency check on generated spectra; separates contributions from several dipolar couplings in a single time-domain signal; accepts time-domain data in which the signal has not decayed to zero at the final point; does not require smoothing of the time-domain data, with its subsequent line broadening; is built naturally on a representation of data collected at discrete points in time, rather than continuously over a certain period; handles naturally the noise from natural abundance signals; and, in summary, provides the best possible frequency-domain and distance-domain spectra from a time-domain signal according to a precise, information-theoretic definition.
Citation of references herein and above shall not be construed as an admission that such reference is prior art to the present invention.