For many years, there has been intense interest in determining the three-dimensional structures of biological macromolecules, particularly proteins. So-called "structure-function" studies have been carried out to determine the structural features of a molecule, or class of molecules, that are important for biological activity. Since the pioneering work of Perutz and coworkers on the structure of hemoglobin (Perutz, M. F. et al., Nature, 185:416-22 (1960)) and that of Watson and Crick on DNA in the 1950s (Watson, J. D. and Crick, F. H. C., Nature, 171:737 (1953)), both of which led to the respective scientists receiving the Nobel Prize, this field has been of major importance in the biological sciences.
More recently, the concept of "rational drug design" has evolved. This strategy for the design of drugs involves determining the three-dimensional structure of an "active part" of a particular biological molecule, such as a protein. Knowing the three-dimensional structure of the active part can enable scientists to design a synthetic analogue of the active part that will block, mimic or enhance the natural biological activity of the molecule (Appelt, K. et al., J. Med. Chem., 34:1925 (1991)). The biological molecule may, for example, be a receptor, an enzyme, a hormone, or other biologically active molecule. Determining the three-dimensional structures of biological molecules is, therefore, of great practical and commercial significance.
The first technique developed to determine three-dimensional structures was X-ray crystallography. The structures of hemoglobin and DNA were determined using this technique. In X-ray crystallography, a crystal (or fiber) of the material to be examined is bombarded with a beam of X-rays, which are diffracted by the atoms of the ordered molecules in the crystal. The scattered X-rays are captured on a photographic plate, which is then developed using standard techniques. The diffracted X-rays are thus visualized as a series of spots on the plate, and from this pattern the structure of the molecules in the crystal can be determined. For larger molecules, it is frequently necessary to crystallize the material with a heavy ion, such as ruthenium, in order to remove ambiguity due to phase differences.
More recently, a second technique, nuclear magnetic resonance (NMR) spectroscopy, has been developed to determine the three-dimensional structures of biological molecules, particularly proteins. NMR was originally developed in the 1950s and has evolved into a powerful procedure to analyze the structure of small compounds such as those with a molecular weight of ≤1000 Daltons. Briefly, the technique involves placing the material to be examined (usually in a suitable solvent) in a powerful magnetic field and irradiating it with radio frequency (rf) electromagnetic radiation. The nuclei of the various atoms will align themselves with the magnetic field until energized by the rf radiation. They then absorb this resonant energy and re-radiate it at a frequency dependent on i) the type of nucleus and ii) its atomic environment. Moreover, resonant energy can be passed from one nucleus to another, either through bonds or through three-dimensional space, thus giving information about the environment of a particular nucleus and nuclei in its vicinity.
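The dependence of the resonance frequency on the type of nucleus can be illustrated with the standard Larmor relation, in which the frequency is the product of the nucleus's gyromagnetic ratio and the applied field strength. The following is a minimal sketch, not part of the original disclosure: the gyromagnetic ratios are rounded literature values, the field strength of 11.74 tesla and the function name `larmor_frequency_mhz` are chosen only for illustration, and the much smaller shift caused by the atomic environment (the chemical shift) is ignored.

```python
# Approximate gyromagnetic ratios divided by 2*pi, in MHz per tesla.
# These are rounded literature values used only for illustration.
GAMMA_BAR_MHZ_PER_T = {
    "1H": 42.577,
    "2H": 6.536,
    "13C": 10.708,
    "15N": -4.316,  # negative sign reflects the sense of precession
}


def larmor_frequency_mhz(nucleus: str, b0_tesla: float) -> float:
    """Return the magnitude of the resonance frequency (MHz) of a bare nucleus.

    Chemical-shift effects from the atomic environment are ignored here.
    """
    return abs(GAMMA_BAR_MHZ_PER_T[nucleus]) * b0_tesla


if __name__ == "__main__":
    b0 = 11.74  # tesla; a hypothetical magnet strength
    for nucleus in GAMMA_BAR_MHZ_PER_T:
        print(f"{nucleus}: {larmor_frequency_mhz(nucleus, b0):.1f} MHz")
```

Under these assumptions, each type of nucleus resonates in its own, widely separated frequency band (roughly 500 MHz for hydrogen nuclei at this field), which is why a spectrum recorded for one type of nucleus does not show signals from the others.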
However, it is important to recognize that not all nuclei are NMR active. Indeed, not all isotopes of the same element are active. For example, whereas "ordinary" hydrogen, ¹H, is NMR active, heavy hydrogen (deuterium), ²H, is not active in the same way. Thus, any material that normally contains ¹H hydrogen can be rendered "invisible" in the hydrogen NMR spectrum by replacing all the ¹H hydrogens with ²H. It is for this reason that NMR spectroscopic analyses of water-soluble materials frequently are performed in ²H₂O to eliminate the water signal.
Conversely, "ordinary" carbon, ¹²C, is NMR inactive whereas the stable isotope, ¹³C, present to about 1% of total carbon in nature, is active. Similarly, while "ordinary" nitrogen, ¹⁴N, is NMR active, it has undesirable properties for NMR and resonates at a different frequency from the stable isotope ¹⁵N, present to about 0.4% of total nitrogen in nature. For small molecules, these low-level natural abundances were sufficient to generate the required experimental information, provided that the experiment was conducted with sufficient quantities of material and for a sufficient time.
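The trade-off between isotopic abundance and experiment time can be made concrete with a rough calculation. The sketch below assumes that the signal scales linearly with the fraction of NMR-active nuclei and relies on the general rule that the signal-to-noise ratio obtained by averaging grows with the square root of the number of scans. The 99% enrichment level and the function names are hypothetical, and the abundances are the approximate figures given above. Sensitivity is only one of several factors that matter in practice.

```python
def signal_gain(natural_abundance: float, enriched_fraction: float) -> float:
    """Relative signal increase when the active isotope is enriched.

    Assumes signal is proportional to the fraction of NMR-active nuclei.
    """
    return enriched_fraction / natural_abundance


def equivalent_scan_factor(gain: float) -> float:
    """Factor by which the number of scans would have to grow to match `gain`.

    Signal-to-noise from averaging grows as sqrt(number of scans), so matching
    a gain g by averaging alone takes g**2 times as many scans.
    """
    return gain ** 2


if __name__ == "__main__":
    # Approximate natural abundances from the text; 0.99 is a hypothetical
    # enrichment level chosen for illustration.
    for isotope, abundance in (("13C", 0.01), ("15N", 0.004)):
        gain = signal_gain(abundance, 0.99)
        scans = equivalent_scan_factor(gain)
        print(f"{isotope}: signal gain {gain:.0f}x, "
              f"equivalent to {scans:.0f}x more scans")
```

On these assumptions, a sample at natural abundance would need thousands of times more acquisition time to match an enriched sample, which is consistent with natural abundance being workable only with ample material and long experiments.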
As advances in hardware and software were made, the size of molecules that could be analyzed by these techniques increased to about 10 kD, the size of a small protein. Thus, the application of NMR spectroscopy to protein structural determinations began only a few years ago. It was quickly realized that this size limit could be raised by substituting the naturally predominant isotopes ¹⁴N and ¹²C in the protein with the NMR active stable isotopes ¹⁵N and ¹³C.
Over the past few years, labeling proteins with ¹⁵N and ¹⁵N/¹³C has raised the analytical molecular size limit to approximately 15 kD and 40 kD, respectively. More recently, partial deuteration of the protein in addition to ¹³C- and ¹⁵N-labeling has increased the size of proteins and protein complexes that can be analyzed still further, to approximately 60-70 kD. See Shan et al., J. Am. Chem. Soc., 118:6570-6579 (1996) and references cited therein.
Isotopic substitution is usually accomplished by growing a bacterium or yeast, transformed by genetic engineering to produce the protein of choice, in a growth medium containing ¹³C-, ¹⁵N- and/or ²H-labeled substrates. In practice, bacterial growth media usually consist of ¹³C-labeled glucose and/or ¹⁵N-labeled ammonium salts dissolved in D₂O where necessary. See Kay, L. et al., Science, 249:411 (1990) and references therein, and Bax, A., J. Am. Chem. Soc., 115, 4369 (1993). More recently, isotopically labeled media especially adapted for the labeling of bacterially produced macromolecules have been described. See U.S. Pat. No. 5,324,658.
The goal of these methods has been to achieve universal and/or random isotopic enrichment of all of the amino acids of the protein. By contrast, some workers have described methods whereby certain residues can be relatively enriched in ¹H, ²H, ¹³C and ¹⁵N. For example, Kay et al., J. Mol. Biol., 263, 627-636 (1996) and Kay et al., J. Am. Chem. Soc., 119, 7599-7600 (1997) have described methods whereby isoleucine, alanine, valine and leucine residues in a protein may be labeled with ²H, ¹³C and ¹⁵N, but specifically labeled with ¹H at the terminal methyl position. In this way, study of the proton-proton interactions between some of the hydrophobic amino acids may be facilitated. Similarly, a cell-free system has been described by Yokoyama et al., J. Biomol. NMR, 6(2), 129-134 (1995), wherein a transcription-translation system derived from E. coli was used to express human Ha-Ras protein incorporating ¹⁵N serine and/or aspartic acid.
These methods are important, in that they provide additional means for interpreting the complex spectra obtained from proteins. However, it should be noted that the Kay et al. methods are limited to the aliphatic amino acids described above. By contrast, the method described by Yokoyama will facilitate the selective enrichment of any amino acid, but is limited to those proteins that can be expressed in a cell-free system. Glycoproteins, for example, may not be expressed in this system.
Techniques for producing isotopically labeled proteins and macromolecules, such as glycoproteins, in mammalian or insect cells have been described. See U.S. Pat. Nos. 5,393,669 and 5,627,044; Weller, C. T., Biochem., 35, 8815-23 (1996) and Lustbader, J. W., J. Biomol. NMR, 7, 295-304 (1996). Weller et al. applied these techniques to the determination of the structure of a glycoprotein including its glycosyl sidechain.
While the above techniques represent remarkable advances in this field, they each suffer from certain disadvantages. For example, all are time-consuming. In X-ray crystallographic methods, crystals can take years to form before the experiment even starts. In NMR spectroscopy, although the protein sample may be used immediately in the NMR experiment, processing the data obtained, i.e., determining which signal arises from which atom or set of atoms (the "assignments"), may also take years. Modern drug discovery research depends heavily on knowledge of the structures of biologically active macromolecules. This research would benefit substantially from enhancements in the capabilities and speed of three-dimensional structural analyses of proteins and other macromolecules.
In the past few years, alternative, rapid methods for the identification of candidate drugs have proliferated. Genomic techniques, using rapid DNA sequencing methods and computer-assisted homology identification, have enabled the rapid identification of target proteins as potential drug candidates. See O'Brien, C., Nature, 385 (6616):472 (1997). Once identified, a target protein can be quickly produced using modern recombinant technology. Combinatorial chemistry, wherein large numbers of chemical compounds are simultaneously synthesized on plastic plates, frequently by robots, has revolutionized the synthesis of drug candidates, making it possible to synthesize tens of thousands of compounds ("libraries") in a few months. See Gordon, F. M. et al., J. Med. Chem., 37(10), 1385-1401 (1994). The library is then "screened" by allowing each member of the library to come into contact with the target protein. Those that bind are identified, and similar compounds are synthesized and screened. The whole process continues in an iterative manner until a drug candidate of suitably high binding affinity has been identified. One variation of this screening strategy has recently been published by Fesik et al., Science, 274, 1531-34 (1996), wherein the screening of the libraries takes place using NMR against an isotopically labeled protein and the binding is detected from perturbations in the NMR spectrum.
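The iterative screen-and-resynthesize cycle described above can be sketched as a simple loop. This is a toy model only and is not drawn from the cited references: compounds are represented as single numbers between 0 and 1, `toy_affinity` is a made-up scoring function standing in for a real binding assay, `make_analogues` stands in for the synthesis of similar compounds, and all names and parameter values are hypothetical.

```python
import random


def toy_affinity(compound: float) -> float:
    """Hypothetical binding score in [0, 1]; peaks at an arbitrary optimum of 0.7."""
    return max(0.0, 1.0 - abs(compound - 0.7))


def make_analogues(hit: float, count: int, spread: float,
                   rng: random.Random) -> list[float]:
    """Stand-in for synthesizing compounds similar to a hit."""
    return [min(1.0, max(0.0, hit + rng.uniform(-spread, spread)))
            for _ in range(count)]


def iterative_screen(library: list[float], target_affinity: float,
                     keep: int = 5, analogues_per_hit: int = 20,
                     spread: float = 0.1, max_rounds: int = 20,
                     seed: int = 0) -> tuple[float, int]:
    """Screen a non-empty library, resynthesize around the hits, and repeat.

    Returns the best compound found and the number of screening rounds run.
    """
    rng = random.Random(seed)
    best = library[0]
    for round_no in range(1, max_rounds + 1):
        # "Screen": rank every library member against the target.
        hits = sorted(library, key=toy_affinity, reverse=True)[:keep]
        best = hits[0]
        if toy_affinity(best) >= target_affinity:
            return best, round_no
        # "Synthesize similar compounds": keep the hits and add analogues,
        # so the best score can never get worse from round to round.
        library = hits + [analogue for hit in hits
                          for analogue in make_analogues(
                              hit, analogues_per_hit, spread, rng)]
    return best, max_rounds


if __name__ == "__main__":
    starting_library = [i / 10 for i in range(11)]
    compound, rounds = iterative_screen(starting_library, target_affinity=0.99)
    print(f"best compound {compound:.3f}, "
          f"score {toy_affinity(compound):.3f}, rounds {rounds}")
```

In this sketch, each round corresponds to one synthesis-and-screening cycle, so the number of rounds is a rough proxy for the elapsed time of a real campaign.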
Prior knowledge of the three-dimensional structure of a target protein can enable the design of a "focused" combinatorial library, thereby increasing the likelihood of finding potential drug candidates that interact with the biological molecule of interest. However, whereas genomics and combinatorial chemistry each can be performed in months, known methods for protein structural determinations usually take much longer. Therefore, there is a need for methods to increase the speed with which high resolution structures of proteins, including those that are the targets of potential drug candidates, may be determined.