This invention is concerned with determining the three-dimensional structure of biological macromolecules, especially proteins. In particular, it is concerned with methods for rapidly determining protein structures by NMR spectroscopy, by providing methods for simplifying NMR spectra using labeled proteins prepared from specifically isotopically labeled amino acids, and the means whereby these labeled proteins and amino acids may be obtained.
For many years, there has been intense interest in determining the three-dimensional structures of biological macromolecules, particularly proteins. So-called “structure-function” studies have been carried out to determine the structural features of a molecule, or class of molecules, that are important for biological activity. Since the pioneering work of Perutz and coworkers on the structure of hemoglobin (Perutz, M. F. et al., Nature, 185:416-22 (1960)) and that of Watson and Crick on DNA in the 1950's (Watson, J. D. and Crick, F. H. C., Nature, 171:737 (1953)), both of which led to the respective scientists receiving the Nobel Prize, this field has been of major importance in the biological sciences.
More recently, the concept of “rational drug design” has evolved. This strategy for the design of drugs involves determining the three-dimensional structure of an “active part” of a particular biological molecule, such as a protein. Knowing the three-dimensional structure of the active part can enable scientists to design a synthetic analogue of the active part that will block, mimic or enhance the natural biological activity of the molecule (Appelt, K. et al., J. Med. Chem., 34:1925 (1991)). The biological molecule may, for example, be a receptor, an enzyme, a hormone, or other biologically active molecule. Determining the three-dimensional structures of biological molecules is, therefore, of great practical and commercial significance.
The first technique developed to determine three-dimensional structures was X-ray crystallography. The structures of hemoglobin and DNA were determined using this technique. In X-ray crystallography, a crystal (or fiber) of the material to be examined is bombarded with a beam of X-rays, which are diffracted by the atoms of the ordered molecules in the crystal. The scattered X-rays are captured on a photographic plate, which is then developed using standard techniques. The diffracted X-rays are thus visualized as a series of spots on the plate, and from this pattern the structure of the molecules in the crystal can be determined. For larger molecules, it is frequently necessary to crystallize the material with a heavy ion, such as ruthenium, in order to remove ambiguity due to phase differences.
More recently, a second technique, nuclear magnetic resonance (NMR) spectroscopy, has been developed to determine the three-dimensional structures of biological molecules, particularly proteins. NMR was originally developed in the 1950's and has evolved into a powerful procedure to analyze the structure of small compounds such as those with a molecular weight of ≤1000 Daltons. Briefly, the technique involves placing the material to be examined (usually in a suitable solvent) in a powerful magnetic field and irradiating it with radio frequency (rf) electromagnetic radiation. The nuclei of the various atoms will align themselves with the magnetic field until energized by the rf radiation. They then absorb this resonant energy and re-radiate it at a frequency dependent on i) the type of nucleus and ii) its atomic environment. Moreover, resonant energy can be passed from one nucleus to another, either through bonds or through three-dimensional space, thus giving information about the environment of a particular nucleus and nuclei in its vicinity.
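The dependence of the resonance frequency on the type of nucleus can be illustrated with a short calculation. The following is a minimal sketch, not part of the described methods: the gyromagnetic ratios are rounded textbook values, and the field strength of 14.1 tesla and the function name `larmor_mhz` are assumptions chosen only for illustration.

```python
# Approximate gyromagnetic ratios divided by 2*pi, in MHz per tesla.
# These are rounded textbook values and are assumptions of this sketch.
GAMMA_BAR_MHZ_PER_T = {
    "1H": 42.577,
    "2H": 6.536,
    "13C": 10.708,
    "15N": -4.316,  # negative sign: 15N precesses in the opposite sense
}


def larmor_mhz(nucleus: str, field_tesla: float) -> float:
    """Return the magnitude of the resonance (Larmor) frequency in MHz."""
    return abs(GAMMA_BAR_MHZ_PER_T[nucleus]) * field_tesla


if __name__ == "__main__":
    field = 14.1  # tesla; hypothetical magnet, roughly a "600 MHz" instrument
    for nucleus in GAMMA_BAR_MHZ_PER_T:
        print(f"{nucleus:>3}: {larmor_mhz(nucleus, field):7.1f} MHz")
```

In the same field, each isotope resonates in its own frequency range, which is why the isotopes in a sample can be observed and manipulated separately.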
However, it is important to recognize that not all nuclei are NMR active. Indeed, not all isotopes of the same element are active. For example, whereas “ordinary” hydrogen, 1H, is NMR active, heavy hydrogen (deuterium), 2H, is not active in the same way. Thus, any material that normally contains 1H hydrogen can be rendered “invisible” in the hydrogen NMR spectrum by replacing all the 1H hydrogens with 2H. It is for this reason that NMR spectroscopic analyses of water-soluble materials frequently are performed in 2H2O to eliminate the water signal.
Conversely, “ordinary” carbon, 12C, is NMR inactive whereas the stable isotope, 13C, present to about 1% of total carbon in nature, is active. Similarly, while “ordinary” nitrogen, 14N, is NMR active, it has undesirable properties for NMR and resonates at a different frequency from the stable isotope 15N, present to about 0.4% of total nitrogen in nature. For small molecules, these low level natural abundances were sufficient to generate the required experimental information, provided that the experiment was conducted with sufficient quantities of material and for a sufficient time.
As advances in hardware and software were made, the size of molecules that could be analyzed by these techniques increased to about 10 kD, the size of a small protein. Thus, the application of NMR spectroscopy to protein structural determinations began only a few years ago. It was quickly realized that this size limit could be raised by substituting the NMR inactive isotopes 14N and 12C in the protein with the NMR active stable isotopes 15N and 13C.
Over the past few years, labeling proteins with 15N and 15N/13C has raised the analytical molecular size limit to approximately 15 kD and 40 kD, respectively. More recently, partial deuteration of the protein in addition to 13C- and 15N-labeling has increased the size of proteins and protein complexes still further, to approximately 60-70 kD. See Shan et al., J. Am. Chem. Soc., 118:6570-6579 (1996) and references cited therein.
Isotopic substitution is usually accomplished by growing a bacterium or yeast, transformed by genetic engineering to produce the protein of choice, in a growth medium containing 13C-, 15N- and/or 2H-labeled substrates. In practice, bacterial growth media usually consist of 13C-labeled glucose and/or 15N-labeled ammonium salts dissolved in D2O where necessary. Kay, L. et al., Science, 249:411 (1990) and references therein and Bax, A., J. Am. Chem. Soc., 115, 4369 (1993). More recently, isotopically labeled media especially adapted for the labeling of bacterially produced macromolecules have been described. See U.S. Pat. No. 5,324,658.
The goal of these methods has been to achieve universal and/or random isotopic enrichment of all of the amino acids of the protein. By contrast, some workers have described methods whereby certain residues can be relatively enriched in 1H, 2H, 13C and 15N. For example, Kay et al., J. Mol. Biol., 263, 627-636 (1996) and Kay et al., J. Am. Chem. Soc., 119, 7599-7600 (1997) have described methods whereby isoleucine, alanine, valine and leucine residues in a protein may be labeled with 2H, 13C and 15N, but specifically labeled with 1H at the terminal methyl position. In this way, study of the proton-proton interactions between some of the hydrophobic amino acids may be facilitated. Similarly, a cell-free system has been described by Yokoyama et al., J. Biomol. NMR, 6(2), 129-134 (1995), wherein a transcription-translation system derived from E. coli was used to express human Ha-Ras protein incorporating 15N serine and/or aspartic acid.
These methods are important, in that they provide additional means for interpreting the complex spectra obtained from proteins. However, it should be noted that the Kay et al. methods are limited to the aliphatic amino acids described above. By contrast, the method described by Yokoyama will facilitate the selective enrichment of any amino acid, but is limited to those proteins that can be expressed in a cell-free system. Glycoproteins, for example, may not be expressed in this system.
Techniques for producing isotopically labeled proteins and macromolecules, such as glycoproteins, in mammalian or insect cells have been described. See U.S. Pat. Nos. 5,393,669 and 5,627,044; Weller, C. T., Biochem., 35, 8815-23 (1996) and Lustbader, J. W., J. Biomol. NMR, 7, 295-304 (1996). Weller et al. applied these techniques to the determination of the structure of a glycoprotein including its glycosyl sidechain.
While the above techniques represent remarkable advances in this field, they each suffer from certain disadvantages. For example, all are time-consuming. In X-ray crystallographic methods, crystals can take years to form before the experiment even starts. In NMR spectroscopy, although the protein sample may be used immediately in the NMR experiment, processing the data obtained, i.e., determining which signal comes from which atom or set of atoms (the “assignments”), may also take years. Modern drug discovery research depends heavily on knowledge of the structures of biologically active macromolecules. This research would benefit substantially from enhancements in the capabilities and speed of three-dimensional structural analyses of proteins and other macromolecules.
In the past few years, growth in discovering alternative, rapid methods for the identification of candidate drugs has occurred. Genomic techniques, using rapid DNA sequencing methods and computer assisted homology identification, have enabled the rapid identification of target proteins as potential drug candidates. O'Brien, C., Nature, 385 (6616):472 (1997). Once identified, a target protein can be quickly produced using modern recombinant technology. Combinatorial chemistry, wherein large numbers of chemical compounds are simultaneously synthesized on plastic plates, frequently by robots, has revolutionized the synthesis of drug candidates, with tens of thousands of compounds (“libraries”) able to be synthesized in a few months. See Gordon, F. M. et al., J. Mol. Chem., 37(10), 1385-1401 (1994). The library is then “screened” by allowing each member of the library to come into contact with the target protein. Those that bind are identified, and similar compounds are synthesized and screened. The whole process continues in an iterative manner until a drug candidate of suitably high binding affinity has been identified. One variation of this screening strategy has recently been published by Fesik et al., Science, 274, 1531-34 (1996), wherein the screening of the libraries takes place using NMR against an isotopically labeled protein and the binding is detected from perturbations in the NMR spectrum.
Prior knowledge of the three-dimensional structure of a target protein can enable the design of a “focused” combinatorial library, thereby increasing the likelihood of finding potential drug candidates that interact with the biological molecule of interest. However, whereas genomic and combinatorial chemistry each can be performed in months, known methods for protein structural determinations usually take much longer. Therefore, there is a need for methods to increase the speed with which high resolution structures of proteins, including those that are the targets of potential drug candidates, may be determined.
The present invention provides novel labeled proteins that are isotopically labeled in the backbone structure, but not in the amino acid side chains. The invention also provides novel cell culture media that contain one or more amino acids isotopically labeled in the backbone structure but not in the side chain, and methods for making a labeled protein by cultivating a protein-producing cell culture on such a culture medium.
In another aspect, the invention provides a method for determining the three-dimensional structure of a protein wherein at least one of the amino acids in the protein is specifically labeled in its backbone but not its side chain with any combination of the NMR isotopes 2H, 13C and 15N.
In yet another aspect of the present invention, a method is provided for rapidly assigning the signals in the NMR spectrum of a protein wherein at least one of the amino acids in the protein is specifically labeled in its backbone, but not its side chain with any combination of the NMR isotopes 2H, 13C and 15N.
In preferred embodiments of these various aspects of the invention, the amino acids contained in the culture media and incorporated into the protein structure are labeled in the backbone with 13C and 15N and optionally with 2H.
The invention provides a means for rapidly determining the three-dimensional structure of proteins by NMR. As described in further detail below, this improvement in NMR spectroscopic techniques is accomplished by i) increasing the resolution of key signals in the NMR spectrum and ii) eliminating the splitting of the key signals by an adjacent NMR active nucleus. These effects are accomplished by specifically isotopically labeling at least one of the amino acids utilized in the synthesis of the protein with only those atoms that the analyst wishes to detect in the NMR spectrum, so that all other atoms, including those adjacent to the key nuclei, are unlabeled. This approach is a departure from current NMR labeling techniques wherein the goal has been to prepare proteins in a universally labeled form.
Proteins containing specifically labeled amino acids can be chemically synthesized or expressed by bacteria, yeast or mammalian or insect cells or in cell-free systems, as described by Yokoyama et al. The labeled proteins preferably comprise at least about 50 amino acid residues. The compositions and methods of the invention may advantageously be employed in connection with proteins having molecular masses of at least about 5 kD.
If bacterial or yeast expression is desired, then the medium should contain all of the amino acids necessary for protein biosynthesis in the desired specifically labeled form to prevent non-specific labeling. Notwithstanding the provision of substantially all amino acids in specifically labeled form, isotope shuffling may still occur with bacteria or yeast grown in such a medium. Accordingly, proteins containing specifically isotopically labeled amino acids are preferably expressed either in a cell-free system or in mammalian or insect cells grown in a medium containing the amino acids required for protein biosynthesis. It is well known that nearly all naturally occurring amino acids cannot be synthesized by mammalian or insect cells; therefore, isotope shuffling will be at a minimum in such cells. The amino acid compositions for insect and mammalian cell culture media are well known. Such media are described in U.S. Pat. Nos. 5,393,669 and 5,627,044, the disclosures of which are incorporated herein by reference. Generally, all twenty essential amino acids are present in such media, and in accordance with the present invention, any or all such amino acids may be specifically isotopically labeled.
The labeled amino acids of the target protein are labeled at specific positions with any combination of the NMR isotopes 2H, 13C and 15N, such that only those atoms desired to be detectable in the spectrum are NMR active. It will be recognized by those skilled in the art that a key set of identifications required in elucidating protein structure by NMR is obtained from the assignment of signals from the backbone of the protein, i.e., in the signals between the α-carbon of a given amino acid and the amino protons of the same and adjacent residues in the protein sequence. Grzesiek, S. and Bax, A. J., J. Magn. Reson., vol. 96:432-440 (1992). In the Grzesiek et al. experiment (the “HNCA experiment”), less than optimal sensitivity and resolution were achieved due to the influence of neighboring atoms whose presence is not essential for background structural assignments, but which nevertheless were detected due to the universal labeling strategies employed. These complications are reduced by employing only specifically labeled amino acids in accordance with this invention.
In the instant invention, the amino acids of the target protein are advantageously labeled at the α-amino group with 15N and at the C-carbonyl and the α-carbon with 13C, while the side chains are left unlabeled. In this way, the signals from the C-carbonyl and the α-carbon are uncoupled from each other using conventional NMR techniques. Importantly, the signal from the α-carbon is not split into two parts by the adjacent β-carbon atom when that carbon is in the inactive, 12C form. This approach contrasts with the method described by Matsuo et al., J. Magn. Reson., 113, 91-96 (1996), which uses a selective radio-frequency field to decouple the β-carbon resonances. This method lacks generality, particularly for serine residues, where the α-carbon and the β-carbon resonances are insufficiently resolved.
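The effect of the labeling pattern on the α-carbon signal can be sketched by listing which directly bonded carbons are 13C, and therefore able to split the α-carbon signal, under each scheme. This is an illustrative model only; the `Residue` class, the scheme names "uniform" and "backbone", and the function name are assumptions introduced for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    name: str
    has_beta_carbon: bool  # False for glycine, which has no side-chain carbon


def calpha_13c_partners(residue: Residue, scheme: str) -> list:
    """List the directly bonded 13C neighbours of the alpha-carbon.

    scheme is "uniform" (every carbon is 13C) or "backbone"
    (only the carbonyl carbon and the alpha-carbon are 13C).
    """
    if scheme not in ("uniform", "backbone"):
        raise ValueError(f"unknown labeling scheme: {scheme}")
    partners = ["C'"]  # the carbonyl carbon is 13C in both schemes
    if scheme == "uniform" and residue.has_beta_carbon:
        partners.append("Cbeta")  # an active beta-carbon splits the signal
    return partners


if __name__ == "__main__":
    serine = Residue("Ser", has_beta_carbon=True)
    for scheme in ("uniform", "backbone"):
        print(scheme, calpha_13c_partners(serine, scheme))
```

With backbone-only labeling, the carbonyl carbon is the only remaining 13C partner of the α-carbon, and that coupling is the one removed by the conventional decoupling mentioned above.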
In a particularly preferred aspect of the invention, all of the amino acids of the target protein are not only labeled at the α-amino group with 15N and at the C-carbonyl and the α-carbon with 13C, but are also deuterated at the α-carbon, the side chains being left unlabeled. In this way, in addition to the above advantages, the linewidth of the signals from each α-carbon is significantly narrowed because the carbon nucleus is no longer efficiently relaxed by an attached proton. This decrease in linewidth significantly increases the resolution of the distinct signals from each amino acid residue (Grzesiek et al., J. Am. Chem. Soc., 115, 4369-4370 (1993)).
In a further preferred aspect of this invention, all of the amino acids of the target protein are not only labeled at the α-amino group with 15N and at the C-carbonyl and the α-carbon with 13C, but are also partially protonated at the α-carbon. This approach preserves the advantage of line-narrowing mentioned in the previous paragraph, as well as permits the application of experiments that involve protonation at the α-carbon. These experiments include those described for determining long-range structure in macromolecules, which experiments exploit the presence of residual dipolar couplings between atoms such as 13C and 1H in dilute liquid crystalline solutions (Tjandra and Bax, Science 278, 1111-1114 (1997)). The angular information derived from these experiments may be used for determining the structures of large proteins (>40 kDa). The present invention thus may be used in connection with these experiments to restrict the dipolar coupling information to N-H and Cα-H spin pairs, which greatly simplifies the relevant NMR spectra.
In this preferred aspect of the invention, the amino acids are deuterated at the α-carbon to a level of about 30-70% in a preferred embodiment, about 40-60% in a more preferred embodiment, and about 50% in a most preferred embodiment.
Amino acids have been chemically synthesized in unlabeled forms by various means, and some have been synthesized in specifically isotopically labeled forms. See, e.g., Martin, Isotopes Environ. Health Stud., 32:15 (1996); Schmidt, Isotopes Environ. Health Stud., 31:161 (1995). Ragnarsson et al., J. Chem. Soc. Perkin Trans 1, 2503 (1994) synthesized BOC labeled forms of the following amino acids: 1,2-13C2, 15N Ala, Phe, Leu, and Tyr; 1,2-13C2, 3,3,3-2H3, 15N Ala; 1,2-13C2, 3,3-2H2, 15N Phe; 3,3,3-2H3 Ala. Ragnarsson, J. Chem. Soc. Chem. Commun., 935 (1996) also synthesized BOC labeled 1,2-13C2, 2-2H, 15N Ala, Leu and Phe; and 1,2-13C2, 2,2-2H2, 15N Gly which were partly used for conformational studies of the pentapeptide, Leu-Enkephalin (Biopolymers, 41:591 (1997)). Unkefer (J. Lab. Cpd. Radiopharm., 38:239 (1996)) synthesized 15N labeled Ala, Val, Leu, and Phe as well as 1-13C, 15N Val. However, as noted above, mammalian cell media require all twenty amino acids for cell growth. In accordance with the present invention, methods for synthesizing all twenty amino acids in specifically labeled form and culture media containing all or any combination of such amino acids are provided.
Specifically isotopically labeled amino acids may be synthesized by asymmetric synthesis from an appropriately isotopically labeled precursor. Glycine, specifically labeled with any combination of 13C and 15N, is readily available commercially. Preferably, therefore, the amino acids are synthesized using glycine, isotopically labeled as required, as a precursor.
Methods for synthesizing amino acids from glycine have been described which may be used in accordance with the present invention (Duthaler, Tetrahedron, 50:1539 (1994); Schöllkopf, Topics Curr. Chem., 109:65 (1983); Oppolzer, Tett. Letts., 30:6009 (1989); Helvetica Chimica Acta, 77:2363 (1994); Helvetica Chimica Acta, 75:1965 (1992)).
In one aspect of the invention, 13C2, 15N-glycine is first esterified with a suitable alcohol, such as methanol, ethanol or isopropanol, to give the corresponding ester. 
The amino group of the glycine ester may be protected by procedures known in the art. See Green, Protective Groups in Organic Synthesis, Wiley, N.Y. (1991). Schiff bases (Stork, J. Org. Chem., 41:3491 (1976)) are preferred for protection with the diphenyl ketimine (O'Donnell, J. Org. Chem., 47:2663 (1982)) or bis(methylsulfanyl) imine (Hoppe, Liebigs Ann. Chem., 1979, 2066) being particularly preferred. Introducing the protecting group may be accomplished by reacting the glycine ester with the corresponding aryl imine for the diphenyl ketimine protecting group, or by reacting the glycine ester with carbon disulfide and methyl iodide for the bis(methylsulfanyl)imine protecting group.
As described above, in a particularly preferred aspect of the invention, the amino acids in the expression medium are deuterated at the α-carbon. If deuterated amino acids are required, then the doubly protected glycine derivative obtained above is deuterated at the α-carbon by treating it with a base in a deuteronic solvent, such as sodium carbonate in D2O (Ragnarsson, J. Chem. Soc. Chem. Commun., 935 (1996)). To minimize loss of material due to hydrolysis of the ester function, the deuteration is preferably accomplished by treating the doubly protected glycine derivative with a catalytic amount of sodium in an anhydrous deuteronic solvent such as deuteromethanol (MeOD) or deuteroethanol (EtOD).
The required backbone labeled amino acids can now be synthesized from the doubly protected glycine derivative or, preferably, its deuterated analogue, by introducing the characteristic sidechain in a stereospecific manner to preserve the L-configuration at the α-carbon chiral center. Methods for such chiral syntheses are known to those skilled in the art. They involve reacting the glycine derivative with a chiral molecule, called a “chiral auxiliary,” which directs the subsequent incorporation of the amino acid sidechain in a chiral manner (March, J., Advanced Organic Chemistry, 4th ed., Wiley, N.Y., p. 118, 1992).
In a particularly preferred aspect of the invention, the deuterated glycine analogue is converted to the chiral “sultam” derivative. See Oppolzer, J. Chem. Soc. Perkin 1: 2503 (1996). For example, methyl or ethyl N-[bis(methylthio)methylidene]glycinate or methyl or ethyl N-(diphenyl methylene)glycinate is treated with (2R)-bornane-10,2-sultam or (2S)-bornane-10,2-sultam in the presence of trimethylaluminum or triethylaluminum and a solvent (usually toluene). (2R)-Bornane-10,2-sultam, ethyl N-(diphenyl methylene)glycinate and trimethylaluminum are particularly preferred for forming the L-amino acids.
The resulting sultam derivative is then treated with a strong base such as lithium diisopropylamide (“LDA”) or n-butyl lithium, in an appropriate solvent such as tetrahydrofuran (“THF”), in the presence of a coordinating solvent such as hexamethylphosphoramide (“HMPA”) or N,N-dimethylpropyleneurea (“DMPU”) to give the resulting glycine derivative.
To prepare amino acids with simple alkyl sidechains, i.e., alanine, leucine, isoleucine, phenylalanine, methionine, and valine, the derivatized glycine molecule is treated with the appropriate alkyl halide to form the fully protected amino acid. For example, treating the derivatized glycine molecule with benzyl iodide leads to the formation of protected phenylalanine. A list of alkyl halides and corresponding amino acids is provided in Table 1.
The fully protected amino acid thus prepared may be unblocked by a variety of means. The preferred method is a simple two-step procedure consisting of treating the protected amino acid with aqueous acid to remove the imine protecting group, followed by treating the amino acid with an aqueous base to remove the sultam group. In principle, any combination of an aqueous acid and base can be employed, but dilute HCl followed by dilute LiOH is preferred. The liberated, specifically isotopically labeled amino acid may then be further purified by, for instance, ion exchange chromatography.
To prepare aspartic acid, glutamic acid, tyrosine, histidine and tryptophan, the functional groups present in the sidechains are advantageously protected prior to reaction with the derivatized glycine molecule. Preferably, the derivatized glycine molecule is treated with a previously protected alkyl halide. For example, aspartic and glutamic acid may be prepared via the commercially available tert-butyl bromoacetate (Oppolzer, Helvetica Chimica Acta, 77:2363 (1994)) and methyl acrylate (Schöllkopf, Synthesis, 737 (1986)), respectively. The alkyl ester protecting group is removed by treating the glycine anion with acid during the two-step unblocking procedure described above to give the desired amino acid.
Similarly, tyrosine may be prepared via the commercially available 4-benzyloxybenzyl or 4-methoxybenzyl chloride. The benzyl or methyl protecting group may be removed prior to the two-step unblocking procedure by, for instance, treating the derivatized glycine molecule with trimethyl silyl iodide in a suitable solvent such as dichloromethane.
Protected sidechain precursors for histidine and tryptophan may be prepared, for example, by the reaction shown in Table 2. For the preparation of the histidine precursor, commercially available 4-hydroxymethyl imidazole hydrochloride is protected at the ring amino nitrogen by a suitable protecting group such as t-boc, F-moc, tosyl, etc. The alcohol functional group of the protected molecule is then converted to a suitable leaving group, e.g., the corresponding halide such as bromide, by reacting the alcohol with a suitable brominating agent, such as free bromine, or triphenylphosphine and carbon tetrabromide, in a suitable solvent such as carbon tetrachloride. The protected bromomethylimidazole derivative may then be reacted directly with the derivatized glycine molecule.
Similarly, the required tryptophan precursor may be prepared from commercially available indole-3-carboxaldehyde via protection of the ring nitrogen with a suitable protecting group such as t-boc, F-moc, etc., followed by conversion to the corresponding alcohol by reduction with, for example, sodium borohydride in ethanol, and halogenation as described above. The protected bromomethylindole derivative may then be reacted directly with the derivatized glycine molecule. The production of these heterocyclic halides and corresponding amino acids is illustrated in Table 2.
Fully protected tryptophan and histidine may be unblocked by the simple two-step procedure described above, as t-boc, F-moc, or tosyl groups may be removed by the acid/base treatment. Again, in principle any combination of an aqueous acid or base can be employed. However, aqueous HCl followed by LiOH is preferred.
Specifically isotopically labeled asparagine and glutamine may be prepared respectively from labeled aspartic acid and glutamic acid prepared above using established techniques. For example, the techniques described in U.S. Pat. Nos. 5,393,669 and 5,627,044 may be used. Alternatively, asparagine and glutamine, and arginine and lysine, can be prepared by treating the derivatized glycine molecule with an alkyl halide carrying a terminal nitrile group. For example, treating the derivatized glycine molecule with 3-bromopropionitrile leads to the formation of the corresponding fully protected nitrile derivative. Following unblocking by the two-step acid/base treatment described above, the resulting amino acid nitriles are converted to the desired amino acids. For example, lysine may be formed by reacting 4-bromobutyronitrile with the derivatized glycine molecule and then reducing the resulting nitrile with a suitable reducing agent such as sodium borohydride and cobalt chloride.
A list of amino acids, corresponding halo-alkyl nitriles and methods for their conversion are provided in Table 3.
Preferably, arginine is prepared from the nitrile isolated from the two-step unblocking procedure by reducing the nitrile with sodium borohydride and cobalt chloride, followed by treating the resulting ornithine with O-methylisourea tosylate. The O-methylisourea tosylate compound is prepared from urea treated with methyl tosylate in the presence of basic copper II carbonate, followed by treatment with sodium sulfhydride (Kurtz, J. Biol. Chem., 180:1259 (1949)).
The remaining specifically isotopically labeled amino acids required for a specifically labeled mammalian or insect cell medium, i.e., serine, cysteine and threonine, may be prepared, for example, by the enzymatic procedures described in U.S. Pat. Nos. 5,393,669 and 5,627,044 and the references cited therein using 13C2, 15N glycine and/or 2H2, 13C, 15N glycine as a precursor.
The specifically isotopically labeled amino acids thus prepared may be incorporated into a mammalian or insect cell medium individually or in any combination so that the protein expressed by the cells growing in the medium may be specifically labeled at the amino acid residues of choice. The composition and use of such medium for bacterial, yeast, mammalian and insect cell lines are well known. The compositions described in U.S. Pat. No. 5,324,658 and in U.S. Pat. Nos. 5,393,669 and 5,627,044 may advantageously be used for the media of this invention.
NMR analysis of the specifically labeled protein thus produced may be used to interpret NMR data from the same protein separately obtained in universally labeled form and thereby expedite the determination of the structure of the protein. For instance, application of the HNCA experiment to a specifically labeled protein will enable the maximum sensitivity and resolution to be obtained for the determination of the protein backbone resonance assignments. The Cα resonance for each amino acid residue will exhibit a correlation with the amide nitrogen atom of the same residue via the one-bond Cα(i)-N(i) coupling, which is then transferred to the amide proton using another transfer via the one-bond N(i)-H(i) coupling. In addition, certain residues will exhibit a two-bond Cα(i-1)-(C(i-1))-N(i) correlation to the previous residue in such cases where this two-bond coupling is of sufficient magnitude. These latter data can be complemented by data from an experiment known as HN(CO)CA, which exhibits exclusively all such two-bond correlations due to transfer via the intervening carbonyl carbon. This latter experiment also shares the advantages gained by the HNCA experiments with respect to selective labeling. Hence, the HNCA and HN(CO)CA experiments combined can be used sequentially to assign the backbone resonances of proteins with high sensitivity, and with sufficient resolution to permit automated analysis with computational algorithms.
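The way in which HNCA and HN(CO)CA data may be combined for automated sequential assignment can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of any particular software: each amide N-H pair is treated as a "spin system" with a list of Cα shifts from HNCA and a single previous-residue Cα shift from HN(CO)CA. The class and function names, the matching tolerance of 0.15 ppm, and the chemical shift values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SpinSystem:
    label: str             # arbitrary name for one amide N-H pair
    hnca_ca: List[float]   # Calpha shifts (ppm) correlated in HNCA
    hncoca_ca: float       # Calpha(i-1) shift (ppm) from HN(CO)CA


def intra_ca(system: SpinSystem, tol: float = 0.15) -> Optional[float]:
    """Return Calpha(i): the HNCA peak that is not the HN(CO)CA peak."""
    own = [c for c in system.hnca_ca if abs(c - system.hncoca_ca) > tol]
    return own[0] if len(own) == 1 else None


def link(systems: List[SpinSystem], tol: float = 0.15) -> Dict[str, str]:
    """Map each spin system to its unique successor, where one exists.

    System b follows system a when b's Calpha(i-1) shift matches
    a's own Calpha(i) shift within the tolerance.
    """
    links = {}
    for a in systems:
        ca = intra_ca(a, tol)
        if ca is None:
            continue
        successors = [b.label for b in systems
                      if b is not a and abs(b.hncoca_ca - ca) <= tol]
        if len(successors) == 1:  # ambiguous matches are left unlinked
            links[a.label] = successors[0]
    return links


def chain(links: Dict[str, str], start: str) -> List[str]:
    """Follow successor links from a starting spin system."""
    order = [start]
    while order[-1] in links and links[order[-1]] not in order:
        order.append(links[order[-1]])
    return order


if __name__ == "__main__":
    # Hypothetical shifts, listed in scrambled order.
    systems = [
        SpinSystem("C", [45.3, 62.5], 62.5),
        SpinSystem("A", [58.2, 54.1], 54.1),
        SpinSystem("B", [62.5, 58.2], 58.2),
    ]
    print(chain(link(systems), "A"))
```

The fragments linked in this way would still have to be placed on the known amino acid sequence. The narrower, unsplit Cα signals obtained with backbone-only labeling are what would allow a tight matching tolerance in a procedure of this kind.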