For many years, there has been intense interest in the determination of the three-dimensional structure of biological macromolecules, particularly proteins. So-called "structure-function" studies have been carried out with a view to determining which structural features of a molecule, or class of molecules, are important for biological activity. Since the pioneering work of the Nobel laureates Perutz and coworkers on the structure of hemoglobin (Perutz, M. F. et al., Nature, 185, 416-422 (1960)) and Watson and Crick on the structure of DNA (Watson, J. D. and Crick, F. H. C., Nature, 171, 737 (1953)), this field has been of major importance in the biological sciences.
More recently, there has evolved the concept of "rational drug design." This strategy for the design of drugs involves the determination of the three-dimensional structure of an "active part" of a particular biological molecule, such as a protein. The biological molecule may, for example, be a receptor, an enzyme, a hormone, or other biologically active molecule. Knowing the three-dimensional structure of the active site can enable scientists to design molecules that will block, mimic or enhance the natural biological activity of the molecule. (Appelt, K., et al., J. Med. Chem., 34, 1925 (1991)). The determination of the three-dimensional structure of biological molecules is therefore also of great practical and commercial significance.
The first technique developed to determine three-dimensional structures was X-ray crystallography. The structures of hemoglobin and DNA were both determined using this technique. X-ray crystallography involves bombarding a crystal of the material to be examined with a beam of X-rays, which are diffracted by the atoms of the ordered molecules in the crystal. The scattered X-rays are captured on a photographic plate, which is then developed using standard techniques. The diffracted X-rays are thus visualized as a series of spots on the plate, and from this pattern, the structure of the molecules in the crystal can be determined. For larger molecules, it is also necessary to crystallize the material with a heavy ion, such as ruthenium, in order to remove ambiguity due to phase differences.
More recently, another technique, nuclear magnetic resonance ("NMR") spectroscopy, has been developed to determine the three-dimensional structures of biological molecules, and particularly proteins. NMR spectroscopy was originally developed in the 1950s and has evolved into a powerful procedure for analyzing the structure of small compounds, such as those with a molecular weight of ≤1000 daltons. Briefly, the technique involves placing the material (usually in a suitable solvent) in a powerful magnetic field and irradiating it with a strong radio signal. The nuclei of the various atoms will align themselves with the magnetic field until energized by the radio signal. They then absorb this energy and re-radiate (resonate) it at a frequency dependent on i) the type of nucleus and ii) the chemical environment (determined largely by bonding) of the nucleus. Moreover, resonances can be transmitted from one nucleus to another, either through bonds or through three-dimensional space, thus giving information about the environment of a particular nucleus and nuclei in the vicinity of it.
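The dependence of the resonance frequency on the type of nucleus can be illustrated with a minimal Python sketch. It assumes the standard relation that the frequency equals the gyromagnetic ratio (divided by 2π) times the applied field, together with commonly tabulated ratios; neither the relation nor the values are given in the text above, and the function name and the 11.74 tesla field are illustrative choices only. Shifts due to chemical environment are far smaller (parts per million) and are ignored here.

```python
# Approximate gyromagnetic ratios divided by 2*pi, in MHz per tesla.
# These are tabulated reference values, not taken from the text (assumption).
GAMMA_BAR_MHZ_PER_T = {
    "1H": 42.577,
    "13C": 10.708,
    "15N": -4.316,  # negative sign: opposite sense of precession
}


def larmor_frequency_mhz(nucleus: str, field_tesla: float) -> float:
    """Return the magnitude of the resonance frequency in MHz."""
    return abs(GAMMA_BAR_MHZ_PER_T[nucleus]) * field_tesla


if __name__ == "__main__":
    field = 11.74  # tesla; a hypothetical magnet chosen for illustration
    for nucleus in GAMMA_BAR_MHZ_PER_T:
        print(f"{nucleus}: {larmor_frequency_mhz(nucleus, field):.1f} MHz")
```

Because each type of nucleus resonates in its own, well-separated frequency band at a given field, a spectrometer tuned to one nucleus does not observe the others.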
However, it is important to recognize that not all nuclei are NMR active. Indeed, not all isotopes of the same element are active. For example, whereas "ordinary" hydrogen, ¹H, is NMR active, heavy hydrogen (deuterium), ²H, is not. Thus, any material that normally contains ¹H hydrogen can be rendered "invisible" in the hydrogen NMR spectrum by replacing all the ¹H hydrogens with ²H. It is for this reason that NMR spectra of water-soluble materials are determined in solution in ²H₂O, so as to avoid the water signal.
Conversely, "ordinary" carbon, ¹²C, is NMR inactive whereas the stable isotope ¹³C, present at about 1% of total carbon in nature, is active. Similarly, "ordinary" nitrogen, ¹⁴N, is NMR inactive whereas the stable isotope ¹⁵N, present at about 0.4% of total nitrogen in nature, is active. For small molecules, it was found that these low natural abundances were sufficient to generate the required experimental information, provided that the experiment was conducted with sufficient quantities of materials and for sufficient time.
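The trade-off between isotope abundance and experiment time can be made concrete with a rough Python sketch. It assumes the usual rule of thumb that signal scales with the number of observable nuclei while noise averages down as the square root of the number of scans; this rule, the function name, and the abundance value passed in are assumptions for illustration and are not taken from the text.

```python
def relative_acquisition_time(abundance: float,
                              reference_abundance: float = 1.0) -> float:
    """Factor by which acquisition time must grow to match the
    signal-to-noise of a sample at `reference_abundance`, all else equal.

    Assumes S/N ~ abundance * sqrt(number of scans) (rule of thumb).
    """
    if not 0.0 < abundance <= 1.0 or not 0.0 < reference_abundance <= 1.0:
        raise ValueError("abundances must be fractions in (0, 1]")
    # Equal S/N requires scans proportional to 1 / abundance**2.
    return (reference_abundance / abundance) ** 2


if __name__ == "__main__":
    # Roughly 1% natural abundance compared with a fully enriched sample.
    print(round(relative_acquisition_time(0.01)))
```

Under these assumptions, an isotope at roughly 1% abundance needs on the order of ten thousand times the acquisition time of a fully enriched sample at the same concentration, which is tolerable for concentrated small molecules but quickly becomes impractical for proteins.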
As advances in hardware and software were made, the size of molecules that could be analyzed by these techniques increased to about 10,000 daltons, the size of a small protein. The application of NMR spectroscopy to protein structural determinations therefore began only a few years ago. It was quickly realized that this size limit could be raised by substituting the NMR active stable isotopes ¹⁵N and ¹³C into the proteins in place of the NMR inactive isotopes ¹⁴N and ¹²C. A method of achieving this substitution was to grow microorganisms capable of producing the proteins in growth media labeled with these isotopes.
Over the past two or three years, ¹⁵N-labeling and ¹³C-labeling of proteins have raised the analytical size limit to approximately 15 kd and 25 kd respectively. This isotopic substitution has been accomplished by growing a bacterium or yeast, transformed by genetic engineering to produce the protein of choice, in a growth medium containing ¹³C and/or ¹⁵N labeled substrates. In practice, these media usually consist of ¹³C labeled glucose and/or ¹⁵N labeled ammonium salts. (Kay, L. et al., Science, 249, 411 (1990) and references therein.) Recently, bacterial and yeast nutrient media containing labeled protein hydrolyzates have been described. See International Patent Application, publication no. WO 90/15525, published Dec. 27, 1990.
While ¹³C and ¹⁵N labeling has enabled NMR structural determinations for proteins substantially larger than those previously amenable to such techniques, proteins larger than about 25 kd present ambiguous results. At this size, many of the resonances from the individual atoms become too broad to resolve. In a recent publication, it has been reported that triple-labeling, i.e., the partial incorporation of deuterium, ²H, as well as ¹³C and ¹⁵N isotopes, significantly narrowed the otherwise broadened lines in a larger molecule. Bax, J. Am. Chem. Soc., 115:4369 (1993). Triple-labeled media are therefore preferred for the preparation of labeled forms of proteins larger than about 25 kd for NMR structural determinations. For bacterial proteins, partial ²H-labeling can be achieved by culturing the bacteria in the presence of a mixture of H₂O and ²H₂O. This approach is unsatisfactory, however, for the production of suitably labeled mammalian proteins.
Heretofore, compositions and methods for NMR structural determinations have suffered from a significant limitation. Most proteins of interest in structure-function studies are mammalian in origin. Moreover, virtually all proteins of interest in rational human drug design are mammalian, i.e., human, in origin. Yet neither X-ray crystallography nor NMR spectroscopy has had widespread use in examining proteins produced from mammalian cells. X-ray crystallography, by definition, requires crystalline material, yet mammalian cell proteins are notoriously difficult to crystallize. To date, only a few antibodies and mammalian cell-derived receptors have been crystallized in a form suitable for crystallography. Those that have been crystallized have usually been selected fragments of a molecule. Information derived from molecular fragments is viewed with caution, as it is never known whether an isolated fragment adopts the same structure as the corresponding part of the whole molecule. Moreover, X-ray crystallography is inapplicable in those frequent instances in which crystalline material cannot be obtained.
NMR structural studies have hitherto been limited by the necessity of expressing the labeled proteins in bacteria or yeast. However, most mammalian proteins contain significant post-translational modifications that cannot be effected in bacterial and yeast systems. That is to say, they are appropriately folded and cross-linked with disulfide bridges, may have attached side chains of oligosaccharides and may be proteolytically cleaved to active forms. Bacterial or yeast-produced proteins frequently do not possess the biological activity of mammalian cell-produced proteins. Indeed, in some cases, mammalian proteins cannot be produced in bacteria at all. For these reasons, the biotechnology industry moved from bacterial expression systems to mammalian ones in the mid-1980s to produce recombinant therapeutic proteins, such as tissue plasminogen activator, Factor VIII:C, erythropoietin and the like. Parts of some mammalian cell proteins have been studied by NMR by cloning the gene for a fragment of the molecule of interest into a bacterium, and expressing the fragment in isotopically labeled form by growth of the bacterium in an isotopically labeled medium. Again, only those parts of a molecule of choice that can be expressed in bacteria have been susceptible to study in these systems (e.g., see Driscoll, P. C., et al., Nature, 353, Oct. 24, 1991). Because bacterial expression inherently lacks post-translational modification, the molecular parts examined have been produced without modifications such as glycosylation, again leading to doubt as to the value of the structures obtained. As with X-ray crystallography, there have also been subsequent doubts as to the value of structural information obtained from protein fragments.
Host-vector systems utilizing both mammalian cells and insect cells have been developed. Mammalian cell lines, such as Chinese hamster ovary (CHO) cells and COS cells, and insect cell lines, such as the Spodoptera frugiperda cell lines SF9 and SF21 (Luckow, V. A. and Summers, M. D., Biotechnology, 6, 47-55 (1988)), have been found to produce recombinant mammalian proteins with post-translational modifications similar to those of the natural protein.
NMR studies on mammalian and insect cell-produced proteins have been of limited value, as no means of universally incorporating stable isotopes such as ¹³C or both ¹³C and ¹⁵N in a manner analogous to that for bacteria has been available. Whereas bacteria can grow on a simple mixture of glucose and salts, mammalian and insect cells require, in addition to glucose, all of the amino acids essential for growth. For instance, for the successful production of a universally ¹³C and/or ¹⁵N labeled protein from mammalian cells, all of these amino acids would have to be present and all would have to be universally labeled with ¹³C and/or ¹⁵N.
One theoretical way of producing an isotopically labeled medium would be to use a simple hydrolysate of an isotopically labeled protein. Unfortunately, hydrolysis of proteins to the constituent amino acids also leads to the concomitant formation of side products that are toxic to mammalian cells. Use of unpurified hydrolysates has been found to lead to rapid death of the cells. Moreover, conventional hydrolysis procedures destroy certain essential amino acids, and available means for preventing such destruction often result in toxic effects. On the other hand, techniques for the isolation and purification of individual amino acids are known. For example, LeMaster and coworkers published (Anal. Biochem., 122, 238 (1982)) a paper describing the purification of ²H and ¹⁵N amino acids. No fewer than five column chromatographic steps were required, and even then these workers were unable to isolate fully labeled cysteine and glutamine, while yields of tryptophan were "erratic." All three of these amino acids are essential for the growth of most mammalian and insect cell lines used as host cells for production of recombinant proteins. Moreover, the procedure utilized piperidine as a prime eluant of the amino acids from the preparative chromatography columns. Piperidine has been reported to be a highly toxic, controlled substance.
The procedures for the purification of the individual amino acids are thus complicated, time-consuming and low-yield and hence are uneconomical. Consequently, while some ¹³C and/or ¹⁵N amino acids are commercially available, albeit only in small quantities and only on occasion, most are not.
Recently, Fesik and coworkers have described a method for the production of isotopically labeled proteins from mammalian cells for NMR structural studies (Biochemistry, 31, no. 51, 12713 (1992)). These workers hydrolyzed both isotopically labeled algal and bacterial proteins with methanesulfonic acid in the presence of tryptamine and imidazole. The purpose of the latter reagents was to serve as "suicide bases" to reduce the destruction of the amino acids tryptophan and histidine respectively. The hydrolysate was then purified by the procedure described by LeMaster and coworkers; namely, by loading the hydrolysate onto a cation exchange column in the H⁺ form and eluting the amino acids, as a group, from the column with piperidine. The amino acid-containing fractions were combined, evaporated to dryness, redissolved in water, the pH adjusted to 11.5 with sodium hydroxide, and the resulting solution evaporated until the pH remained constant, "indicating that no more ammonia or piperidine was being removed." The amino acids were then filtered through a 500 molecular weight cutoff membrane to remove further impurities and lyophilized. The authors do not indicate whether the resulting amino acids were used directly (i.e., at high pH) or whether the pH of the solution was neutralized, and if so, with which acid.
The Fesik et al. work, while representing a technological advance, nevertheless fails to provide a means for universally labeling mammalian cell expressed proteins useful for unambiguous NMR structural determinations. Firstly, the hydrolysis conditions employed destroy asparagine, glutamine and cysteine residues and leave just a "trace" of tryptophan (page 12715, Table 1). Secondly, the procedure employs piperidine as the eluant which is, as noted above, a toxic and controlled substance. Thirdly, LeMaster reports in his original paper that one of the "suicide bases," imidazole, co-elutes with the amino acid leucine. LeMaster was able to remove the imidazole by crystallization of leucine. Fesik et al. do not describe such a crystallization step, and indeed, such a step would be impossible in the Fesik et al. procedure, where the individual amino acids are not resolved.
Fesik et al. describe the removal of the piperidine eluant by raising the pH of the solution to 11.5 and heat evaporating the solution until the pH remained constant. At this pH, and particularly at the elevated temperatures necessary to remove piperidine (boiling point 106 °C), there is a risk of racemization and/or nucleophilic attack of the amino acids by the piperidine/sodium hydroxide mixture. Such reactions will reduce the amounts of viable amino acids in the mixture, reducing its efficiency as a growth medium. Moreover, as the authors themselves acknowledge, the heat evaporation step is stopped when a stable solution pH indicates "that no more ammonia or piperidine was being removed." It is therefore possible that the mixture of amino acids obtained will contain trace amounts of piperidine, a highly toxic material.
Of more significance, however, are the absence of the amino acids asparagine, glutamine and cysteine and the presence of just a "trace" of tryptophan (page 12715, Table 1). Although the lack of asparagine residues was found to be unimportant in the systems investigated by Fesik et al., glutamine was found to be vital for cell growth (page 12716, FIG. 2). The authors provide a method of enzymatically synthesizing glutamine from glutamic acid as a supplement. However, for this reaction to be of value, a source of appropriately labeled glutamic acid has to be available. As the authors note, ¹³C, ¹⁵N labeled glutamic acid is commercially available. However, triple-labeled glutamic acid, for instance, is not.
By contrast, Fesik provides no method for the preparation of labeled cysteine. Cysteine labeled with a stable isotope has been commercially available only in ¹⁵N-labeled form. Neither double-labeled cysteine nor triple-labeled ²H, ¹³C, ¹⁵N-cysteine has heretofore been available. Consequently, the approach adopted by Fesik and coworkers will not lead to universally labeled products in any case, except for simple ¹⁵N-labeling, as the cysteine and tryptophan residues will not be appropriately labeled. It is possible, moreover, that isotopic leakage of undesired isotope will occur from the incorrectly labeled cysteine residues into other amino acid residues by cellular metabolism.
In principle, the simplest way to produce labeled, including triple-labeled, cysteine is to culture an organism which is rich in cysteine in the appropriately labeled medium, and to isolate the cysteine from the proteins of that organism. Such organisms will be familiar to those skilled in the art, and include purple sulphur bacteria such as Rhodopseudomonas sphaeroides and capsulata, other cysteine-rich organisms such as Leptothrix discophora and Schizophyllum commune, and bacteria engineered to produce cysteine-rich proteins such as ATCC 31448, an E. coli engineered to express human insulin A chain.
The cysteine would then be isolated from the protein by hydrolysis. However, the only known ways to hydrolyze proteins without concomitant destruction of cysteine are i) hydrolysis under alkaline conditions (See Okuda, Pr. Acad. Tokyo, 2, 277) and ii) enzymatic hydrolysis. Unfortunately, both of these procedures are unsuitable for the production of labeled, and in particular triple-labeled, cysteine. Hydrolysis under alkaline conditions can lead to racemization of the required L-cysteine, and also to destruction of several other valuable isotopically labeled amino acids. Enzymatic hydrolysis carries the risk of isotopic contamination from enzyme breakdown products, especially if prolonged hydrolysis times are required, as is usually the case.
As with cysteine, the "trace" amounts of tryptophan present in the mixture employed by Fesik et al. were insufficient for cell growth without supplementation (page 12715, Table 1; page 12716, FIGS. 1 and 2). Although ¹⁵N-labeled tryptophan is commercially available, neither ¹³C, ¹⁵N nor triple-labeled tryptophan is available. Thus, introducing tryptophan as a supplement in any labeling experiment other than simple ¹⁵N-labeling will lead to the same problem as the absence of suitably labeled cysteine, namely incomplete isotopic labeling.
Thus, the method provided by Fesik and coworkers will lead to the universal isotopic labeling of proteins only in the case of ¹⁵N-labeled proteins. Although the method is an advance, ¹⁵N-labeled amino acids are already available, as previously indicated. In the case of ¹³C, ¹⁵N-labeling experiments, cysteine and tryptophan residues will not be universally labeled, while in the case of ²H-labeling experiments and triple labeling experiments, cysteine, tryptophan and glutamine residues will be incorrectly labeled.
A further disadvantage of the use of protein hydrolysis procedures for preparing culture media for mammalian cells is that, under hydrolysis conditions, the amino acids, glutamine and asparagine, present in the starting protein, may be converted to glutamic acid and aspartic acid, respectively. Most mammalian cell media contain small quantities of glutamic acid and substantial quantities of glutamine. These media have been developed to produce optimum performance of mammalian cell lines in terms of cell viability and production. Indeed, nearly all mammalian cell lines producing proteins of interest for NMR analysis will have been conditioned in media containing low levels of glutamic acid and high levels of glutamine.
To be applicable for use with the widest possible range of cell lines, therefore, media for the isotopic labeling of mammalian cell proteins advantageously contain small proportions of glutamic acid and larger proportions of glutamine. Similarly, media preferably contain little or no aspartic acid and larger proportions of asparagine.
There is therefore a need for methods to remove the glutamic acid and/or aspartic acid specifically from a mixture of amino acids without altering the proportions of the other amino acids present, convert the thus isolated glutamic acid and/or aspartic acid to the appropriately labeled glutamine and/or asparagine, and supplement the mixture of amino acids with the glutamine and/or asparagine thus obtained.
No procedures for the specific removal of glutamic acid from mixtures of amino acids have been described in the art. Enzymatic techniques for the conversion of glutamic acid to glutamine have been published (Fesik, et al., Biochemistry, 31(51), 12713 (1992)). Unfortunately, the reactions are slow (3-4 days) and the accompanying breakdown of the enzyme, leading to contamination with natural abundance amino acids, cannot be ruled out. Moreover, in the case of triple-labeled mixtures of amino acids, i.e., those partially labeled with ²H as well as universally labeled with ¹³C and ¹⁵N, the presence of the ²H atoms would be expected to slow the enzymatic conversion of glutamic acid labeled with ²H to glutamine still further, due to the isotope effect of ²H. Finally, no enzymatic procedure for the conversion of aspartic acid to asparagine has been described in the art.
There has also recently been published a paper by Hsu and Armitage (Biochemistry, 31 (51) 12778 (1992)) concerning the NMR determination of the structure of the immunosuppressant drug cyclosporin A bound to its receptor, cyclophilin. These workers labeled cyclophilin, expressed in bacteria, with the NMR inactive isotope ²H. They were thus able to examine the structure of the cyclosporin A/cyclophilin complex unencumbered by the signals from the cyclophilin. Given the importance of mammalian ligand/receptor interactions, there is thus also a requirement for mammalian cell proteins, particularly receptors, to be universally labeled with ²H. Heretofore, labeled mammalian nutrient media for accomplishing this goal have been unavailable.
Accordingly, for both structure-function studies in general and for rational drug design in particular, there is a need for universally labeled compositions and methods for determining the three-dimensional structures of mammalian cell proteins, and protein complexes. There is consequently a need for producing mammalian cell proteins labeled with a range of stable isotopes in universally labeled form.