For many years, there has been intense interest in the determination of the three-dimensional structure of biological macromolecules, particularly proteins. So-called "structure-function" studies have been carried out with a view to determining which structural features of a molecule, or class of molecules, are important for biological activity. Since the pioneering work of Nobel laureates, Perutz and coworkers on the structure of hemoglobin (Perutz, M. F. et al., Nature, 185, 416-422 (1960)) and Watson and Crick on the structure of DNA (Watson, J. D. and Crick, F. H. C., Nature, 171, 737 (1953)), this field has been of major importance in the biological sciences.
More recently, there has evolved the concept of "rational drug design." This strategy for the design of drugs involves the determination of the three-dimensional structure of an "active part" of a particular biological molecule, such as a protein. The biological molecule may, for example, be a receptor, an enzyme, a hormone, or other biologically active molecule. Knowing the three-dimensional structure of the active site can enable scientists to design molecules that will block, mimic or enhance the natural biological activity of the molecule. (Appelt, K., et al., J. Med. Chem., 34, 1925 (1991)). The determination of the three-dimensional structure of biological molecules is therefore also of great practical and commercial significance.
The first technique developed to determine three-dimensional structures was X-ray crystallography. The structures of hemoglobin and DNA were both determined using this technique. X-ray crystallography involves bombarding a crystal of the material to be examined with a beam of X-rays, which are diffracted by the atoms of the ordered molecules in the crystal. The scattered X-rays are captured on a photographic plate, which is then developed using standard techniques. The diffracted X-rays are thus visualized as a series of spots on the plate, and from this pattern, the structure of the molecules in the crystal can be determined. For larger molecules, it is also necessary to crystallize the material with a heavy ion, such as ruthenium, in order to remove ambiguity due to phase differences.
More recently, another technique, nuclear magnetic resonance ("NMR") spectroscopy, has been developed to determine the three-dimensional structures of biological molecules, and particularly proteins. NMR spectroscopy was originally developed in the 1950s and has evolved into a powerful procedure for analyzing the structure of small compounds, such as those with a molecular weight of ≤1000 daltons. Briefly, the technique involves placing the material (usually in a suitable solvent) in a powerful magnetic field and irradiating it with a strong radio signal. The nuclei of the various atoms will align themselves with the magnetic field until energized by the radio signal. They then absorb this energy and re-radiate (resonate) it at a frequency dependent on i) the type of nucleus and ii) the chemical environment (determined largely by bonding) of the nucleus. Moreover, resonances can be transmitted from one nucleus to another, either through bonds or through three-dimensional space, thus giving information about the environment of a particular nucleus and nuclei in the vicinity of it.
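The dependence of the resonance frequency on the type of nucleus can be illustrated with a minimal sketch. It assumes the standard Larmor relation (frequency equals the gyromagnetic ratio divided by 2*pi, multiplied by the field strength) and rounded textbook values for the gyromagnetic ratios; neither the constants nor the 11.74 tesla field strength comes from the text above, and the much smaller shifts caused by chemical environment are ignored.

```python
# Approximate gyromagnetic ratios divided by 2*pi, in MHz per tesla.
# Rounded reference values, assumed for illustration only.
GAMMA_BAR_MHZ_PER_T = {
    "1H": 42.577,
    "13C": 10.708,
    "15N": -4.316,  # negative sign reflects the sense of precession
}


def larmor_frequency_mhz(nucleus: str, field_tesla: float) -> float:
    """Return the magnitude of the resonance frequency in MHz."""
    return abs(GAMMA_BAR_MHZ_PER_T[nucleus]) * field_tesla


if __name__ == "__main__":
    field = 11.74  # tesla; hypothetical magnet strength
    for nucleus in GAMMA_BAR_MHZ_PER_T:
        freq = larmor_frequency_mhz(nucleus, field)
        print(f"{nucleus}: {freq:.1f} MHz")
```

In the same field each nuclear species resonates in its own, well-separated frequency range, which is why the spectrum of one nucleus can be recorded without interference from the others.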
However, it is important to recognize that not all nuclei are NMR active. Indeed, not all isotopes of the same element are active. For example, whereas "ordinary" hydrogen, .sup.1 H, is NMR active, heavy hydrogen (deuterium), .sup.2 H, is not. Thus, any material that normally contains .sup.1 H hydrogen can be rendered "invisible" in the hydrogen NMR spectrum by replacing all the .sup.1 H hydrogens with .sup.2 H. It is for this reason that NMR spectra of water-soluble materials are determined in solution in .sup.2 H.sub.2 O, so as to avoid the water signal.
Conversely, "ordinary" carbon, .sup.12 C, is NMR inactive whereas the stable isotope .sup.13 C, present to about 1% of total carbon in nature, is active. Similarly, "ordinary" nitrogen, .sup.14 N, is NMR inactive whereas the stable isotope .sup.15 N, present to only about 0.4% of total nitrogen in nature, is active. For small molecules, it was found that these low natural abundances were sufficient to generate the required experimental information, provided that the experiment was conducted with sufficient quantities of materials and for sufficient time.
As advances in hardware and software were made, the size of molecules that could be analyzed by these techniques increased to about 10,000 Daltons, the size of a small protein. The application of NMR spectroscopy to protein structural determinations therefore began only a few years ago. It was quickly realized that this size limit could be raised by substituting the NMR active stable isotopes .sup.15 N and .sup.13 C into the proteins in place of the NMR inactive isotopes .sup.14 N and .sup.12 C. A method of achieving this substitution was to grow microorganisms capable of producing the proteins in growth media labeled with these isotopes.
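One way to see why isotopic substitution helps is a rough probability sketch. It assumes that each carbon position is independently .sup.13 C with probability equal to the enrichment level, that natural abundance is about 1.1%, and that a labeled medium reaches a hypothetical 98% enrichment; these figures are illustrative assumptions, not values from the work described here.

```python
def pair_probability(enrichment: float) -> float:
    """Probability that two given carbon positions are both 13C,
    assuming each position is labeled independently."""
    return enrichment ** 2


NATURAL_13C = 0.011  # assumed natural abundance of 13C
LABELED_13C = 0.98   # hypothetical enrichment after growth on labeled medium

natural = pair_probability(NATURAL_13C)
labeled = pair_probability(LABELED_13C)

print(f"natural abundance pair probability: {natural:.6f}")
print(f"labeled pair probability:           {labeled:.4f}")
print(f"gain factor:                        {labeled / natural:.0f}")
```

Under these assumptions, an experiment that relies on two neighboring carbons both being NMR active sees several thousand times more usable molecules in the labeled sample than in the unlabeled one.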
Over the past two or three years, .sup.15 N-labeling and .sup.13 C-labeling of proteins have raised the analytical size limit to approximately 15 kD and 30 kD (40 kD projected), respectively. This isotopic substitution has been accomplished by growing a bacterium or yeast, transformed by genetic engineering to produce the protein of choice, in a growth medium containing .sup.13 C and/or .sup.15 N labeled substrates. In practice, these media usually consist of .sup.13 C labeled glucose and/or .sup.15 N labeled ammonium salts. (Kay, L. et al., Science, 249, 411 (1990) and references therein.) Recently, bacterial and yeast nutrient media containing labeled protein hydrolyzates have been described. See International Patent Application, publication no. WO 90/15525, published Dec. 27, 1990.
Heretofore, compositions and methods for NMR structural determinations have suffered from a significant limitation. Most proteins of interest in structure-function studies are mammalian in origin. Moreover, virtually all proteins of interest in rational human drug design are mammalian, i.e., human, in origin. Yet neither X-ray crystallography nor NMR spectroscopy has had widespread use in examining proteins produced from mammalian cells. X-ray crystallography, by definition, requires crystalline material, yet mammalian cell proteins are notoriously difficult to crystallize. To date, only a few antibodies and mammalian cell-derived receptors have been crystallized in a form suitable for crystallography. Those that have been crystallized have usually been selected fragments of a molecule. Information derived from molecular fragments is viewed with caution, as it is never known whether the structure of a fragment on its own is the same as the structure of that part within the whole molecule. Moreover, X-ray crystallography is inapplicable in those frequent instances in which crystalline material cannot be obtained.
NMR structural studies have hitherto been limited by the necessity of expressing the labeled proteins in bacteria or yeast. However, most mammalian proteins contain significant post-translational modifications that cannot be effected in bacterial and yeast systems. That is to say, they are appropriately folded and cross-linked with disulfide bridges, may have attached side chains of oligosaccharides and may be proteolytically cleaved to active forms. Bacterial or yeast-produced proteins frequently do not possess the biological activity of mammalian cell-produced proteins. Indeed, in some cases, mammalian proteins cannot be produced in bacteria at all. For these reasons, the biotechnology industry moved from bacterial expression systems to mammalian ones in the mid-1980s to produce recombinant therapeutic proteins, such as tissue plasminogen activator, Factor VIII:C, erythropoietin and the like. Parts of some mammalian cell proteins have been studied by NMR by cloning the gene for a fragment of the molecule of interest into a bacterium, and expressing the fragment in isotopically labeled form by growth of the bacterium in an isotopically labeled medium. Again, only those parts of a molecule of choice that can be expressed in bacteria have been susceptible to study in these systems (e.g. see Driscoll, P. C., et al., Nature, 353, Oct. 24, 1991). Because bacterial expression inherently lacks post-translational modifications such as glycosylation, the molecular parts examined have been produced without them, again leading to doubt as to the value of the structures obtained. As with X-ray crystallography, there have also been subsequent doubts as to the value of structural information obtained from protein fragments.
Host-vector systems utilizing both mammalian cells and insect cells have been developed. Mammalian cell lines, such as Chinese hamster ovary (CHO) cells and COS cells, and insect cell lines, such as the Spodoptera frugiperda cell lines SF9 and SF21 (Luckow, V. A. and Summers, M. D., Biotechnology, 6, 47-55 (1988)), have been found to produce recombinant mammalian proteins with post-translational modifications similar to those of the natural protein.
NMR studies on mammalian and insect cell-produced proteins have been of limited value, as no means of universally incorporating stable isotopes such as .sup.13 C or both .sup.13 C and .sup.15 N in a manner analogous to that used for bacteria has been available. Whereas bacteria can grow on a simple mixture of glucose and salts, mammalian and insect cells require, in addition to glucose, all of the amino acids essential for growth. For instance, for the successful production of a universally .sup.13 C and/or .sup.15 N labeled protein from mammalian cells, all of these amino acids would have to be present and all would have to be universally labeled with .sup.13 C and/or .sup.15 N.
One theoretical way of producing an isotopically labeled medium would be to use a simple hydrolysate of an isotopically labeled protein. Unfortunately, hydrolysis of proteins to the constituent amino acids also leads to the concomitant formation of side products that are toxic to mammalian cells. Use of unpurified hydrolysates has been found to lead to rapid death of the cells. Moreover, conventional hydrolysis procedures destroy certain essential amino acids, and available means for preventing such destruction often result in toxic effects. On the other hand, techniques for the isolation and purification of individual amino acids are known. For example, LeMaster and coworkers published (Anal. Biochem., 122, 238 (1982)) a paper describing the purification of .sup.2 H and .sup.15 N amino acids. No fewer than five column chromatographic steps were required, and even then these workers were unable to isolate fully labeled cysteine and glutamine, while yields of tryptophan were "erratic." All three of these amino acids are essential for the growth of most mammalian and insect cell lines used as host cells for production of recombinant proteins. Moreover, the procedure utilized piperidine as a prime eluant of the amino acids from the preparative chromatography columns. Piperidine has been reported to be a highly toxic, controlled substance.
The procedures for the purification of the individual amino acids are thus complicated, time-consuming and low-yield and hence are uneconomical. Consequently, while some .sup.13 C and/or .sup.15 N amino acids are commercially available, albeit only in small quantities and only on occasion, most are not.
Recently, Fesik and coworkers have described a method for the production of isotopically labeled proteins from mammalian cells for NMR structural studies (Biochemistry, 31 (51), 12713 (1992)). These workers hydrolyzed both isotopically labeled algal and bacterial proteins with methanesulfonic acid in the presence of tryptamine and imidazole. The purpose of the latter reagents was to serve as "suicide bases" to reduce the destruction of the amino acids tryptophan and histidine, respectively. The hydrolysate was then purified by the procedure described by LeMaster and coworkers; namely, by loading the hydrolysate onto a cation exchange column in the H+ form and eluting the amino acids, as a group, from the column with piperidine. The amino acid-containing fractions were combined, evaporated to dryness and redissolved in water; the pH was adjusted to 11.5 with sodium hydroxide, and the resulting solution was evaporated until the pH remained constant, "indicating that no more ammonia or piperidine was being removed." The amino acids were then filtered through a 500 molecular weight cutoff membrane to remove further impurities and lyophilized. The authors do not indicate whether the resulting amino acids were used directly (i.e. at high pH) or whether the solution was neutralized, and if so, with which acid. The Fesik et al. work, while representing a technological advance, nevertheless fails to provide a means for universally labeling mammalian cell expressed proteins useful for unambiguous NMR structural determinations. Firstly, the hydrolysis conditions employed destroy asparagine, glutamine and cysteine residues and leave just a "trace" of tryptophan (page 12715, Table 1). Secondly, the procedure employs piperidine as the eluant, which is, as noted above, a toxic and controlled substance. Thirdly, LeMaster reports in his original paper that one of the "suicide bases," imidazole, co-elutes with the amino acid leucine. LeMaster was able to remove the imidazole by crystallization of leucine. Fesik et al. do not describe such a crystallization step, and indeed, such a step would be impossible in the Fesik et al. procedure, where the individual amino acids are not resolved.
Fesik et al. describe the removal of the piperidine eluant by raising the pH of the solution to 11.5 and evaporating the solution with heat until the pH remained constant. At this pH, and particularly at the elevated temperatures necessary to remove piperidine (boiling point 106° C.), there is a risk of racemization and/or nucleophilic attack of the amino acids by the piperidine/sodium hydroxide mixture. Such reactions will reduce the amounts of viable amino acids in the mixture, reducing its efficiency as a growth medium. Moreover, as the authors themselves acknowledge, the heat evaporation step is stopped when a stable solution pH indicates "that no more ammonia or piperidine was being removed." It is therefore possible that the mixture of amino acids obtained will contain trace amounts of piperidine, a highly toxic material.
Of more significance, however, are the absence of the amino acids asparagine, glutamine and cysteine and the presence of just a "trace" of tryptophan (page 12715, Table 1). Although the lack of asparagine residues was found to be unimportant in the systems investigated by Fesik et al., glutamine was found to be vital for cell growth (page 12716, FIG. 2). The authors provide a method of enzymatically synthesizing glutamine from glutamic acid as a supplement. However, for this reaction to be of value, a source of appropriately labeled glutamic acid has to be available. As the authors note, .sup.13 C, .sup.15 N labeled glutamic acid is commercially available. However, .sup.2 H-labeled glutamic acid, for instance, is not. There is great interest in obtaining .sup.2 H-labeled proteins for ligand/receptor studies (see below).
By contrast, Fesik provides no method for the preparation of labeled cysteine. Cysteine labeled with a stable isotope has been commercially available only in .sup.15 N-labeled form. Consequently, the approach adopted by Fesik and coworkers will not lead to universally labeled products in any case, except for simple .sup.15 N-labeling, as the cysteine and tryptophan residues will not be appropriately labeled. It is possible, moreover, that isotopic leakage of undesired isotope will occur from the incorrectly labeled cysteine residues into other amino acid residues by cellular metabolism.
Similarly, the "trace" amounts of tryptophan present in the mixture were insufficient for cell growth without supplementation (page 12715, Table 1; page 12716, FIGS. 1 and 2). Although .sup.15 N-labeled tryptophan is commercially available, neither .sup.13 C, .sup.15 N- nor .sup.2 H-labeled tryptophan is available. Thus, introducing tryptophan as a supplement in any labeling experiment other than simple .sup.15 N-labeling will lead to the same problem associated with the absence of suitably labeled cysteine, namely incomplete isotopic labeling.
Thus, the method provided by Fesik and coworkers will lead to the universal isotopic labeling of proteins only in the case of .sup.15 N-labeled proteins. Although the method is an advance, .sup.15 N-labeled amino acids are already available, as previously indicated. In the case of .sup.13 C, .sup.15 N-labeling experiments, cysteine and tryptophan residues will not be universally labeled, while in the case of .sup.2 H-labeling experiments, cysteine, tryptophan and glutamine residues will be incorrectly labeled.
There has also recently been published a paper by Hsu and Armitage (Biochemistry, 31 (51) 12778 (1992)) concerning the NMR determination of the structure of the immunosuppressant drug cyclosporin A bound to its receptor, cyclophilin. These workers labeled cyclophilin, expressed in bacteria, with the NMR inactive isotope .sup.2 H. They were thus able to examine the structure of the cyclosporin A/cyclophilin complex unencumbered with the signals from the cyclophilin. Given the importance of mammalian ligand/receptor interactions, there is thus also a requirement for mammalian cell proteins, particularly receptors, to be universally labeled with .sup.2 H. Heretofore, labeled mammalian nutrient media for accomplishing this goal have been unavailable.
Accordingly, for both structure-function studies in general and for rational drug design in particular, there is a need for universally labeled compositions and methods for determining the three-dimensional structures of mammalian cell proteins, and protein complexes. There is consequently a need for producing mammalian cell proteins labeled with a range of stable isotopes in universally labeled form.