I. Ribosomes: Structure, Function, and Composition
Ribosomes are ribonucleoproteins which are present in both prokaryotes and eukaryotes. They comprise about two-thirds RNA and one-third protein. Ribosomes are the cellular organelles responsible for protein synthesis. During gene expression, ribosomes translate the genetic information encoded in a messenger RNA into protein (Garrett et al. (2000) “The Ribosome: Structure, Function, Antibiotics and Cellular Interactions,” American Society for Microbiology, Washington, D.C.).
Ribosomes comprise two nonequivalent ribonucleoprotein subunits. The larger subunit (also known as the “large ribosomal subunit”) is about twice the size of the smaller subunit (also known as the “small ribosomal subunit”). The small ribosomal subunit binds messenger RNA (mRNA) and mediates the interactions between mRNA and transfer RNA (tRNA) anticodons on which the fidelity of translation depends. The large ribosomal subunit catalyzes peptide bond formation—the peptidyl-transferase reaction of protein synthesis—and includes (at least) two different tRNA binding sites: the A-site which accommodates the incoming aminoacyl-tRNA, which is to contribute its amino acid to the growing peptide chain, and the P-site which accommodates the peptidyl-tRNA complex, i.e., the tRNA linked to all the amino acids that have so far been added to the peptide chain. The large ribosomal subunit also includes one or more binding sites for G-protein factors that assist in the initiation, elongation, and termination phases of protein synthesis. The large and small ribosomal subunits behave independently during the initiation phase of protein synthesis; however, they assemble into complete ribosomes when elongation is about to begin.
The molecular weight of the prokaryotic ribosome is about 2.6×10^6 daltons. In prokaryotes, the small ribosomal subunit contains a 16S (Svedberg units) ribosomal RNA (rRNA) having a molecular weight of about 5.0×10^5 daltons. The large ribosomal subunit contains a 23S rRNA having a molecular weight of about 1.0×10^6 daltons and a 5S rRNA having a molecular weight of about 4.0×10^4 daltons. The prokaryotic small subunit contains about 20 different proteins and its large subunit contains about 35 proteins. The large and small ribosomal subunits together constitute a 70S ribosome in prokaryotes.
Eukaryotic ribosomes generally are bigger than their prokaryotic counterparts. In eukaryotes, the large and small subunits together make an 80S ribosome. The small subunit of a eukaryotic ribosome includes a single 18S rRNA, while the large subunit includes a 5S rRNA, a 5.8S rRNA, and a 28S rRNA. The 5.8S rRNA is structurally related to the 5′ end of the prokaryotic 23S rRNA, and the 28S rRNA is structurally related to the remainder of the prokaryotic 23S rRNA (Moore (1998) Annu. Rev. Biophys. 27: 35-58). Eukaryotic ribosomal proteins are qualitatively similar to the prokaryotic ribosomal proteins; however, the eukaryotic proteins are bigger and more numerous (Moore (1998) supra).
II. Structural Conservation of the Large Ribosomal Subunit
While the chemical composition of large ribosomal subunits varies from species to species, the sequences of their components provide unambiguous evidence that they are similar in three-dimensional structure, function in a similar manner, and are related evolutionarily. The evolutionary implications of the available rRNA sequence data are reviewed in the articles of Woese and others in part II of Ribosomal RNA, Structure, Evolution, Processing and Function in Protein Biosynthesis, (Zimmermann and Dahlberg, eds., CRC Press, Boca Raton, Fla., 1996). The article by Garrett and Rodriguez-Fonseca in part IV of the same volume discusses the unusually high level of sequence conservation observed in the peptidyl transferase region of the large ribosomal subunit. The ribosomes of archaeal species like Haloarcula marismortui resemble those obtained from eubacterial species like E. coli in size and complexity. However, the proteins in H. marismortui ribosomes are more closely related to the ribosomal proteins found in eukaryotes (Wool et al. (1995) Biochem. Cell Biol. 73: 933-947).
III. Determination of the Structure of Ribosomes
Much of what is known about ribosome structure is derived from physical and chemical methods that produce relatively low-resolution information. Electron microscopy (EM) has contributed to an understanding of ribosome structure ever since the ribosome was discovered. In the 1970s, low resolution EM revealed the shape and quaternary organization of the ribosome. By the end of the 1980s, the positions of the surface epitopes of all the proteins in the E. coli small subunit, as well as many in the large subunit, had been mapped using immunoelectron microscopy techniques (Oakes et al. (1986), Structure, Function and Genetics of Ribosomes, (Hardesty, B. and Kramer, G., eds.) Springer-Verlag, New York, N.Y., pp. 47-67; Stoeffler et al. (1986), Structure, Function and Genetics of Ribosomes, (Hardesty, B. and Kramer, G., eds.) Springer-Verlag, New York, N.Y., pp. 28-46). In the last few years, advances in single-particle cryo-EM and image reconstruction have led to three-dimensional reconstructions of the E. coli 70S ribosome and its complexes with tRNAs and elongation factors to resolutions of between 15 Å and 25 Å (Stark et al. (1995) Structure 3: 815-821; Stark et al. (1997) Nature 389: 403-406; Agrawal et al. (1996) Science 271: 1000-1002; Stark et al. (1997) Cell 88: 19-28). Additionally, three-dimensional EM images of the ribosome have been produced at resolutions sufficiently high so that many of the proteins and nucleic acids that assist in protein synthesis can be visualized bound to the ribosome. An approximate model of the RNA structure in the large subunit has been constructed to fit a 7.5 Å resolution electron microscopic map of the 50S subunit from E. coli and available biochemical data (Mueller et al. (2000) J. Mol. Biol. 298: 35-59).
While the insights provided by EM have been useful, it has long been recognized that a full understanding of ribosome structure would derive only from X-ray crystallography. In 1979, Yonath and Wittmann obtained the first potentially useful crystals of ribosomes and ribosomal subunits (Yonath et al. (1980) Biochem. Internat. 1: 428-435). By the mid 1980s, scientists were preparing ribosome crystals for X-ray crystallography (Maskowski et al. (1987) J. Mol. Biol. 193: 818-822). The first crystals of the 50S ribosomal subunit from H. marismortui were obtained in 1987. In 1991, improvements were reported in the resolution of the diffraction data obtainable from the crystals of the 50S ribosomal subunit of H. marismortui (van Bohlen, K. (1991) J. Mol. Biol. 222: 11).
In 1995, low resolution electron density maps for the large and small ribosomal subunits from halophilic and thermophilic sources were reported (Schlunzen et al. (1995) Biochem. Cell Biol. 73: 739-749). However, these low resolution electron density maps proved to be spurious (Ban et al. (1998) Cell 93: 1105-1115).
The first electron density map of the ribosome that showed features recognizable as duplex RNA was a 9 Å resolution X-ray crystallographic map of the large subunit from Haloarcula marismortui (Ban et al. (1998) supra). Extension of the phasing of that map to 5 Å resolution made it possible to locate several proteins and nucleic acid sequences, the structures of which had been determined independently (Ban et al. (1999) Nature 400: 841-847).
At about the same time, using similar crystallographic strategies, a 7.8 Å resolution map was generated of the entire Thermus thermophilus ribosome showing the positions of tRNA molecules bound to its A-, P-, and E- (exit) sites (Cate et al. (1999) Science 285: 2095-2104), and a 5.5 Å resolution map of the 30S subunit from T. thermophilus was obtained that allowed the fitting of solved protein structures and the interpretation of some of its RNA features (Clemons, Jr. et al. (1999) Nature 400: 833-840). Subsequently, a 4.5 Å resolution map of the T. thermophilus 30S subunit was published, which was based in part on phases calculated from a model corresponding to 28% of the subunit mass that had been obtained using a 6 Å resolution experimental map (Tocilj et al. (1999) Proc. Natl. Acad. Sci. USA 96: 14252-14257).
IV. Location of the Peptidyl Transferase Site in the Large Ribosomal Subunit
It has been known for about 35 years that the peptidyl transferase activity responsible for the peptide bond formation that occurs during messenger RNA-directed protein synthesis is intrinsic to the large ribosomal subunit (Traut et al. (1964) J. Mol. Biol. 10: 63; Rychlik (1966) Biochim. Biophys. Acta 114: 425; Monro (1967) J. Mol. Biol. 26: 147-15; Maden et al. (1968) J. Mol. Biol. 35: 333-345) and it has been understood for even longer that the ribosome contains proteins as well as RNA. In certain species of bacteria, for example, the large ribosomal subunit contains about 35 different proteins and two RNAs (Noller (1984) Ann. Rev. Biochem. 53: 119-162; Wittmann-Liebold et al. (1990) The Ribosome: Structure, Function, and Evolution, (W. E. Hill et al., eds.) American Society for Microbiology, Washington, D.C. (1990), pp. 598-616). These findings posed three related questions: which of the almost 40 macromolecular components of the large ribosomal subunit contribute to its peptidyl transferase site, where is that site located in the large subunit, and how does it work?
By 1980, the list of components that might be part of the ribosome's peptidyl transferase had been reduced to about half a dozen proteins and 23S rRNA (see Cooperman (1980) Ribosomes: Structure, Function, and Genetics, (G. Chambliss et al., eds.) University Park Press, Baltimore, Md. (1980), 531-554), and following the discovery of catalytic RNAs (Guerrier-Takada et al. (1983) Cell 35: 849-857; Kruger et al. (1982) Cell 31: 147-157), the hypothesis that 23S rRNA might be its sole constituent, which had been proposed years earlier, began to gain favor. In 1984, Noller and colleagues published affinity labeling results which showed that U2619 and U2620 (in E. coli: U2584, U2585) are adjacent to the CCA-end of P-site-bound tRNA (Barta et al. (1984) Proc. Nat. Acad. Sci. USA 81: 3607-3611; Vester et al. (1988) EMBO J. 7: 3577-3587). These nucleotides appear to be part of a highly conserved internal loop in the center of domain V of 23S rRNA. The hypothesis that this loop is intimately involved in the peptidyl transferase activity was supported by the observation that mutations in that loop render cells resistant to many inhibitors of peptidyl transferase, and evidence implicating it in this activity has continued to mount (see, Noller (1991) Ann. Rev. Biochem. 60: 191-227; Garrett et al. (1996) Ribosomal RNA: Structure, Evolution, Processing and Function in Protein Biosynthesis, (R. A. Zimmermann and A. E. Dahlberg, eds.) CRC Press, Boca Raton, Fla. (1996), pp. 327-355).
Definitive proof that the central loop in domain V is the sole component of the ribosome involved in the peptidyl transferase activity has remained elusive, however. Studies have shown that it was possible to prepare particles that retained peptidyl transferase activity by increasingly vigorous deproteinizations of large ribosomal subunits; however, it was not possible to produce active particles that were completely protein-free. Nevertheless, combined with earlier reconstitution results (Franceschi et al. (1990) J. Biol. Chem. 265: 6676-6682), this work reduced the number of proteins that might be involved to just two: L2 and L3 (see, Green et al. (1997) Annu. Rev. Biochem. 66: 679-716). More recently, Watanabe and coworkers reported success in eliciting peptidyl transferase activity from in vitro synthesized, protein-free 23S rRNA (Nitta et al. (1998) RNA 4: 257-267); however, their observations appear not to have withstood further scrutiny. Thus the question still remained: is the ribosome a ribozyme or is it not?
Over the years, the location of the peptidyl transferase site in the ribosome has been approached almost exclusively by electron microscopy. In the mid-1980s, evidence began to accumulate that there is a tunnel running through the large ribosomal subunit from the middle of its subunit interface side to its back (Milligan et al. (1986) Nature 319: 693-695; Yonath et al. (1987) Science 236: 813-816), and there has been strong reason to believe that polypeptides pass through it as they are synthesized (Bernabeu et al. (1982) Proc. Nat. Acad. Sci. USA 79: 3111-3115; Ryabova et al. (1988) FEBS Letters 226: 255-260; Beckmann et al. (1997) Science 278: 2123-2126). More recent cryo-EM investigations (Frank et al. (1995) Nature 376: 441-444; Frank et al. (1995) Biochem. Cell Biol. 73: 757-765; Stark et al. (1995) supra) confirmed the existence of the tunnel and demonstrated that the CCA-ends of tRNAs bound to the ribosomal A- and P-sites are found in the subunit interface end of the tunnel. Consequently, the peptidyl transferase site must be located at that same position, which is at the bottom of a deep cleft in the center of the subunit interface surface of the large subunit, immediately below its central protuberance.
The substrates of the reaction catalyzed at the peptidyl transferase site of the large subunit are an aminoacyl-tRNA (aa-tRNA) and a peptidyl-tRNA. The former binds in the ribosome's A-site and the latter in its P-site. The α-amino group of the aa-tRNA attacks the carbon of the carbonyl acylating the 3′ hydroxyl group of the peptidyl-tRNA, and a tetrahedral intermediate is formed at the carbonyl carbon. The tetrahedral intermediate resolves to yield a peptide extended by one amino acid esterified to the A-site bound tRNA and a deacylated tRNA in the P-site.
This reaction scheme is supported by the observations of Yarus and colleagues who synthesized an analogue of the tetrahedral intermediate by joining an oligonucleotide having the sequence CCdA to puromycin via a phosphoramide group (Welch et al. (1995) Biochemistry 34: 385-390). The sequence CCA, which is the 3′ terminal sequence of all tRNAs, binds to the large subunit by itself, consistent with the biochemical data showing that the interactions between tRNAs and the large subunit largely depend on their CCA sequences (Moazed et al. (1991) Proc. Natl. Acad. Sci. USA 88: 3725-3728). Puromycin is an aa-tRNA analogue that interacts with the ribosomal A-site, and the phosphoramide group of the compound mimics the tetrahedral carbon intermediate. This transition state analogue, CCdA-phosphate-puromycin (CCdA-p-Puro), binds tightly to the ribosome, and inhibits its peptidyl transferase activity (Welch et al. (1995) supra).
V. Structure Determination of Macromolecules Using X-Ray Crystallography
In order to better describe the efforts undertaken to determine the structure of ribosomes, a general overview of X-ray crystallography is provided below.
Each atom in a crystal scatters X-rays in all directions, but crystalline diffraction is observed only when a crystal is oriented relative to the X-ray beam so that the atomic scattering interferes constructively. The orientations that lead to diffraction may be computed if the wavelength of the X-rays used and the symmetry and dimensions of the crystal's unit cell are known (Blundell et al. (1976) Protein Crystallography (Molecular Biology Series), Academic Press, London). The result is that if a detector is placed behind a crystal that is being irradiated with monochromatic X-rays of an appropriate wavelength, the diffraction pattern recorded will consist of spots, each spot representing one of the orientations that gives rise to constructive interference.
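As a rough illustration of how a diffraction orientation follows from the X-ray wavelength and the spacing of lattice planes, the following minimal sketch applies Bragg's law, n·λ = 2·d·sin θ. The text above does not name this relation explicitly, and the function names and any numerical inputs are illustrative assumptions rather than part of the work described here.

```python
import math


def bragg_angle_deg(wavelength, d_spacing, order=1):
    """Return the Bragg angle theta (degrees) for lattice planes separated
    by d_spacing, or None if no diffraction is possible at this order.

    wavelength and d_spacing must share the same length unit (e.g., angstroms).
    """
    if wavelength <= 0 or d_spacing <= 0 or order < 1:
        raise ValueError("wavelength, d_spacing must be > 0 and order >= 1")
    sin_theta = order * wavelength / (2.0 * d_spacing)
    if sin_theta > 1.0:
        # The planes are too closely spaced to diffract this wavelength.
        return None
    return math.degrees(math.asin(sin_theta))


def finest_spacing(wavelength, max_angle_deg):
    """Smallest first-order plane spacing that still diffracts at or below
    max_angle_deg; smaller values correspond to finer detail."""
    return wavelength / (2.0 * math.sin(math.radians(max_angle_deg)))
```

For example, `bragg_angle_deg(1.5418, 3.0)` gives the angle at which 3 Å planes would diffract an assumed copper K-alpha wavelength of 1.5418 Å.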
Each spot in such a pattern, however it is recorded, is characterized by (i) an intensity (its blackness); (ii) a location, which encodes the information about diffraction orientation; and (iii) a phase. If all of those things are known about each spot in a crystal diffraction pattern, the distribution of electrons in the unit cell of the crystal may be computed by Fourier transformation (Blundell et al. (1976) supra), and from that distribution or electron density map, atomic positions can be determined.
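The Fourier relationship between diffraction spots and electron density can be sketched in one dimension. The example below is a simplified illustration under stated assumptions: a one-dimensional unit cell, point-like atoms, and hypothetical function names. It computes a structure factor (amplitude and phase) for each index h and then sums those terms back into a density profile.

```python
import cmath
import math


def structure_factors(atoms, h_max):
    """Compute F_h for h = 0..h_max in a 1D unit cell.

    atoms is a list of (fractional_position, electron_count) pairs,
    treated as point scatterers (a simplifying assumption).
    """
    return [
        sum(z * cmath.exp(2j * math.pi * h * x) for x, z in atoms)
        for h in range(h_max + 1)
    ]


def density(amplitudes, phases, x):
    """Fourier synthesis of the density at fractional position x.

    Uses the symmetry F(-h) = conj(F(h)) of a real density, so only
    h >= 0 terms are needed: F_0 + 2 * sum |F_h| cos(2*pi*h*x - phi_h).
    The result is in arbitrary units (no division by cell length).
    """
    rho = amplitudes[0]
    for h in range(1, len(amplitudes)):
        rho += 2.0 * amplitudes[h] * math.cos(2.0 * math.pi * h * x - phases[h])
    return rho


# Two hypothetical atoms: 8 electrons at x = 0.2 and 6 electrons at x = 0.7.
atoms = [(0.2, 8.0), (0.7, 6.0)]
F = structure_factors(atoms, h_max=10)
amplitudes = [abs(f) for f in F]
phases = [cmath.phase(f) for f in F]

# Sample the reconstructed density on a grid of 100 points across the cell.
grid = [k / 100 for k in range(100)]
profile = [density(amplitudes, phases, x) for x in grid]
peak_index = max(range(100), key=lambda k: profile[k])
```

With correct phases the strongest peak falls at the heavier atom; replacing `phases` with arbitrary values while keeping `amplitudes` generally scrambles the profile, which is why amplitudes alone do not suffice.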
Unfortunately, the phase information essential for computing electron distributions cannot be measured directly from diffraction patterns. One of the methods routinely used to determine the phases of macromolecules, such as proteins and nucleic acids, is called multiple isomorphous replacement (MIR) which involves the introduction of new X-ray scatterers into the unit cell of the crystal. Typically, these additions are heavy atoms, which make a significant contribution to the diffraction pattern. It is important that the additions be sufficiently low in number so that their positions can be located and that they leave the structure of the molecule or of the crystal cell unaltered, i.e. the crystals should be isomorphous. Isomorphous replacement usually is performed by diffusing different heavy-metal complexes into the channels of the preformed protein crystals. Macromolecules expose side chains (such as SH groups) in these solvent channels that are able to bind heavy metals. It is also possible to replace endogenous light metals in metalloproteins with heavier ones, e.g., zinc by mercury, or calcium by samarium. Alternatively, the isomorphous derivative can be obtained by covalently attaching a heavy metal to the macromolecule in solution and then subjecting it to crystallization conditions.
Heavy metal atoms routinely used for isomorphous replacement include but are not limited to mercury, uranium, platinum, gold, lead, and selenium. Specific examples include mercury chloride, ethyl-mercury phosphate, osmium pentamine, and iridium pentamine. Since such heavy metals contain many more electrons than the light atoms (H, N, C, O, and S) of the protein, the heavy metals scatter X-rays more strongly. All diffracted beams would therefore increase in intensity after heavy-metal substitution if all interference were positive. In fact, however, some interference is negative; consequently, following heavy-metal substitution, some spots increase in intensity, others decrease, and many show no detectable difference.
Phase differences between diffracted spots can be determined from intensity changes following heavy-metal substitution. First, the intensity differences are used to deduce the positions of the heavy atoms in the crystal unit cell. Fourier summations of these intensity differences give maps of the vectors between the heavy atoms, the so-called Patterson maps. From these vector maps, the atomic arrangement of the heavy atoms is deduced. From the positions of the heavy metals in the unit cell, the amplitudes and phases of their contribution to the diffracted beams of protein crystals containing heavy metals are calculated.
This knowledge then is used to find the phase of the contribution from the protein in the absence of the heavy-metal atoms. Because both the phase and amplitude of the heavy-metal contribution are known, as are the amplitude of the protein alone and the amplitude of the protein plus heavy metals (i.e., the protein heavy-metal complex), one phase and three amplitudes are known. From this, the interference of the X-rays scattered by the heavy metals and protein can be calculated to determine whether the interference is constructive or destructive. The extent of positive or negative interference, together with knowledge of the phase of the heavy metal, gives an estimate of the phase of the protein. Because two different phase angles are determined and are equally good solutions, a second heavy-metal complex can be used, which also gives two possible phase angles. Only one of these will have the same value as one of the two previous phase angles; it therefore represents the correct phase angle. In practice, more than two different heavy-metal complexes are usually made in order to give a reasonably good estimate of the phase for all reflections. Each individual phase estimate contains experimental errors arising from errors in the measured amplitudes. Furthermore, for many reflections, the intensity differences after one particular isomorphous replacement are too small to measure, so other replacements must be tried.
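The two-fold phase ambiguity described above, and its resolution by a second derivative, can be sketched for a single reflection. This is a minimal illustration assuming error-free amplitudes and the vector relation F_PH = F_P + F_H, from which |F_PH|² = |F_P|² + |F_H|² + 2|F_P||F_H| cos(φ_P − φ_H). The function names and numerical values are hypothetical.

```python
import cmath
import math


def candidate_phases(fp_amp, fph_amp, fh):
    """Return the two protein phases consistent with one derivative.

    fp_amp:  amplitude of the native protein reflection
    fph_amp: amplitude of the heavy-atom derivative reflection
    fh:      complex heavy-atom contribution (amplitude and phase known)
    Assumes fp_amp and abs(fh) are both non-zero.
    """
    fh_amp, fh_phase = abs(fh), cmath.phase(fh)
    cos_delta = (fph_amp**2 - fp_amp**2 - fh_amp**2) / (2.0 * fp_amp * fh_amp)
    cos_delta = max(-1.0, min(1.0, cos_delta))  # guard against rounding
    delta = math.acos(cos_delta)
    return (fh_phase + delta, fh_phase - delta)


def angular_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = (a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


def resolve_phase(fp_amp, derivative_1, derivative_2):
    """Pick the candidate from derivative 1 that best agrees with any
    candidate from derivative 2. Each derivative is (fph_amp, fh)."""
    first = candidate_phases(fp_amp, *derivative_1)
    second = candidate_phases(fp_amp, *derivative_2)
    best = min(first, key=lambda p: min(angular_diff(p, q) for q in second))
    return best % (2.0 * math.pi)


# Synthetic reflection with a known "true" protein phase of 1.0 radian.
f_p = cmath.rect(100.0, 1.0)
f_h1 = cmath.rect(20.0, 0.3)   # heavy-atom contribution, derivative 1
f_h2 = cmath.rect(15.0, 2.5)   # heavy-atom contribution, derivative 2
estimate = resolve_phase(
    abs(f_p),
    (abs(f_p + f_h1), f_h1),
    (abs(f_p + f_h2), f_h2),
)
```

With real data the candidates from different derivatives never coincide exactly, so practical phasing weights each estimate by its expected error instead of picking a single closest match.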
The amplitudes and the phases of the diffraction data from the protein crystals are used to calculate an electron-density map of the repeating unit of the crystal. This map then is interpreted to accommodate the residues of the molecule of interest. That interpretation is made more complex by several limitations in the data. First, the map itself contains errors, mainly due to errors in the phase angles. In addition, the quality of the map depends on the resolution of the diffraction data, which, in turn, depends on how well-ordered the crystals are. The resolution is measured in angstrom units (Å); the smaller this number is, the higher the resolution and, therefore, the greater the amount of detail that can be seen.
Building the initial model is a trial-and-error process. First, one has to decide how a polypeptide chain or nucleic acid weaves its way through the electron-density map. The resulting chain trace constitutes a hypothesis by which one tries to match the density of side chains to the known sequence of the polypeptide or nucleic acid. When a reasonable chain trace has finally been obtained, an initial model is built that fits the atoms of the molecule into the electron density. Computer graphics are used both for chain tracing and for model building to present the data and manipulate the models.
The initial model will contain some errors. Provided the crystals diffract to high enough resolution (e.g., better than 3.5 Å), most or substantially all of the errors can be removed by crystallographic refinement of the model using computer algorithms. In this process, the model is changed to minimize the difference between the experimentally observed diffraction amplitudes and those calculated for a hypothetical crystal containing the model (instead of the real molecule). This difference is expressed as an R factor (residual disagreement) which is 0.0 for exact agreement and about 0.59 for total disagreement.
In general, the R factor for a well-determined macromolecular structure preferably lies between 0.15 and 0.35 (such as less than about 0.24-0.28). The residual difference is a consequence of errors and imperfections in the data. These derive from various sources, including slight variations in the conformation of the protein molecules, as well as inaccurate corrections both for the presence of solvent and for differences in the orientation of the microcrystals from which the crystal is built. This means that the final model represents an average of molecules that are slightly different both in conformation and orientation.
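The R factor discussed above is conventionally computed as the sum of the absolute differences between observed and calculated amplitudes divided by the sum of the observed amplitudes. The text does not spell out the formula, so the sketch below assumes that conventional definition and omits the scale factor normally applied to the calculated amplitudes; the function name is hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Residual disagreement between observed and calculated amplitudes.

    f_obs and f_calc are equal-length sequences of non-negative structure
    factor amplitudes for the same reflections, assumed to be on a common
    scale. Returns 0.0 for exact agreement.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must list the same reflections")
    total = sum(f_obs)
    if total <= 0:
        raise ValueError("observed amplitudes must sum to a positive value")
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / total
```

During refinement, a function of this kind would be evaluated repeatedly as the model is adjusted, with a falling value indicating that the model accounts for more of the measured data.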
In refined structures at high resolution, there are usually no major errors in the orientation of individual residues, and the estimated errors in atomic positions are usually around 0.1-0.2 Å, provided the sequence of the protein or nucleic acid is known. Hydrogen bonds, both within the molecule of interest and to bound ligands, can be identified with a high degree of confidence.
Typically, X-ray structures can be determined provided the resolution is better than 3.5 Å. Electron-density maps are interpreted by fitting the known amino acid and/or nucleic acid sequences into regions of electron density.
VI. The Need for Higher Resolution for the 50S Ribosomal Subunit
Although the art provides crystals of the 50S ribosomal subunit, and 9 Å and 5 Å resolution X-ray crystallographic maps of the structure of the 50S ribosome, the prior art crystals and X-ray diffraction data are not sufficient to establish the three-dimensional structures of all 31 proteins and 3,043 nucleotides of the 50S ribosomal subunit. Thus, the prior art crystals and maps are inadequate for the structure-based design of active agents, such as herbicides, drugs, insecticides, and animal poisons.
More detailed, higher resolution X-ray crystallographic maps are necessary in order to determine the location and three-dimensional structure of the proteins and nucleotides in ribosomes and ribosomal subunits, particularly for the 50S ribosomal subunit. An accurate molecular structure of the 50S ribosomal subunit will enable not only further investigation and understanding of the mechanism of protein synthesis, but also the development of effective therapeutic agents and drugs that modulate (e.g., induce or inhibit) protein synthesis.