Retrovirus assembly, a key step in the viral replication cycle, involves a process in which a large number of chemically distinct macromolecules are transported through different pathways to a single point at the plasma membrane of the cell where they are assembled into a nascent viral particle. The internal protein shell or capsid of the virus is assembled from a large number of polyprotein precursors that must be transported through the cytoplasm, either preassembled, in small groups, or as monomers to the underside of the plasma membrane. The membrane-spanning viral glycoproteins, on the other hand, must be transported through the secretory pathway of the cell to the plasma membrane where they co-localize with the nascent, membrane-extruding capsid. At a point still undetermined in the capsid assembly process, genome-length viral RNA molecules, along with necessary smaller cell-derived RNAs, must become associated with both capsid and polymerase components. Thus interactions between viral proteins themselves, between proteins of viral and cell origin, as well as those between viral proteins, nucleic acids, and lipids are at the heart of the assembly process.
All replication-competent retroviruses contain four genes that encode the structural and enzymatic components of the virion. These are gag (capsid protein), pro (aspartyl proteinase), pol (reverse transcriptase and integrase enzymes) and env (envelope glycoprotein). Unlike most other enveloped RNA viruses, in which the viral glycoproteins appear to catalyze virus particle formation, retroviruses can assemble and release particles when capsid proteins are produced in the absence of the other gene products. Several studies have shown that expression of the gag gene alone in a number of systems results in the efficient assembly and release of membrane enveloped virions (Craven, R. C., et al. (1996), “Dynamic interactions of the Gag polyprotein,” Current Topics in Microbiology and Immunology 214, pp. 65–94; Delchambre, M., et al. (1989), “The Gag precursor of simian immunodeficiency virus assembles into virus-like particles,” EMBO J. 8, pp. 2653–60; Dickson, C., et al. (1984), “Protein biosynthesis and assembly,” RNA tumor viruses (R. Weiss, N. Teich, H. Varmus, and J. Coffin, Eds.), Vol. 1, pp. 513–648. 2 vols. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.; Gheysen, H. P., et al. (1989), “Assembly and release of HIV-1 precursor Pr55 gag virus-like particles from recombinant baculovirus-infected insect cells,” Cell 59, pp. 103–12; Haffar, O., et al. (1990), “Human immunodeficiency virus-like, non-replication, Gag-Env particles assemble in a recombinant vaccinia virus expression system,” J. Virol. 64, pp. 2653–59; Hunter, E. (1994), “Macromolecular interactions in the assembly of HIV and other retroviruses,” Sem. in Virology 5, pp. 71–83; Kräusslich, H.-G., et al. (1996), “Intracellular transport of retroviral capsid components,” Current Topics in Microbiology and Immunology 214, pp. 25–64; Madisen, L., et al. (1987), “Expression of the human immunodeficiency virus gag gene in insect cells,” Virology 158, pp. 248–250; Smith, A. J., et al. (1990), “Human immunodeficiency virus type 1 Pr55 gag and Pr160 gag-pol expressed from a simian virus 40 late-replacement vector are efficiently processed and assembled into virus-like particles,” J. Virol. 64, pp. 2743–50; Sommerfelt, M. A., et al. (1992), “Importance of the p12 protein in Mason-Pfizer monkey virus assembly and infectivity,” J. Virol. 66, pp. 7005–11; Wills, J. W., et al. (1989), “Creation and expression of myristylated forms of Rous sarcoma virus Gag protein in mammalian cells,” J. Virol. 63, pp. 4331–43). Thus, the product of this gene has the necessary structural information to mediate intracellular transport, to direct assembly into the capsid shell, and to catalyze the process of membrane extrusion known as budding.
The gag gene product, a polyprotein precursor, is translated on free polyribosomes from an unspliced, genome length mRNA (Eisenman, R. N., et al. (1974), “Synthesis of avian RNA tumor virus structural proteins,” Cold Spring Harbor Symp. Quant. Biol. 39, pp. 1067–1075). Such precursors will generally follow one of two pathways during the process of viral morphogenesis (Gelderblom, H. (1990), “Morphogenesis, maturation, and fine structure of lentiviruses,” Retroviral Proteases: Control of Maturation and Morphogenesis (L. H. Pearl, Ed.), pp. 159–80. Stockton Press, New York, N.Y.). In most retroviruses, the nascent Gag polyproteins are transported directly to the plasma membrane where assembly of the immature capsid shell and membrane extrusion occur simultaneously. Viruses that undergo this form of morphogenesis are known as type-C viruses and include the avian and mammalian leukemia/sarcoma viruses (e.g., Rous sarcoma, avian leukosis and murine leukemia virus) (Teich, N. (1982), “Taxonomy of retroviruses,” 2nd ed., RNA tumor viruses (R. Weiss, N. Teich, H. E. Varmus, and J. M. Coffin, Eds.), pp. 25–207, Cold Spring Harbor Laboratory, New York). The pathogenic human viruses, human T-cell leukemia virus and human immunodeficiency virus (HTLV-I and HIV), assemble their capsids in a similar fashion. In the second morphogenic class of viruses, the Gag precursors appear to be targeted to an intracytoplasmic site where immature capsid assembly occurs (Rhee, S. S., et al. (1990), “A single amino acid substitution within the matrix protein of a type D retrovirus converts its morphogenesis to that of a type C retrovirus,” Cell 63, pp. 77–86; Rhee, S. S., et al. (1991), “Amino acid substitutions within the matrix protein of type D retroviruses affect assembly, transport and membrane association of a capsid,” EMBO J. 10, pp. 535–46). These preassembled immature capsids are then transported to the plasma membrane where they undergo budding and envelopment. 
Viruses that undergo this process of assembly and release include the type-B, mouse mammary tumor virus (MMTV), the type-D, Mason-Pfizer monkey virus (M-PMV) and related simian retroviruses (SRV1–5), as well as members of the spumavirus family (Gelderblom, H. (1990), “Morphogenesis, maturation, and fine structure of lentiviruses,” Retroviral Proteases: Control of Maturation and Morphogenesis (L. H. Pearl, Ed.), pp. 159–80, Stockton Press, New York, N.Y.; Teich, N. (1982), “Taxonomy of retroviruses,” 2nd ed., RNA tumor viruses (R. Weiss, N. Teich, H. E. Varmus, and J. M. Coffin, Eds.), pp. 25–207. Cold Spring Harbor Laboratory, New York). Despite the different morphogenic pathways, the process by which Gag precursors assemble into immature capsids is probably similar for the type-C and type-B/D viruses, since a single amino acid change within the gag gene product of M-PMV can divert Gag to the type-C morphogenic pathway (Rhee, S. S., et al. (1990), “A single amino acid substitution within the matrix protein of a type D retrovirus converts its morphogenesis to that of a type C retrovirus,” Cell 63, pp. 77–86). Irrespective of the pathway to virus release, the newly budded virions have a common immature morphology. In thin-section electron microscopy, the immature capsid shell appears as an electron opaque band in tight apposition to the membrane with an electron lucent center. During the process of virus maturation, the capsid polyprotein precursors are cleaved by the virus-encoded aspartyl proteinase, which leads to collapse of the structure into an electron dense core with a morphology characteristic of the virus family (Gelderblom, H. R. (1991), “Assembly and morphology of HIV: potential effect of structure on viral function,” AIDS 5, pp. 617–38; Nermut, M. V., et al. (1996), “Comparative morphology and structural classification of retroviruses,” Current Topics in Microbiology and Immunology 214, pp. 1–24).
The Gag polyprotein precursor functions as the primary building block in virus capsid assembly and is cleaved during maturation by the viral proteinase to yield a number of individual proteins that make up the mature virion. The translation of these products as a precursor protein thus ensures that equimolar amounts of each of the structural proteins are incorporated into the virus. While the size and protein content of the precursor vary between different retroviral families, at least three gag-encoded proteins are found in all retroviruses; these are the matrix protein (MA), the capsid protein (CA), and the nucleocapsid protein (NC) (Leis, J., et al. (1988), “Standardized and simplified nomenclature for proteins common to all retroviruses,” J. Virol. 62, pp. 1808–9). The matrix protein is closely associated with the viral membrane with which it can be chemically cross-linked (Gebhardt, A., et al. (1984), “Rous sarcoma virus p19 and gp35 can be chemically crosslinked to high molecular weight complexes: An insight into virus assembly,” J. Mol. Biol. 174, pp. 297–317; Gelderblom, H. R., et al. (1987), “Fine structure of human immunodeficiency virus (HIV) and immunolocalization of structural proteins,” Virology 156, pp. 171–176; Pepinsky, R. B., et al. (1979), “Identification of retrovirus matrix proteins by lipid-protein crosslinking,” J. Mol. Biol. 131, pp. 819–837). This amino-terminal domain of the Gag precursor plays a major role in directing the protein to the site of assembly and may be important for the process of membrane extrusion itself, since the MA domain of simian immunodeficiency virus (SIV) expressed in the absence of other Gag domains can direct the budding process (Gonzalez, S. A., et al. (1993), “Assembly of the matrix protein of simian immunodeficiency virus into virus-like particles,” Virology 194, pp. 548–56). 
As in most retroviruses, the HIV MA is modified co-translationally by the N-terminal addition of a myristic acid residue that is critical for its function (Bryant, M., et al. (1990), “Myristoylation-dependent replication and assembly of human immunodeficiency virus 1,” Proc. Natl. Acad. Sci. USA 87, pp. 523–527; Göttlinger, H. G., et al. (1989), “Role of capsid precursor processing and myristoylation in morphogenesis and infectivity of human immunodeficiency virus type I,” Proc. Natl. Acad. Sci. USA 86, pp. 5781–85). The crystal structure of MA from both HIV and SIV has been determined. Individual MA molecules are composed of 5 major helices capped by a three-stranded β-sheet. The protein assembles into trimers that could create a large, bipartite membrane-binding surface in which exposed basic residues, together with the myristyl moiety, could anchor the protein on the acidic inner side of the viral membrane (Hill, C. P., et al. (1996), “Crystal structures of the trimeric human immunodeficiency virus type 1 matrix protein: implications for membrane association and assembly,” Proc. Natl. Acad. Sci. U.S.A. 93, pp. 3099–104; Rao, Z., et al. (1995), “Crystal structure of SIV matrix antigen and implications for virus assembly,” Nature 378, pp. 743–7).
The CA protein forms the major protein component of the electron dense core in mature virions, where it appears to form a protein shell into which the virion RNA genome and replicative enzymes are packed (Bolognesi, D. P., et al. (1973), “Localization of RNA tumor virus polypeptides. I. isolation of further virus substrates,” Virology 56, pp. 549–64; Gelderblom, H. R. (1991), “Assembly and morphology of HIV: potential effect of structure on viral function,” AIDS 5, pp. 617–38; Gelderblom, H. R., et al. (1987), “Fine structure of human immunodeficiency virus (HIV) and immunolocalization of structural proteins,” Virology 156, pp. 171–176). The structure of the N-terminal domain of CA, determined recently by NMR and crystallography, is unlike those of most previously characterized viral coat proteins in that it is predominantly helical—each monomer within the crystallized dimer consists of seven alpha-helices, five of which are arranged in a coil-like structure. The domain is shaped like an arrowhead, with two beta hairpins and a surface-loop exposed at the trailing edge, and the carboxyl-terminal helix projecting from the tip (Gitti, R. K., et al. (1996), “Structure of the amino-terminal core domain of the HIV-1 capsid protein,” Science 273, pp. 231–5; Momany, C., et al. (1996), “Crystal structure of dimeric HIV-1 capsid protein,” Nat Struct Biol 3, pp. 763–70). The core protein of the hepatitis B virus is also composed of helices, two of which contribute to an intermolecular 4-helix bundle to form a dimer (Böttcher, B., et al. (1997), “Determination of the fold of the core protein of hepatitis B virus by electron cryomicroscopy,” Nature 386, pp. 88–91; Conway, J. F., et al. (1997), “Visualization of a 4-helix bundle in the hepatitis B virus capsid by cryo-electron microscopy,” Nature 386, pp. 91–94).
The NC protein is located within the CA-derived shell where it is found associated with the viral RNA genome (Linial, M. L., et al. (1990), “Retroviral RNA packaging: Sequence requirements and implications,” Curr. Top. Microbiol. and Immunol. 157, pp. 125–152). This domain of the Gag precursor, in all retroviruses but the spumaviridae, contains a conserved cysteine-histidine rich, zinc finger-like region (Cys-X2-Cys-X4-His-X4-Cys) that is thought to play an important role in the specific packaging of viral RNA into the assembling virus (Berkowitz, R., et al. (1996), “RNA packaging,” Current Topics in Microbiology and Immunology 214, pp. 177–218; Berkowitz, R. D., et al. (1993), “Specific binding of human immunodeficiency virus type 1 gag polyprotein and nucleocapsid protein to viral RNAs detected by RNA mobility shift assays,” J. Virol. 67, pp. 7190–7200; Gorelick, R., et al. (1990), “Non-infectious human immunodeficiency virus type 1 mutants deficient in genomic RNA,” J. Virol. 64, pp. 3207–11; Gorelick, R. J., et al. (1988), “Point mutants of Moloney murine leukemia virus that fail to package viral RNA: Evidence for specific RNA recognition by a “zinc-finger-like” protein sequence,” Proc. Natl. Acad. Sci. USA 85, pp. 8420–24; Katz, R. A., et al. (1989), “What is the role of the cys-his motif in retroviral nucleocapsid (NC) proteins?”, BioEssays 11, pp. 176–81; Meric, C., et al. (1989), “Characterization of Moloney murine leukemia virus mutants with single amino acid substitutions in the cys-his box of the nucleocapsid protein,” J. Virol. 63, pp. 1558–68; Meric, C., et al. (1988), “Mutations in Rous sarcoma virus nucleocapsid protein p12 (NC): deletions of Cys-His boxes,” J. Virol. 62, pp. 3328–33; Sakalian, M., et al. (1994), “Efficiency and selectivity of RNA packaging by Rous sarcoma virus Gag deletion mutants,” J. Virol. 68, pp. 5969–81).
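The conserved Cys-His spacing quoted above lends itself to a simple pattern search. The following is a minimal sketch, assuming protein sequences are supplied in one-letter amino acid code; the function name `find_cchc_motifs` and the toy sequence are hypothetical and introduced here for illustration only, and a match indicates only that the spacing is present, not that the region binds zinc or RNA.

```python
import re

# Cys-X2-Cys-X4-His-X4-Cys in one-letter code; [A-Z] stands in for "any residue" (X).
# The lookahead wrapper lets overlapping occurrences be reported as well.
CCHC_PATTERN = re.compile(r"(?=(C[A-Z]{2}C[A-Z]{4}H[A-Z]{4}C))")


def find_cchc_motifs(sequence):
    """Return (0-based start, matched residues) for each occurrence of the spacing."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CCHC_PATTERN.finditer(seq)]


# Hypothetical toy sequence, invented for illustration; not a real NC sequence.
toy_sequence = "MKAACAACGGGGHAAAACKK"
print(find_cchc_motifs(toy_sequence))
```

A scan of this kind could be used to flag candidate Cys-His boxes in a Gag sequence before any mutational analysis; it says nothing about function on its own.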
The arrangement of the proteins on the precursor (NH2-MA-CA-NC-COOH) reflects their position in the virion, where they appear to form concentric shells of protein after cleavage from the precursor. This interpretation is supported by immuno-electron microscopy, detergent fractionation studies, and chemical cross-linking analyses (Gelderblom, H. R., et al. (1987), “Fine structure of human immunodeficiency virus (HIV) and immunolocalization of structural proteins,” Virology 156, pp. 171–176; Pepinsky, R. B., et al. (1980), “Chemical cross-linking of proteins in avian sarcoma and leukemia viruses,” Virology 102, pp. 205–10; Stromberg, K., et al. (1974), “Structural studies of avian myeloblastosis virus: comparison of polypeptides in virion and core component by dodecyl sulfate-polyacrylamide gel electrophoresis,” J. Virol. 13, pp. 513–28). Despite a common organization, Gag precursors from different retroviruses share little amino acid sequence homology except for a conserved region of approximately 20 amino acids in CA, termed the major homology region (MHR, Wills, J. W., et al. (1991), “Form, function, and use of retroviral Gag proteins,” AIDS 5, pp. 639–54), and the conserved cysteine-histidine motifs in the NC domain. Functional homologies must thus be reflected at the level of three-dimensional structure, as has been observed between retroviral proteinases (Weber, I. (1990), “Comparison of the crystal structures and intersubunit interactions of human immunodeficiency and Rous sarcoma virus proteases,” J. Biol. Chem. 265, pp. 10492–96) and among the MA protein structures of HIV, SIV, BLV (Matthews, S., et al. (1996), “The solution structure of the bovine leukaemia virus matrix protein and similarity with lentiviral matrix proteins,” EMBO J. 15, pp. 3267–74), and M-PMV (Conte, M. R., et al. (1997), “The three-dimensional solution structure of the matrix protein from the type D retrovirus, the Mason-Pfizer Monkey virus,” submitted), which share little sequence homology but which maintain very similar three-dimensional structures.
Production of a nascent particle with a defined size, density, and morphology requires that Gag proteins 1) find each other, 2) interact in a regular and stable manner to form the spherical, immature capsid, 3) associate with the plasma membrane, and 4) drive the budding process. The amino acid sequences of Gag that are involved in these processes, as well as those which might have other functions in the virus replication cycle, are being ascertained through mutational analyses. This approach, which has been explored in a variety of retroviruses, is reviewed in detail by Craven and Parent (Craven, R. C., et al. (1996), “Dynamic interactions of the Gag polyprotein,” Current Topics in Microbiology and Immunology 214, pp. 65–94). It is important to keep in mind that the assembly domains within the Gag precursor may not necessarily reside within the boundaries of the mature cleavage products of Gag but may span the cleavage sites. Thus, PR-mediated processing of the Gag precursor destroys these assembly functions and defines the transition from an assembly function of Gag to an entry/infection one where there is a requirement for efficient disassembly and release of a transcriptionally active core upon infection of a new cell.
Evidence for the existence of assembly domains within Gag proteins has been obtained by mutational analysis. Of the different Gag proteins that have been examined with regard to the specific amino acids involved in particle formation, the RSV Gag protein is by far the best defined. This type of analysis has yielded striking results: several assembly domains, together comprising less than 30% of the total Gag precursor, have been defined and partially characterized.
All Gag proteins appear to require their amino termini for membrane association. In RSV, the amino-terminal assembly domain (M) appears to include the first half of the MA domain, since small deletions in this region destroy capsid assembly and budding, and the precursors fail to localize at the plasma membrane. These results are similar to those from studies with mammalian retroviruses in which myristylation has been blocked. The membrane-binding domains of Gag proteins from other retroviruses are also contained in their amino-terminal sequences (Bennett, R. P., et al. (1993), “Functional chimeras of the Rous sarcoma virus and human immunodeficiency virus gag proteins,” J. Virol. 67, pp. 6487–98; Rhee, S. S., et al. (1987), “Myristylation is required for intracellular transport but not for assembly of D-type retrovirus capsids,” J. Virol. 61, pp. 1045–53; Rhee, S. S., et al. (1991), “Amino acid substitutions within the matrix protein of type D retroviruses affect assembly, transport and membrane association of a capsid,” EMBO J. 10, pp. 535–46; Spearman, P., et al. (1994), “Identification of human immunodeficiency virus type 1 Gag protein domains essential to membrane binding and particle assembly,” J. Virol. 68, pp. 3232–42; Yu, X., et al. (1992), “The matrix protein of human immunodeficiency virus type 1 is required for incorporation of viral envelope protein into mature virions,” J. Virol. 66, pp. 4966–71; Zhou, W., et al. (1994), “Identification of a membrane-binding domain within the amino-terminal region of human immunodeficiency virus type 1 Gag protein which interacts with acidic phospholipids,” J. Virol. 68, pp. 2556–69). Recent NMR and crystallographic studies of bacterially expressed HIV MA protein (p17) have provided insights into the three dimensional structure of this normally membrane-associated molecule (Conte, M. R., et al. (1997), “The three-dimensional solution structure of the matrix protein from the type D retrovirus, the Mason-Pfizer Monkey virus,” submitted; Hill, C. P., et al. (1996), “Crystal structures of the trimeric human immunodeficiency virus type 1 matrix protein: implications for membrane association and assembly,” Proc. Natl. Acad. Sci. U.S.A. 93, pp. 3099–104; Matthews, S., et al. (1994), “Structural similarity between the p17 matrix protein of HIV-1 and interferon-Gamma,” Nature 370, pp. 666–8; Rao, Z., et al. (1995), “Crystal structure of SIV matrix antigen and implications for virus assembly,” Nature 378, pp. 743–7). Although predominantly helical, a prominent feature of p17MA is an irregular β-sheet, the solvent-exposed side of which provides a surface that could associate with the inner face of the membrane, since several basic side chains (K8, R20, R22, K26–28, K30, K32, K95) are available for interaction with phospholipid head groups (Matthews, S., et al. (1994), “Structural similarity between the p17 matrix protein of HIV-1 and interferon-Gamma,” Nature 370, pp. 666–8). Indeed, mutations which alter the charge distribution in this region have significant effects on virus assembly (Gonzalez, S. A., et al. (1993), “Assembly of the matrix protein of simian immunodeficiency virus into virus-like particles,” Virology 194, pp. 548–56; Yuan, X., et al. (1993), “Mutations in the N-terminal region of human immunodeficiency virus type 1 matrix protein block intracellular transport of the Gag precursor,” J. Virol. 67, pp. 6387–94; Zhou, W., et al. (1994), “Identification of a membrane-binding domain within the amino-terminal region of human immunodeficiency virus type 1 Gag protein which interacts with acidic phospholipids,” J. Virol. 68, pp. 2556–69).
A second assembly domain (L) has been identified for RSV that appears to mediate a late stage in the budding process. This domain includes a PPPY (WW-binding) motif that is physically located within the carboxy-terminus of the “spacer peptide” p2 (Garnier, L., et al. (1996), “WW domains and retrovirus budding,” Nature 381, pp. 744–745). Mutations within this region appear to block the final stages of budding (Wills, J. W., et al. (1994), “An assembly domain of the Rous sarcoma virus Gag protein required late in budding,” J Virol 68, pp. 6605–18). A similar motif is found within the pp16 region of M-PMV where mutagenesis studies yielded a similar phenotype (Yasuda, J., et al. (1997), “A proline-rich motif (PPPY) in the Gag polyprotein of Mason-Pfizer monkey virus plays a maturation-independent role in virion release,” J. Virol., Submitted for publication). In HIV, the carboxy-terminal peptide sequence, p6, appears to play an analogous role. Truncations or deletions of this domain result in the accumulation of immature particles still attached to the plasma membrane by a thin stalk (Göttlinger, H. G., et al. (1991), “Effect of mutations affecting the p6 gag protein on human immunodeficiency virus particle release,” Proc. Natl. Acad. Sci. U.S.A. 88, pp. 3195–99). Curiously, the L domain may be moved in position within the Gag precursor molecule and still function, and domains from one retrovirus may function in another (Parent, L. J., et al. (1995), “Positionally independent and exchangeable late budding functions of the Rous sarcoma virus and human immunodeficiency virus Gag proteins,” Journal of Virology 69, pp. 5455–60).
For those Gag proteins that have been examined in detail, there appears to be a specific domain that is essential for the production of particles with the correct density and size. In RSV, this domain (I) spans the carboxy-terminal end of the CA domain and half of the NC domain, and is essential for the production of particles with the correct density (Weldon, R. A., Jr., et al. (1993), “Characterization of a small (25 kDa) derivative of the Rous sarcoma virus Gag protein competent for particle release,” J. Virol., In Press). HIV and MuLV Gag proteins also require analogous regions for the production of particles with the correct density (Jones, T. A., et al. (1990), “Assembly of gag-β-galactosidase proteins into retrovirus particles,” J. Virol. 64, pp. 2229–65; Jowett, J. B. M., et al. (1992), “Distinct signals in human immunodeficiency virus type 1 Pr55 necessary for RNA binding and particle formation,” J. Gen. Virol. 73, pp. 3079–86). Furthermore, addition of this domain from HIV to a mutant RSV Gag protein that assembles into low-density particles can restore dense particle formation (Bennett, R. P., et al. (1993), “Functional chimeras of the Rous sarcoma virus and human immunodeficiency virus gag proteins,” J. Virol. 67, pp. 6487–98). Although the mechanism by which this domain influences particle density is not known, it could establish the correct protein-protein interactions that allow the tight packing of Gag molecules during particle formation. Alternatively, since this region contains sequences implicated in RNA packaging, this domain may influence particle density by directly mediating RNA encapsidation (Weldon, R. A., Jr., et al. (1993), “Characterization of a small (25 kDa) derivative of the Rous sarcoma virus Gag protein competent for particle release,” J. Virol., In Press). Thus, RNA could serve as a necessary scaffold upon which Gag proteins tightly pack during particle assembly.
Finally, there appears to be a region in Gag that influences particle size. In RSV, this region is located within the p10 and amino-terminal two-thirds of the CA domains. Mutants lacking this region can assemble into particles of the correct density, but these particles are heterogeneous in size (Weldon, R. A., Jr., et al. (1993), “Characterization of a small (25 kDa) derivative of the Rous sarcoma virus Gag protein competent for particle release,” J. Virol., In Press). Similarly, several mutations in the highly conserved MHR region of M-PMV yield particles of aberrant size (Strambio-de-Castillia, C., et al. (1992), “Mutational analysis of the major homology region of Mason-Pfizer monkey virus by use of saturation mutagenesis,” J. Virol. 66, pp. 7021–32). Thus, if Gag proteins fold into rod-like or cone-shaped structures (Nermut, M. V., et al. (1996), “Comparative morphology and structural classification of retroviruses,” Current Topics in Microbiology and Immunology 214, pp. 1–24; Nermut, M. V., et al. (1994), “Fullerene-like organization of HIV gag-protein shell in virus-like particles produced by recombinant baculovirus,” Virology 198, pp. 288–96) and interact with one another through amino and carboxy-terminal sequences, then this region may act as a spacer that establishes the curvature of the assembling capsid and thus influences the size (or shape) of the capsid.
Expression of the M-PMV gag gene in bacteria results in the rapid formation of inclusion bodies that, in thin section electron microscopy, contain assembled capsid structures that are indistinguishable from capsids assembled in HeLa cells. These results indicate that in vivo the environment of the bacterial cytoplasm is permissive for capsid assembly. Following purification of the inclusion bodies and solubilization in 8M urea, the soluble Gag precursors can, following removal of the denaturant, assemble in vitro into immature capsid-like structures. Negative-stain electron microscopy following sucrose gradient sedimentation showed large numbers of uniform-sized capsids (Klikova, M., et al. (1995), “Efficient in vivo and in vitro assembly of retroviral capsids from Gag precursor proteins expressed in bacteria,” J. Virol. 69, pp. 1093–98). Similarly, Campbell and Vogt (Campbell, S., et al. (1995), “Self-assembly in vitro of purified CA-NC proteins from Rous sarcoma virus and human immunodeficiency virus type 1,” J. Virol 69, pp. 6487–97) expressed a CA-NC fragment of the RSV and HIV Gag precursors in E. coli. These proteins were purified in native form and, after adjustment of the pH and salt concentration, each was found to assemble at a low level of efficiency into structures that resembled circular sheets and roughly spherical particles. The presence of RNA dramatically increased the efficiency of assembly, and, in this case, the proteins formed hollow, cylindrical particles whose lengths were determined by the size of the RNA. It is possible that this latter assembly process might mimic the interactions that occur during maturation of the virus particle where the NC and CA proteins condense around the viral genome. 
More recent experiments by Campbell and Vogt, with more complete portions of the RSV Gag precursor, have demonstrated the assembly of spherical immature-like particles when the protein was combined with RNA under the proper conditions (Campbell, S., et al. (1997), “In vitro assembly of virus-like particles with Rous sarcoma virus gag deletion mutants: Identification of the p10 domain as a morphological determinant in the formation of spherical particles,” Journal of Virology 71, pp. 4425–4435). This study also identified the p10 region of Gag as the determinant for spherical particle formation and, thus, is consistent with previous results that indicated this region might act as a spacer to control the size of the assembling capsid.
Essentially all biochemical processes are initiated or maintained through highly specific and selective molecular interactions. Receptor molecules in cell membranes, antibodies, enzymes, and other macromolecules with a polypeptide character are capable of interacting with defined, specific peptide or nonpeptide structures on the basis of their binding sites. If one of the interactive sites is determined by a sequence of the peptide, it is possible to identify this site in a relatively straightforward way through the application of peptide libraries (Blake, J., et al. (1996), “Use of combinatorial peptide libraries to construct functional mimics of tumor epitopes recognized by MHC class I-restricted cytolytic T lymphocytes,” J Exp Med 184, pp. 121–30; Houghten, R. A. (1993), “The broad utility of soluble peptide libraries for drug discovery,” Gene 137, pp. 7–11; Houghten, R. A., et al. (1991), “Generation and use of synthetic peptide combinatorial libraries for basic research and drug discovery,” Nature 354, pp. 84–6; Lam, K. S., et al. (1991), “A new type of synthetic peptide library for identifying ligand-binding activity” [published errata appear in Nature 1992 Jul. 30; 358(6385): 434 and 1992 Dec. 24–31 ;360(6406): 768], Nature 354, pp. 82–4; Scott, J. K., et al. (1994), “Random peptide libraries,” Curr Opin Biotechnol 5, pp. 40–8). As Houghten (Houghten, R. A. (1994), “Combinatorial libraries. Finding the needle in the haystack,” Curr Biol 4, pp. 564–7) has pointed out, the construction of libraries consisting of millions of compounds provides a fundamental, practical advance in the study of the molecular interactions of pharmacologically relevant biochemical targets. Such libraries have been utilized in the study of antibody-antigen interactions, in the development of enzyme inhibitors and novel anti-microbial drugs, in the identification of biologically active peptides, and in the engineering of novel properties into antibodies. 
The use of peptide library technologies, followed by synthetic methodologies directed towards optimization, is a key route to obtaining peptides of desirable binding and stability properties. It facilitates the identification of small molecules that bind with high affinity to acceptor molecules and so mimic or block their interactions with the natural ligands.
The principle of libraries enables one to find, in a rapid, effective way, those particular molecules or structures that influence a particular biological system by testing a very large collection (10^6–10^9) of chemical structures simultaneously. Library-based methods that have been used so far fall into three broad categories, differing in the way in which the compounds making up the library have been synthesized and/or presented (Houghten, R. A. (1994), “Combinatorial libraries. Finding the needle in the haystack,” Curr Biol 4, pp. 564–7). The first category includes so-called fusion-protein-displayed peptide libraries, in which random peptides or proteins are expressed on the surface of filamentous phage particles, or on proteins expressed from plasmids (Scott, J. K., et al. (1994), “Random peptide libraries,” Curr Opin Biotechnol 5, pp. 40–8; Smith, G. P., et al. (1993), “Libraries of peptides and proteins displayed on filamentous phage,” Methods Enzymol 217, pp. 228–57). This approach centers on the expression of a number of copies (from a few to thousands) of the same peptide sequence on the surface of the phage. A library is produced by preparing millions of oligonucleotides and inserting these random sequences into the gene encoding the phage coat protein. Those peptide-expressing phage particles that bind to the purified and immobilized target of interest can be enriched in a selection process referred to as “biopanning”. After selection, the specific peptide sequence associated with the selected phage is determined by sequencing. The advantage of the above approach is that it involves widely available molecular biological techniques and can generate longer peptide or protein sequences than can be easily produced by chemical syntheses. The disadvantage is the restriction of peptide sequences to those containing the 20 genetically encoded amino acids as the building blocks of the library.
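The relationship between peptide length and library size can be illustrated with a short calculation. This is a sketch under the assumption that every position is fully randomized over the 20 genetically encoded amino acids, so that a peptide of length n has 20^n possible sequences; the helper names `library_size` and `lengths_within` are hypothetical. It counts theoretical sequence diversity only, not the number of clones actually recovered in a real library.

```python
NUM_AMINO_ACIDS = 20  # genetically encoded amino acids available to a displayed peptide


def library_size(peptide_length, alphabet_size=NUM_AMINO_ACIDS):
    """Number of distinct sequences when every position is fully randomized."""
    return alphabet_size ** peptide_length


def lengths_within(low, high, max_length=12):
    """Peptide lengths whose theoretical diversity falls between low and high."""
    return [n for n in range(1, max_length + 1) if low <= library_size(n) <= high]


# Tabulate diversity for a few short peptide lengths.
for n in range(4, 8):
    print(n, library_size(n))

# Lengths whose diversity lies in the 10^6 to 10^9 range.
print(lengths_within(10**6, 10**9))
```

Under these assumptions, fully randomized pentapeptides and hexapeptides fall within a collection of a million to a billion members, while heptapeptides (about 1.3 billion sequences) already exceed it.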
This fusion protein approach has also been adapted into the two-hybrid system for the identification of protein-protein interaction partners. This method, originally developed by Fields and coworkers (Fields, S., et al. (1989), “A novel genetic system to detect protein-protein interactions,” Nature 340, pp. 245–6; Fields, S., et al. (1994), “The two-hybrid system: an assay for protein-protein interactions,” Trends Genet 10, pp. 286–92), is a yeast-based genetic assay to detect protein-protein interactions in vivo. The two-hybrid method is based on the restoration of transcriptional activation by the GAL4 protein. The GAL4 protein has two functions that are independent and physically separable in the linear sequence of the protein. One function is the specific binding of the protein to upstream activation sequences and the other is transcriptional activation; transcriptional activation of genes under GAL4 control requires that the GAL4 domains exhibiting these two functions be brought into spatial proximity. In the two-hybrid system, a strain of Saccharomyces cerevisiae with an integrated copy of the GAL1-lacZ fusion gene provides a readout of GAL4 activity. This host is transformed with two plasmids encoding GAL4 fusion proteins: one plasmid encodes a fusion protein of the GAL4 DNA-binding domain and protein X, while the other encodes a fusion protein of the GAL4 activating domain and protein Y. If proteins X and Y interact, the GAL4 activation and DNA-binding regions are brought together, activating expression from the GAL1-lacZ fusion gene. Functional fusion proteins can be produced regardless of whether candidate domains are fused to the N or C terminus of the GAL4 fragment (Chien, C. T., et al. (1991), “The two-hybrid system: a method to identify and clone genes for proteins that interact with a protein of interest,” Proc Natl Acad Sci USA 88, pp. 9578–82).
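The logic of the two-hybrid readout described above can be summarized as a truth table. The sketch below is a deliberately simplified Boolean model; the function and parameter names are hypothetical, and real assays give graded rather than all-or-nothing reporter activity.

```python
from itertools import product


def reporter_active(has_bd_x, has_ad_y, x_binds_y):
    """Simplified model of GAL1-lacZ activation in the two-hybrid assay.

    has_bd_x  -- GAL4 DNA-binding domain fused to protein X is expressed
    has_ad_y  -- GAL4 activating domain fused to protein Y is expressed
    x_binds_y -- proteins X and Y physically interact
    The reporter is transcribed only when both fusions are present and
    the X-Y interaction brings the two GAL4 functions together.
    """
    return has_bd_x and has_ad_y and x_binds_y


# Enumerate every combination to show that all three conditions are required.
for has_bd_x, has_ad_y, x_binds_y in product([False, True], repeat=3):
    state = "ON" if reporter_active(has_bd_x, has_ad_y, x_binds_y) else "off"
    print(f"BD-X={has_bd_x!s:5} AD-Y={has_ad_y!s:5} "
          f"X binds Y={x_binds_y!s:5} -> lacZ {state}")
```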
The above “forward” system for the selection of interactions has now been re-engineered into a “reverse” system for selection against protein-protein interactions (White, M. A. (1996), “The yeast two-hybrid system: Forward and reverse,” Proceedings of the National Academy of Sciences USA 93, pp. 10001–3). A counter-selectable yeast strain carrying the URA3 gene behind a modified form of the SPO13 promoter containing GAL4 binding sites was constructed. Activation of URA3 expression by the interaction of proteins X and Y leads to the production of a toxic compound when this strain is grown in the presence of 5-fluoroorotic acid (FOA). Only cells expressing interaction-defective forms of X or Y would display the FOA-resistant phenotype. This system was used to examine the subunit interactions of the retinoblastoma gene product (pRB)-associated transcription factor E2F/DP. Mutagenesis of E2F and analysis with DP in this system identified a previously uncharacterized interaction domain within E2F1 (Vidal, M., et al. (1996a), “Reverse two-hybrid and one-hybrid systems to detect dissociation of protein-protein and DNA-protein interactions,” Proceedings of the National Academy of Sciences USA 93, pp. 10315–20; Vidal, M., et al. (1996b), “Genetic characterization of a mammalian protein-protein interaction domain by using a yeast reverse two-hybrid system,” Proceedings of the National Academy of Sciences USA 93, pp. 10321–26).
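The counter-selection step can be sketched in the same simplified Boolean style. The allele names below are hypothetical placeholders, not mutants described in the cited studies, and the model assumes that both fusion proteins are expressed in every cell.

```python
def grows_on_foa(x_binds_y):
    """Simplified model of survival in the reverse two-hybrid strain.

    An X-Y interaction drives URA3 expression from the GAL4-responsive
    promoter; in the presence of FOA, URA3 activity is toxic to the cell.
    Both GAL4 fusions are assumed to be present.
    """
    ura3_expressed = x_binds_y
    return not ura3_expressed


# Hypothetical mutant library: allele name -> does it still bind the partner?
mutant_library = {
    "wild_type": True,
    "allele_1": True,
    "allele_2": False,   # interaction-defective (assumed for illustration)
    "allele_3": False,   # interaction-defective (assumed for illustration)
}

# Plating on FOA keeps only the alleles that have lost the interaction.
survivors = sorted(name for name, binds in mutant_library.items()
                   if grows_on_foa(binds))
print("FOA-resistant alleles:", survivors)
```

The point of the design is that the desired event, loss of interaction, is converted from a loss of signal (hard to screen for) into a positive growth phenotype (easy to select).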
In the second library-based category, diverse peptides have been generated and attached to solid supports by synthetic chemistry. Using the “one-bead one-peptide” approach (Lam, K. S., et al. (1991), “A new type of synthetic peptide library for identifying ligand-binding activity” [published errata appear in Nature 1992 Jul. 30;358(6385): 434 and 1992 Dec. 24–31;360(6406): 768], Nature 354, pp. 82–4; Lebl, M., et al. (1995), “One-bead-one-structure combinatorial libraries,” Biopolymers 37, pp. 177–98; Salmon, S. E., et al. (1994), “One bead, one chemical compound: use of the selectide process for anticancer drug discovery,” Acta Oncol 33, pp. 27–31), a library containing one to many million individual peptides is generated on resin beads that are permeable to water-soluble target molecules. For example, all possible sequences of a pentapeptide from the twenty natural amino acids would yield 3.2 million different potential ligands. These libraries are prepared on small beads of a solid phase support with application of a split-synthesis method in such a way that each bead contains molecules of a peptide of only one sequence. A prepared library has a statistical distribution of peptide sequences such that all possible peptides are present in approximately the same quantities. A target molecule bound to a specific peptide that is attached to a single bead can be visualized by standard colorimetric methods that differentiate the bound bead from other beads in the library. These visually tagged beads can be removed with microforceps and the sequence of the attached peptides determined using Edman microsequencing. This approach has the potential to yield expanded libraries through the use of non-native amino acids, cyclic peptides, and other polymeric components (Nikolaiev, V., et al. (1993), “Peptide-encoding for structure determination of nonsequenceable polymers within libraries synthesized and tested on solid-phase supports,” Pept Res 6, pp. 161–70).
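The library size quoted above and the split-synthesis principle can both be illustrated in a few lines. This is a minimal sketch, not the published protocol: each bead is modeled as a single string (standing in for the many identical peptide copies on a real bead), the beads are divided into equal portions at every cycle, and the three-letter alphabet and bead count in the demonstration are arbitrary illustration values.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 genetically encoded residues


def library_size(alphabet_size, length):
    """Number of distinct linear sequences of the given length."""
    return alphabet_size ** length


def split_and_mix(n_beads, alphabet, length, rng):
    """Simulate split synthesis; returns one sequence per bead."""
    beads = [""] * n_beads
    for _ in range(length):
        rng.shuffle(beads)  # pool all beads and mix them
        for i in range(n_beads):
            # Split into equal portions, one reaction vessel per building
            # block; every bead in a vessel is coupled to the same residue.
            beads[i] += alphabet[i % len(alphabet)]
    return beads


# All pentapeptides from 20 amino acids: 20**5 = 3,200,000
print(library_size(len(AMINO_ACIDS), 5))

# Small demonstration with an assumed 3-letter alphabet and 2700 beads.
rng = random.Random(0)
beads = split_and_mix(2700, "AGL", 3, rng)
counts = Counter(beads)
print(len(counts), "distinct tripeptides out of", library_size(3, 3))
print("beads per sequence ranges from",
      min(counts.values()), "to", max(counts.values()))
```

Because a bead is exposed to only one building block per cycle, it can only ever carry one sequence, while the random redistribution between cycles is what gives the roughly even statistical representation of all sequences.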
The third category includes procedures in which mixtures of compounds are prepared and designated for direct testing in solution, i.e., they are not bound to any solid surface when being tested. These libraries are prepared in approximately the same way as libraries bound to solid phase supports (Houghten, R. A. (1993), “The broad utility of soluble peptide libraries for drug discovery,” Gene 137, pp. 7–11). Subsequently, libraries are split off from the solid-phase support and further used as mixtures in aqueous solutions. An advantage of this group of libraries is the possibility of testing in solution using standard pharmacological procedures, and also the possibility of using arbitrary building elements. A disadvantage is the time-consuming iterative procedure of searching for active sequences. Interestingly, a modification of the one-bead/one-peptide approach has been developed in which peptides can be released from the bead, combining the advantages of both soluble and solid phase peptide libraries. In this system, each bead within a library of beads has one peptide sequence, but peptide molecules are attached to the bead with three types of chemical linkers, including two linkers cleavable at different pH optima. An uncleavable linker keeps some peptides attached to the bead for sequencing positives from the solution assay (Salmon, S. E., et al. (1993), “Discovery of biologically active peptides in random libraries: solution-phase testing after staged orthogonal release from resin beads,” Proc Natl Acad Sci USA 90, pp. 11708–12).