The expression of foreign proteins on the surface of cells and virus particles provides a powerful tool for such diverse activities as obtaining specific antibodies, determining enzyme specificity, exploring protein-protein interactions, and introducing new functions into proteins. Surface display technology is also used for expression cloning, in which the biological function of a cloned gene product is used for selection.
A number of methods have been devised to display peptides and proteins on the surfaces of bacteria and bacteriophages. The surface display of heterologous proteins in bacteria has been implemented for various purposes, such as the production of live bacterial vaccine delivery systems (see, for example, Georgiou et al., U.S. Pat. No. 5,348,867; Huang et al., U.S. Pat. No. 5,516,637; Ståhl and Uhlén, Trends Biotechnol. 15:185 (1995)). Bacterial surface display has been achieved using chimeric genes derived from bacterial outer membrane proteins, lipoproteins, fimbria proteins, and flagellar proteins. Bacteriophage display of foreign peptides and proteins has become a powerful tool for generating antigens, identifying peptide ligands, mapping enzyme substrate sites, isolating high-affinity antibodies, and directing the evolution of proteins (see, for example, Phizicky and Fields, Microbiol. Rev. 59:94 (1995); Kay et al., Phage Display of Peptides and Proteins (Academic Press 1996); Lowman, Annu. Rev. Biophys. Biomol. Struct. 26:401 (1997)).
Either bacterial or bacteriophage surface display systems can be used for expression screening. Both approaches, however, share certain drawbacks for expressing eukaryotic proteins. Prokaryotic cells do not efficiently express functional eukaryotic proteins, and these cells lack the ability to introduce post-translational modifications, including glycosylation. Moreover, bacterial and bacteriophage display systems are limited by the small capacity of the display system, and as such, are more suited for the display of small peptides.
There are a limited number of reports on the eukaryotic cell surface display of heterologous proteins. Boder and Wittrup, Nature Biotechnol. 15:553 (1997), have described a library screening system using Saccharomyces cerevisiae as the displaying particle. This yeast surface display method uses the α-agglutinin yeast adhesion receptor, which consists of two subunits, Aga1 and Aga2. The Aga1 subunit is anchored to the cell wall via a β-glucan covalent linkage, and Aga2 is linked to Aga1 by disulfide bonds. In this approach, recombinant yeast are produced that express Aga1 and an Aga2 fusion protein comprising a foreign polypeptide at the C-terminus of Aga2. Aga1 and the fusion protein associate within the secretory pathway of the yeast cell, and are expressed on the cell surface as a display scaffold.
Various approaches in eukaryotic systems achieve surface display by producing fusion proteins that contain the polypeptide of interest and a transmembrane domain from another protein to anchor the fusion protein to the cell membrane. In eukaryotic cells, the majority of secreted proteins and membrane-bound proteins are translocated across an endoplasmic reticulum membrane concurrently with translation (Wickner and Lodish, Science 230:400 (1985); Verner and Schatz, Science 241:1307 (1988); Hartmann et al., Proc. Nat'l Acad. Sci. USA 86:5786 (1989); Matlack et al., Cell 92:381 (1998)). In the first step of this co-translational process, an N-terminal hydrophobic segment of the nascent polypeptide, called the “signal sequence,” is recognized by a signal recognition particle and targeted to the endoplasmic reticulum membrane by an interaction between the signal recognition particle and a membrane receptor. The signal sequence enters the endoplasmic reticulum membrane, and the nascent polypeptide chain that follows begins to pass through the translocation apparatus in the endoplasmic reticulum membrane. The signal sequence of a secreted protein or a type I membrane protein is cleaved by a signal peptidase on the luminal side of the endoplasmic reticulum membrane and is excised from the translocating chain. The rest of the secreted protein chain is released into the lumen of the endoplasmic reticulum. A type I membrane protein is anchored in the membrane by a second hydrophobic segment, which is usually referred to as a “transmembrane domain.” The C-terminus of a type I membrane protein is located in the cytosol of the cell, while the N-terminus is displayed on the cell surface.
In contrast, certain proteins have a signal sequence that is not cleaved, a “signal anchor sequence,” which serves as a transmembrane segment. A signal anchor type I protein has a C-terminus that is located in the cytosol, which is similar to type I membrane proteins, whereas a signal anchor type II protein has an N-terminus that is located in the cytosol.
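The topology rules described above can be summarized in a small lookup table. The following Python sketch is illustrative only: the names `Topology`, `TOPOLOGIES`, and `displayed_end` are hypothetical, and the sketch assumes single-pass membrane proteins, so that the terminus opposite the cytosolic one is taken to be the exposed (luminal or extracellular) end.

```python
from typing import NamedTuple, Optional


class Topology(NamedTuple):
    signal_cleaved: bool          # is the signal sequence removed by signal peptidase?
    anchored: bool                # does the mature protein stay in the membrane?
    cytosolic_end: Optional[str]  # "N", "C", or None for a soluble secreted protein


# Hypothetical table restating the classes described in the text.
TOPOLOGIES = {
    "secreted": Topology(signal_cleaved=True, anchored=False, cytosolic_end=None),
    "type I": Topology(signal_cleaved=True, anchored=True, cytosolic_end="C"),
    "signal anchor type I": Topology(signal_cleaved=False, anchored=True, cytosolic_end="C"),
    "signal anchor type II": Topology(signal_cleaved=False, anchored=True, cytosolic_end="N"),
}


def displayed_end(kind: str) -> Optional[str]:
    """Return the terminus on the non-cytosolic side, or None if not anchored.

    Assumes a single membrane-spanning segment (an assumption of this sketch).
    """
    topo = TOPOLOGIES[kind]
    if not topo.anchored:
        return None
    return "N" if topo.cytosolic_end == "C" else "C"
```

Under these assumptions, a polypeptide fused to the N-terminal side of a type I anchor, or to the C-terminal side of a type II signal anchor, would be the portion exposed for display.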
Several insect cell systems have been devised to express a fusion protein comprising a foreign amino acid sequence and a transmembrane domain. In one system, an expression vector was designed to allow fusion of a heterologous protein to the amino-terminus of the Autographa californica nuclear polyhedrosis virus major envelope glycoprotein, gp64 (Mottershead et al., Biochem. Biophys. Res. Commun. 238:717 (1997)). Gp64, a type I integral membrane protein, functions as an anchor for the heterologous amino acid sequence, which is displayed on the surface of baculovirus particles (Monsma and Blissard, J. Virol. 69:2583 (1995)). More recently, Ernst et al., Nucl. Acids Res. 26:1718 (1998), described a baculovirus surface display system for the production of an epitope library. In this case, a nucleotide sequence encoding a particular epitope was inserted into an influenza virus hemagglutinin gene. Influenza virus hemagglutinin, like gp64, is a type I integral membrane protein, which provides a membrane anchor for the foreign amino acid sequence (see, for example, Lamb and Krug, “Orthomyxoviridae: The Viruses and Their Replication,” in Fundamental Virology, 3rd Edition, pages 606–647 (Lippincott-Raven Publishers 1996)).
While both yeast and insect systems are useful for expressing eukaryotic polypeptides, post-translational modification of mammalian proteins in these systems does not necessarily produce proteins that are similar to those produced by mammalian cells. Accordingly, researchers are interested in developing display systems that use mammalian cells.
Cell surface display methods have been used to select molecules that encode proteins having a signal sequence or a transmembrane domain. For example, several techniques rely upon selection for nucleic acid fragments encoding a signal sequence to identify cDNA molecules that encode secreted proteins or type I membrane proteins (see, for example, Tashiro et al., Science 261:600 (1993); Yokoyama-Kobayashi et al., Gene 163:193 (1995)). According to these methods, a 5′-terminal fragment of the test cDNA is fused to a reporter gene, and the construct is introduced into cultured cells. If the fusion protein has a functional signal sequence, the product of the reporter gene will be detected in the cell membrane or in the culture medium. Similarly, Davis et al., Science 266:816 (1994), described an expression cloning method in which cDNA molecules encoding membrane-bound ligands were transfected into mammalian cells. Cells that expressed a membrane-bound ligand of interest were localized using detectably labeled soluble receptors, and cDNA encoding the ligand was rescued from the labeled cells.
In a related selection approach, Yokoyama-Kobayashi et al., Gene 228:161 (1999), described a method to test whether a hydrophobic sequence located near the N-terminus of a protein functions as a type II signal anchor. Here, a cDNA fragment containing the putative type II signal anchor of a target gene was fused to the 5′-end of a reporter gene. Transfected cells expressed the fusion protein on the cell surface.
Skarnes et al., Proc. Nat'l Acad. Sci. USA 92:6592 (1995), described a gene trap method that relies upon capturing the N-terminal signal sequence of an endogenous gene to generate a β-galactosidase fusion protein that is active in the cytosol, but not in the lumen of the endoplasmic reticulum (see also Skarnes, U.S. Pat. No. 5,767,336). Briefly, a vector was designed that expressed a fusion protein containing a transmembrane domain of a type I membrane protein and β-galactosidase. The vector was introduced into cultured mammalian cells and allowed to integrate into the genome. Insertion of the vector into a gene that contains a signal sequence produced a fusion protein that is inserted into the endoplasmic reticulum membrane in a type I configuration. The presence of the signal sequence results in an active β-galactosidase moiety that is located in the cytosol. In contrast, insertion of the vector into a gene that lacks a signal sequence results in a fusion protein that is inserted into the endoplasmic reticulum membrane in a type II orientation. Skarnes et al. suggested that, in the absence of a signal sequence, the transmembrane domain of the fusion protein acts as a signal anchor sequence. Since the β-galactosidase moiety of the fusion protein is then not located in the cytosol, β-galactosidase activity is lost. A modification of this approach requires an expression vector comprising a chimeric gene that contains a secretory lumen-sensitive indicator marker and a type II secretory protein transmembrane domain that is positioned N-terminally of the marker (Skarnes, U.S. Pat. No. 5,789,653).
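The selection logic of the gene trap described above reduces to a two-branch decision. The Python sketch below restates that logic only; the function name `gene_trap_readout` and the dictionary keys are hypothetical and are not taken from Skarnes et al.

```python
def gene_trap_readout(trapped_gene_has_signal_sequence: bool) -> dict:
    """Predict the reporter outcome for one vector insertion event.

    Restates the text: with a captured signal sequence, the fusion adopts a
    type I configuration and beta-galactosidase is cytosolic (active);
    without one, the transmembrane domain acts as a signal anchor, the
    fusion adopts a type II orientation, and the reporter is in the ER
    lumen (inactive).
    """
    if trapped_gene_has_signal_sequence:
        orientation = "type I"
        reporter_location = "cytosol"
    else:
        orientation = "type II"
        reporter_location = "ER lumen"
    return {
        "orientation": orientation,
        "reporter_location": reporter_location,
        # The reporter is described as active only in the cytosol.
        "reporter_active": reporter_location == "cytosol",
    }
```

In this simplified view, reporter activity serves as a proxy for the presence of a signal sequence in the trapped gene.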
Thus, the methods of Skarnes et al. rely upon the presence of a signal sequence in the target protein to correct a membrane orientation imposed by an exogenous transmembrane domain. A foreign transmembrane domain can also be used to force expression of proteins to the surface of mammalian cells. For example, Yang, U.S. Pat. No. 5,665,590, described a method for cloning genes or gene fragments that encode cell surface proteins or secreted proteins. In this approach, a cDNA library is cloned into expression vectors that encode an identifiable marker and a membrane anchoring segment. If a cloned cDNA molecule encodes a polypeptide having a signal sequence, then cells producing the encoded polypeptide should express the polypeptide and the identifiable marker as a cell surface protein attached by the membrane anchoring segment. This method requires the insertion of a cDNA molecule, which includes an intact 5′-end, upstream of nucleotide sequences encoding the identifiable marker and the membrane anchoring segment.
pDisplay™ is an example of a commercially available vector that is used to display a polypeptide on the surface of a mammalian cell (INVITROGEN Corp.; Carlsbad, Calif.). In this vector, a multiple cloning site resides between sequences that encode two identifiable peptides, hemagglutinin A and myc epitopes. The vector also includes sequences that encode an N-terminal signal peptide derived from a murine immunoglobulin κ-chain, and a type I transmembrane domain of platelet-derived growth factor receptor, located at the C-terminus. In this way, a protein of interest is expressed by a transfected cell as an extracellular fusion protein, anchored to the plasma membrane at the fusion protein C-terminus by the transmembrane domain.
Methods that rely upon the selection of certain features, such as a signal sequence or transmembrane domain, cannot be used to isolate genes encoding all types of proteins. Moreover, these methods require that the cloned gene or gene fragment include an intact 5′-end that encodes the signal sequence. While more generally useful for displaying cloned genes, the pDisplay™ vector has a number of drawbacks. For example, the cloned gene will be expressed as an internal segment of a fusion protein, which means that both ends of the cloned gene must be inserted in-frame with the expression vector. Consequently, the vector is most suited for the display of a protein encoded by a known nucleotide sequence that can be engineered to produce the displayed fusion protein. In addition, the pDisplay™ vector is not well suited for the display of representative full-length libraries. This is so because the polypeptide encoded by the cDNA must be configured as an internal fusion protein, which means that the cloned cDNA must not contain the endogenous translation termination codon, located at the 3′-end of the coding sequence. The pDisplay™ vector system, therefore, is best suited for cloning randomly primed cDNA molecules, which are shorter and are not representative of full-length cDNA libraries.
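The two constraints on an internal fusion noted above (both junctions in frame, and no termination codon within the insert) can be illustrated with a short check. This Python sketch is a simplified illustration, not a description of any vendor tool: the function name `internal_fusion_problems` is hypothetical, and the sketch assumes that the insert's first nucleotide is already in frame with the upstream vector sequence and that the standard stop codons apply.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def internal_fusion_problems(insert: str) -> List[str]:
    """List reasons an insert cannot be read through as an internal fusion.

    Assumes the first base of `insert` is in frame with the upstream
    sequence (an assumption of this sketch).
    """
    seq = insert.upper()
    problems = []

    # The downstream tag and anchor stay in frame only if the insert
    # contributes a whole number of codons.
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3; downstream frame is shifted")

    # Any in-frame stop codon, including the endogenous one of a
    # full-length cDNA, terminates translation before the anchor.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    stop_positions = [i for i, codon in enumerate(codons) if codon in STOP_CODONS]
    if stop_positions:
        problems.append(f"in-frame stop codon at codon index {stop_positions[0]}")

    return problems
```

For example, a full-length open reading frame that retains its own termination codon would be flagged by this check, which is the limitation described for full-length cDNA libraries.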
Accordingly, a need still exists for a simple method for expressing any polypeptide, and especially a full-length protein, in a cell surface display system.