Proteins destined for transport into or across cell membranes are usually translated with a signal sequence that directs the newly synthesized protein to the appropriate membrane translocation system. The primary structure of signal sequences is highly variable among different proteins. Signal sequences that target proteins for export from the cytosol generally contain a short stretch (7–20 residues) of hydrophobic amino acids. In most cases, the signal sequence is located at the amino terminus of a nascent protein and is proteolytically removed on the trans side of the membrane (e.g., lumen of endoplasmic reticulum, bacterial periplasm, intercisternal space of mitochondria and chloroplasts), although examples of mature proteins containing uncleaved or internal signal sequences have been described. Export signal sequences may be interchanged among different proteins, even among proteins from different species.
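The hydrophobic-stretch property described above can be illustrated with a minimal sketch. This is not a method taught by any reference discussed here, and it is far cruder than a real signal peptide predictor. The residue set, the 30-residue amino-terminal window, the function names, and the two toy sequences are all assumptions chosen for illustration; only the 7–20 residue range comes from the text.

```python
# Illustrative heuristic only: flags a protein whose amino-terminal region
# contains an uninterrupted hydrophobic run of 7-20 residues.
# ASSUMPTION: this residue set is one common simplification, not a standard.
HYDROPHOBIC = frozenset("AVLIMFWC")


def longest_hydrophobic_run(protein, n_term_window=30):
    """Return the longest uninterrupted hydrophobic run in the first
    n_term_window residues (the window size is an assumed default)."""
    best = current = 0
    for residue in protein[:n_term_window].upper():
        if residue in HYDROPHOBIC:
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best


def looks_like_export_signal(protein, min_len=7, max_len=20, n_term_window=30):
    """True if the longest amino-terminal hydrophobic run is 7-20 residues."""
    run = longest_hydrophobic_run(protein, n_term_window)
    return min_len <= run <= max_len


# Hypothetical toy sequences, not taken from any real protein.
with_core = "MKRLLLALLVLAGSAQAEDK"
without_core = "MSEEKDRGSTPEQKDNSTRE"

print(looks_like_export_signal(with_core))
print(looks_like_export_signal(without_core))
```

A screen of this kind can only suggest candidates. It says nothing about cleavage or about uncleaved or internal signal sequences, which is one reason a functional assay is still needed.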
Many secreted proteins interact with target cells to bring about physiological responses such as growth, differentiation and/or activation. These activities make secreted proteins biologically interesting molecules that are potentially valuable as therapeutics or as targets for ligands. Of the estimated 60,000 to 100,000 human genes, about 25% carry a signal peptide and about 4% are secreted extracellularly. Clearly, approaches to rapidly and accurately identifying secreted proteins are important components of gene-based drug discovery programs.
With advances in techniques for sequencing cDNAs, many expressed sequence tags (ESTs) have been generated, which has accelerated the identification of novel secreted proteins as compared to conventional reverse genetics approaches. However, ESTs are short, random cDNA sequences, and it is therefore difficult to identify the secretion signal sequence that is normally present at the 5′ end of a cDNA encoding a secreted protein. Moreover, after an EST carrying a potential secretion signal sequence is identified by a homology search, it has to be authenticated in a functional assay. Thus, a means of selecting for the biochemical function of the proteins encoded by inserted cDNA would greatly simplify the process of obtaining genes that encode novel secreted proteins.
The secretion signal trap is one such method for cloning the 5′ ends of cDNAs encoding secreted proteins from a random cDNA library. Generally, signal trapping relies on secretion of a reporter polypeptide directed by signal sequences present in a cDNA library. The secreted reporter polypeptide may then be detected by a variety of assays based upon, for example, growth selection, enzymatic activity, or immune reactivity. Examples of signal trap cloning procedures include those in U.S. Pat. No. 5,536,637 and Klein et al. Proc. Natl. Acad. Sci. USA 93, 7108–13 (1996), which describe signal trap cloning in yeast using the yeast invertase polypeptide as a reporter. Furthermore, Imai et al. J. Biol. Chem. 271, 21514–21 (1996) describes signal trap cloning in mammalian cells using CD4 as a reporter and identifying signal sequences by screening for surface expression of CD4 antigen. In addition, U.S. Pat. No. 5,525,486, Shirozu et al. Genomics 37, 273–80 (1996) and Tashiro et al. Science 261, 600–03 (1993) describe signal trap cloning in mammalian cells and identify signal sequences by screening for surface expression of IL-2 receptor fusion proteins. None of these references teaches cloning in prokaryotic cells.
Signal sequence trapping using mammalian cells has disadvantages, including low transfection efficiency, relatively expensive culture medium, and difficult recovery of vector-borne cDNA sequences from transfected cells. Signal sequence trapping using yeast cells has the disadvantage of slower growth as compared to bacterial cells. Further, methods for molecular cloning in yeast cells are generally more complicated than bacterial methods. By contrast, bacterial cells offer fast doubling times, high transformation efficiencies, and ease of use as compared to both mammalian and yeast cells, and thus accommodate a wider range of experience levels in the laboratory.
U.S. Pat. No. 5,037,760 describes signal trap cloning in Bacillus using α-amylase and β-lactamase as reporter genes. This patent teaches vectors for identifying secretory signal sequences from DNA fragments of unicellular microorganisms. It does not teach identifying signal sequences in complex eukaryotic organisms.
Sibakov et al. (1991) Appl. Environ. Microbiol. 57: 341–48 and Chubb et al. (1998) Microbiology 144: 1619–29 describe cloning of prokaryotic signal sequences using β-lactamase fusions. Sibakov et al. and Chubb et al. do not describe a screening strategy for detection of eukaryotic signal sequences using selection in a prokaryotic system.
Kolmar et al. (1992) J. Mol. Biol. 228: 359–365, Seehaus et al. Gene 114: 235–37, Sutter et al. Mol. Microbiol. 6: 2201–2208, and Palzkill et al. (1994) J. Bacteriol. 176: 563–68 utilize β-lactamase fusions to study specific biological processes rather than as a means of cloning novel cDNAs on a large scale.
Chen and Leder (1999) Nucleic Acids Res. 27: 1219–22 and Lee et al. (1999) J. Bacteriol. 181: 5790–99 utilize a color change resulting from alkaline phosphatase activity during colony formation as a screening mechanism. Thus, selection using these systems requires a subjective determination of color changes.
Although many of the above references describe the utility of fusions of various cDNA sequences to a β-lactamase sequence, none presents a library-screening strategy for detection of eukaryotic signal sequences using selection in a prokaryotic system. Further, none of the aforementioned systems incorporates a single, degenerate primer-based polymerase chain reaction (PCR) strategy designed to clone novel gene family members.
Thus, there is a need to develop alternative approaches for rapid and accurate identification of novel secreted eukaryotic proteins using bacterial host cells.